
Selective alignment & sketch mode

salmon has two ways to map reads to the transcriptome. Both produce the equivalence classes the optimizer needs; they trade accuracy against speed.

Selective alignment is the default mode. salmon finds maximal exact matches (MEMs) between each read and the index, chains them with the chaining algorithm introduced in minimap2 [1], and computes a fast banded alignment score for each candidate placement using a native-Rust port of the ksw2 [2] extension aligner. A mapping is accepted only if its score clears a fraction of the perfect score (--minScoreFraction, default 0.65). This rejects spurious mappings that share k-mers with a transcript but do not actually align well, which is the key accuracy advantage of selective alignment [3] over plain pseudoalignment.
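The acceptance test can be illustrated with a small sketch. This is not salmon's code: the function names are hypothetical, the perfect score is assumed to be read length times the match score, and "clears" is assumed to mean greater than or equal to the threshold.

```python
def min_accepted_score(read_len: int, match_score: int,
                       min_score_fraction: float = 0.65) -> float:
    """Lowest alignment score that still passes the filter (illustrative)."""
    # Assumption: a perfect alignment matches every base of the read.
    perfect_score = read_len * match_score
    return min_score_fraction * perfect_score


def accept_mapping(score: float, read_len: int, match_score: int,
                   min_score_fraction: float = 0.65) -> bool:
    """Keep a candidate placement only if its score clears the threshold."""
    return score >= min_accepted_score(read_len, match_score, min_score_fraction)


# A 100-base read with a hypothetical match score of 2 has a perfect score of 200,
# so the default fraction puts the cut-off near 130.
print(accept_mapping(150, read_len=100, match_score=2))  # well above the cut-off
print(accept_mapping(120, read_len=100, match_score=2))  # below the cut-off
```

Raising --minScoreFraction moves the cut-off toward the perfect score, so fewer imperfect placements survive.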

Useful knobs:

  • --minScoreFraction <f> — acceptance threshold as a fraction of the perfect score.
  • --ma / --mp / --go / --ge — match score, mismatch penalty, gap-open and gap-extend penalties.
  • --allowDovetail — admit dovetailed (short-insert) fragments.
  • --orphanChainSubThresh, --postMergeChainSubThresh — chain-pruning thresholds applied before alignment (see the note below).
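To see how the scoring knobs interact, here is a rough sketch of an affine-gap score. It is an illustration, not salmon's aligner: the function is hypothetical, the penalties are passed as positive magnitudes, the parameter values are made up for the example rather than salmon's defaults, and a gap of length L is assumed to cost go + L * ge.

```python
from typing import Iterable


def alignment_score(matches: int, mismatches: int, gap_lengths: Iterable[int],
                    ma: int, mp: int, go: int, ge: int) -> int:
    """Score an alignment summary under match/mismatch and affine gap costs."""
    score = matches * ma - mismatches * mp
    for length in gap_lengths:
        # Assumed convention: opening cost plus a per-base extension cost.
        score -= go + length * ge
    return score


# Illustrative values only: 96 matches, 2 mismatches, one 2-base gap.
score = alignment_score(96, 2, [2], ma=2, mp=4, go=6, ge=2)
print(score)  # 192 - 8 - (6 + 4) = 174
```

Under these assumptions, a larger mismatch or gap penalty lowers the score of imperfect placements, so more of them fall below the --minScoreFraction threshold.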

Sketch mode (--sketch) uses alignment-free pseudoalignment [4]. salmon maps reads using k-mer/equivalence-class hits without computing per-placement alignment scores. This is faster and a good fit when throughput matters and your reference is high-quality.

salmon quant -i salmon_index -l A --sketch \
-1 r1.fq.gz -2 r2.fq.gz -p 16 -o out

When only one mate of a pair maps, salmon can still record an orphan mapping. Sketch mode defaults to a relaxed orphan rule — orphan whenever the other mate has no consistent target — which tracks selective alignment more closely and improves agreement with the gold-standard results.

  • --sketchStrictOrphans switches to the conservative rule: only orphan a pair when the other mate had no matching k-mers at all.
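The difference between the two orphan rules can be sketched as a small predicate. This is an interpretation, not salmon's implementation: the function and its arguments are hypothetical, and "no consistent target" is assumed to mean that the other mate's hits share no transcript with the mapped mate.

```python
def keep_orphan(mapped_targets: set, other_targets: set,
                other_had_kmer_hits: bool, strict: bool = False) -> bool:
    """Decide whether to record the mapped mate as an orphan (illustrative)."""
    if strict:
        # Conservative rule: the other mate must have matched no k-mers at all.
        return not other_had_kmer_hits
    # Relaxed rule (assumed reading): the other mate has no target in common
    # with the mapped mate.
    return not (mapped_targets & other_targets)


# The other mate hit k-mers, but only on a different transcript.
print(keep_orphan({"t1"}, {"t2"}, other_had_kmer_hits=True))               # relaxed rule
print(keep_orphan({"t1"}, {"t2"}, other_had_kmer_hits=True, strict=True))  # strict rule
```

In this reading, the two rules differ only when the other mate produced k-mer hits that do not agree with the mapped mate: the relaxed rule keeps the orphan, while the strict rule discards it.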

--allowDovetail is honored in sketch mode as well, admitting dovetailed short-insert fragments.

  1. Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34(18), 3094–3100. https://doi.org/10.1093/bioinformatics/bty191

  2. Suzuki, H., & Kasahara, M. (2018). Introducing difference recurrence relations for faster semi-global alignment of long sequences. BMC Bioinformatics, 19(45). https://doi.org/10.1186/s12859-018-2014-8 — salmon 2.0 uses ksw2rs, a native-Rust port of ksw2.

  3. Srivastava, A., Malik, L., Sarkar, H., Zakeri, M., Almodaresi, F., Soneson, C., Love, M. I., Kingsford, C., & Patro, R. (2020). Alignment and mapping methodology influence transcript abundance estimation. Genome Biology, 21, 239. https://doi.org/10.1186/s13059-020-02151-8

  4. Bray, N. L., Pimentel, H., Melsted, P., & Pachter, L. (2016). Near-optimal probabilistic RNA-seq quantification. Nature Biotechnology, 34(5), 525–527. https://doi.org/10.1038/nbt.3519