AnswerBun.com

lower mapping rates in salmon v0.13 compared to previous versions

Bioinformatics Asked by Courtney Stairs on April 29, 2021

Hi there 🙂 Thanks for the tool!

I recently updated to the new salmon (from 0.8… it's been a couple of years) and I noticed that my mapping percentages change dramatically between the two versions.

For example, using the default settings in v0.8, I see a mapping rate of 96.25%. After upgrading to v0.13, the mapping rate drops to 82-84%, depending on which parameters I change (I've permuted validateMappings, incompatPrior, consensusSlack, minScoreFraction, ma, etc.). Below is a snippet from one of these tests. I'm currently assembling the unmapped reads to see if I can determine why they were not mapped.

What do you think is the most likely cause for this dramatic change in performance?

Perhaps it could be the transcript index? (I used a kmer of 31 and my reads are 100 bp).

UPDATE:

I have run the indexing and quasi-mapping using salmon v0.8, v0.10 (when validateMappings was released), v0.12, and v0.13, and got mapping rates of 96%, 90%, 90%, and 83%, respectively. So it looks like a gradual decay of mapping rates. Or, alternatively, a gradual reduction in spurious mappings?

It seems that whatever changed between 0.12 and 0.13 affected the mapping rates significantly. I will go ahead with both mappings (v0.12 and v0.13) to see if there are differences in the DESeq2 outputs.
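
For anyone repeating this comparison, here is a minimal sketch for collecting the mapping rate from several salmon output directories at once. It assumes each directory contains an aux_info/meta_info.json file with a "percent_mapped" field; that is what the salmon versions I have seen write, but check your own output. The function name mapping_rates is just an illustrative choice.

```shell
# Print "<directory><TAB><percent_mapped>" for each salmon output directory.
# Assumption: <dir>/aux_info/meta_info.json holds a line like
#   "percent_mapped": 96.25,
mapping_rates() {
  for dir in "$@"; do
    meta="$dir/aux_info/meta_info.json"
    if [ ! -f "$meta" ]; then
      echo "missing: $meta" >&2
      continue
    fi
    # Extract only the numeric value that follows the "percent_mapped" key.
    rate=$(sed -n 's/.*"percent_mapped"[[:space:]]*:[[:space:]]*\([0-9.eE+-]*\).*/\1/p' "$meta")
    printf '%s\t%s\n' "$dir" "$rate"
  done
}
```

Calling something like `mapping_rates *.quant` in the directory holding the results of each salmon version should give one line per sample, which makes the version-to-version trend easy to tabulate.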

Thanks a lot!
(While I have presented values here for one set of fastq files, the same pattern is seen throughout all 72 of my samples from different experiments.)

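# $FILES holds the sample prefixes (one per paired-end sample)
for F in $FILES ; do
  R1=${F}_L001_R1_001.fastq.gz
  R2=${F}_L001_R2_001.fastq.gz
  # The --incompatPrior value below is the v0.8 default.
  # Each continued line must end in a backslash, with no trailing comment.
  salmon quant \
      -p 35 \
      -i "$ref_dir/$transcript_index" \
      -l A \
      -1 "$R1" \
      -2 "$R2" \
      -o "$(basename "$F").VM.rFB4.IP9.quant" \
      --incompatPrior 9.9999999999999995e-21 \
      --writeUnmappedNames \
      --validateMappings \
      --rangeFactorizationBins 4
done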
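
Because the loop above passes --writeUnmappedNames, each output directory should also contain a list of the reads that failed to map. The sketch below tallies those reads by category. It assumes the list is written to aux_info/unmapped_names.txt with the read name in the first column and a short code in the second (as far as I know: u for a fully unmapped pair, m1 and m2 for orphans, m12 for mates that mapped only to different transcripts). Treat both the path and the codes as assumptions to check against your salmon version; unmapped_summary is a made-up helper name.

```shell
# Count unmapped reads per category code.
# Assumption: the input has two whitespace-separated columns,
#   <read name> <code>   e.g.  "read42 u"
unmapped_summary() {
  # Tally the second column, then sort by code for stable output.
  awk '{ n[$2]++ } END { for (c in n) print c "\t" n[c] }' "$1" | LC_ALL=C sort
}
```

For example, `unmapped_summary sample.quant/aux_info/unmapped_names.txt` prints one "code<TAB>count" line per category. If most of the lost reads fall into a single category, that narrows down where to look.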

Update: salmon 0.13.1 has now been released with an --allowDovetail option; the mappings can be inspected via a -z dump.
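
A minimal sketch of re-running one sample with the new flag, in case it helps others. Only --allowDovetail is new here; the remaining arguments are copied from the loop above. The function name, the DRY_RUN switch, and the ".dovetail.quant" output suffix are my own illustrative choices, not anything salmon defines.

```shell
# Re-quantify one sample prefix with dovetailing mappings allowed.
# Set DRY_RUN=1 to print the command instead of running it.
quant_with_dovetail() {
  prefix=$1   # sample prefix, e.g. path/to/sample1
  index=$2    # salmon index directory (must be built with the same salmon version)
  # Build the full command as the positional parameters.
  set -- salmon quant -p 35 -i "$index" -l A \
    -1 "${prefix}_L001_R1_001.fastq.gz" \
    -2 "${prefix}_L001_R2_001.fastq.gz" \
    -o "$(basename "$prefix").dovetail.quant" \
    --validateMappings --allowDovetail
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

Running it first with DRY_RUN=1 shows the exact command line, which is handy for confirming the flag is actually being passed before committing to all 72 samples.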

One Answer

[Most likely answer based on comments attached to the question]

Hi Courtney, we've looked into the data and identified the source of the different mapping rates. Specifically, the cause is discordant reads; in this case, reads that dovetail with respect to their positions on the reference. If you run Bowtie2 with "RNA-seq" flags (i.e. disallowing discordant mappings), you get essentially the same mapping rate as salmon here. Is allowing dovetailing reads important for your analysis? If so, we can consider what would be necessary to add back support, but generally they provide evidence of tenuous quality.
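
To make "dovetailing" concrete, here is a simplified sketch, not salmon's actual logic. For a forward/reverse read pair, the forward-strand mate is normally expected to start upstream of the reverse-strand mate. A pair is treated as dovetailed below when the reverse-strand mate's leftmost position falls before the forward-strand mate's leftmost position. The function name and the two-argument interface are assumptions made for illustration, and cases such as one mate fully containing the other are ignored.

```shell
# Classify a forward/reverse pair from the leftmost reference positions
# of its two mates. Simplified rule (an assumption, not salmon's code):
#   reverse mate starts before forward mate  -> "dovetail"
#   otherwise                                -> "concordant"
pair_layout() {
  fwd_start=$1   # leftmost position of the forward-strand mate
  rev_start=$2   # leftmost position of the reverse-strand mate
  if [ "$rev_start" -lt "$fwd_start" ]; then
    echo dovetail
  else
    echo concordant
  fi
}
```

For example, `pair_layout 100 80` reports a dovetail, while `pair_layout 100 150` reports a concordant pair. Pairs like the first can arise when the fragment is shorter than the read length and untrimmed adapter sequence distorts the alignment ends, which is one reason they tend to be treated as lower-quality evidence.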

Answered by gringer on April 29, 2021
