TransWikia.com

Does rRNA depletion protocol give higher number of mapped reads in Intronic regions?

Bioinformatics Asked on May 28, 2021

I recently downloaded a publicly available dataset of 350 tumor samples. The published paper gives the following information.

[Image: excerpt from the published paper, not reproduced here]

They used Ribo-Zero Gold to deplete rRNA, and the data are strand-specific. After aligning the data, I ran alignment quality checks with the Qualimap RNA-Seq QC tool and visualised the BAM files in IGV. The alignment looks good: a 90% alignment rate was seen for all samples. In all samples, the highest percentage of mapped reads originated in intronic regions, followed by exonic and then intergenic regions.

I have seen a post here, "Reads mapped to exonic, intronic and intergenic regions", which says that high intronic reads could be due to contamination. I also searched for higher read counts in intronic regions and found a paper, "Evaluation of two main RNA-seq approaches for gene quantification in clinical RNA sequencing: polyA+ selection versus rRNA depletion", and some other links, such as "RIBO-DEPLETION IN RNA-SEQ – WHICH RIBOSOMAL RNA DEPLETION METHOD WORKS BEST?", which report more intronic reads with the rRNA depletion protocol.

Even this RNA-seq tutorial mentions that "a higher intronic mapping rate is expected for rRNA removal compared to polyA selection."

So, my question:

I am working with lncRNAs, so I'm using samples prepared with the rRNA depletion protocol. Is this higher intronic rate common in rRNA-depleted datasets, or do I need to check anything else before proceeding with these samples?

2 Answers

I can't see how intron contamination could be linked to rRNA depletion.

The only reason depletion would appear to increase the number of contaminants is that, after rRNA removal, intronic contaminants make up a larger proportion of the total remaining RNA. The absolute number of contaminants, however, is exactly the same before and after rRNA removal. By the same token, the RNA of interest (lncRNAs) will also have increased proportionally, so you get better depth on its abundance and diversity.

That's just life; perhaps just filter these reads out bioinformatically.
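The proportional argument above can be illustrated with a little arithmetic. This is only a sketch: the molecule counts, the class names, and the 99% depletion efficiency are made-up numbers chosen for illustration, not values from the dataset in the question, and `composition` and `deplete_rrna` are hypothetical helper names.

```python
def composition(counts):
    """Return each RNA class as a fraction of the total pool."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def deplete_rrna(counts, efficiency):
    """Remove a fraction `efficiency` of rRNA; leave every other class untouched."""
    depleted = dict(counts)
    depleted["rRNA"] = counts["rRNA"] * (1 - efficiency)
    return depleted


# Hypothetical molecule counts (assumption, for illustration only).
before = {"rRNA": 900_000, "exonic": 60_000, "intronic": 30_000, "lncRNA": 10_000}
after = deplete_rrna(before, efficiency=0.99)

frac_before = composition(before)
frac_after = composition(after)
for name in before:
    print(f"{name:9s} count={after[name]:>10.0f}  "
          f"before={frac_before[name]:.3f}  after={frac_after[name]:.3f}")
```

In this toy model the absolute intronic and lncRNA counts do not change, but both make up a much larger share of the pool once most of the rRNA is gone. The intronic-to-exonic ratio also stays the same, since depletion only shrinks the rRNA term in the denominator.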

Answered by M__ on May 28, 2021

It's not so much that you have "intronic contamination" or "genomic contamination"; rather, with rRNA depletion you're not explicitly selecting for full-length mature transcripts. That is the most common cause of higher intronic read rates. There's nothing you can do about this post hoc, so just carry on.

BTW, many lncRNAs are polyadenylated, so you would still keep them with poly-A selection.

Answered by Devon Ryan on May 28, 2021
