hisat2 --rna-strandness option and downstream htseq-count analysis

Bioinformatics Asked by guan on April 4, 2021

I have some doubts about the hisat2 --rna-strandness option and how its output affects downstream analysis. Please see below.

I understand that the --rna-strandness option produces an XS tag to indicate which strand (+ or -) a transcript comes from, for use in downstream transcriptome assembly. I have a paired-end stranded sequencing library that was aligned to the genome using hisat2 without specifying --rna-strandness (in other words, the default unstranded setting was used). The reads were then assigned to genes using htseq-count, this time with “-s reverse” specified because the library is strand-specific.

Would this combination (hisat2 with the default, unstranded --rna-strandness setting, followed by htseq-count -s reverse on a strand-specific library) affect the alignment and counting results? Since --rna-strandness generates XS tags for transcriptome assembly, and htseq-count does not use XS tags for counting, I presume there should be no practical impact. Could you shed some light on this, in case I have overlooked something about how these tools are used?

To help verify the above, I re-aligned and re-counted the reads from 2 samples with --rna-strandness RF switched on in hisat2. The alignment rates and htseq-count special counters are given below for assessment.

Overall alignment rate of Sample 1: 94.52% (--rna-strandness RF) vs. 94.12% (--rna-strandness not set, i.e. unstranded)
Overall alignment rate of Sample 2: 94.57% (--rna-strandness RF) vs. 94.15% (--rna-strandness not set, i.e. unstranded)

Feature counts of Sample 1 (--rna-strandness RF, then -s reverse):
__no_feature 6327294
__ambiguous 2954776
__too_low_aQual 3784481
__not_aligned 688856
__alignment_not_unique 4858182

Feature counts of Sample 1 (--rna-strandness not set, i.e. unstranded, then -s reverse):
__no_feature 6291151
__ambiguous 2911298
__too_low_aQual 4075017
__not_aligned 754400
__alignment_not_unique 16136045

Feature counts of Sample 2 (--rna-strandness RF, then -s reverse):
__no_feature 5417882
__ambiguous 1708510
__too_low_aQual 3532352
__not_aligned 564596
__alignment_not_unique 2859501

Feature counts of Sample 2 (--rna-strandness not set, i.e. unstranded, then -s reverse):
__no_feature 5359434
__ambiguous 1676091
__too_low_aQual 3813344
__not_aligned 623122
__alignment_not_unique 2891792

These results look comparable to me across pipelines.
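
One way to put a number on "comparable" is to compute the relative change of each htseq-count special counter between the two runs. The following is only a minimal Python sketch using the Sample 1 figures quoted above; the dictionary and function names are made up for illustration.

```python
# Special counters for Sample 1, copied from the htseq-count output above.
rf_run = {
    "__no_feature": 6327294,
    "__ambiguous": 2954776,
    "__too_low_aQual": 3784481,
    "__not_aligned": 688856,
    "__alignment_not_unique": 4858182,
}
unstranded_run = {
    "__no_feature": 6291151,
    "__ambiguous": 2911298,
    "__too_low_aQual": 4075017,
    "__not_aligned": 754400,
    "__alignment_not_unique": 16136045,
}


def relative_change(new, old):
    """Return (new - old) / old for every counter present in `old`."""
    return {key: (new[key] - old[key]) / old[key] for key in old}


# Change of the unstranded run relative to the --rna-strandness RF run.
changes = relative_change(unstranded_run, rf_run)
for counter, change in changes.items():
    print(f"{counter}\t{change:+.1%}")
```

With the numbers as posted, most counters move by a few percent, but __alignment_not_unique for Sample 1 is more than three times larger in the unstranded run, so that figure may be worth double-checking.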


One Answer

If you reran the command with the correct settings, just leave it at that. (It is not at all clear to me that --rna-strandness RF is the correct setting for your library.)

If you want people to tell you whether you ran the commands correctly, you need to post the commands you used.
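
For reference, the two pipelines described in the question would probably look something like the commands assembled below. This is only a sketch: the index name, FASTQ, SAM and GTF file names and the thread count are hypothetical placeholders, and the asker's real commands may have used other options. The flags themselves (hisat2 -x/-1/-2/-S/-p/--rna-strandness, htseq-count -f/-r/-s) are standard options of those tools. The helper functions only build and print the command lines; they do not run anything.

```python
import shlex


def hisat2_cmd(index, reads1, reads2, out_sam, strandness=None, threads=4):
    """Build a paired-end hisat2 command line.

    strandness=None leaves --rna-strandness off (hisat2's unstranded default);
    for paired-end reads the accepted values are "FR" and "RF".
    """
    cmd = ["hisat2", "-p", str(threads), "-x", index,
           "-1", reads1, "-2", reads2, "-S", out_sam]
    if strandness is not None:
        cmd += ["--rna-strandness", strandness]
    return cmd


def htseq_count_cmd(alignments, gtf, stranded="reverse"):
    """Build an htseq-count command line for name-ordered SAM input."""
    return ["htseq-count", "-f", "sam", "-r", "name",
            "-s", stranded, alignments, gtf]


# Hypothetical file names, for illustration only.
default_run = hisat2_cmd("genome_index", "s1_R1.fastq.gz", "s1_R2.fastq.gz",
                         "s1_default.sam")
rf_run_cmd = hisat2_cmd("genome_index", "s1_R1.fastq.gz", "s1_R2.fastq.gz",
                        "s1_rf.sam", strandness="RF")
count_cmd = htseq_count_cmd("s1_rf.sam", "genes.gtf")

for cmd in (default_run, rf_run_cmd, count_cmd):
    print(shlex.join(cmd))
```

The only difference between the two alignment runs is the trailing --rna-strandness RF; the htseq-count call is the same in both cases, with strand handling controlled by -s reverse.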

Answered by swbarnes2 on April 4, 2021
