TransWikia.com

Comparing groups of 16S rRNA sequences - how to?

Bioinformatics Asked on December 5, 2020

I have several groups of 16S rRNA sequences associated with taxonomic groups (let’s call them A, B and C). The sequences within each group share a common ancestor and are, on average, more closely related to each other than to sequences in other groups. The groups are also related to one another, but less closely.

Ideally I’d like to have a table like this (where percentages represent the average similarity):

           Group A     Group B     Group C
Group A      99%                  
Group B      97%         98%              
Group C      94%         95%         99%

Is there software that can calculate these percentages for me? I’m thinking that I could set up a many-against-many BLAST search and then use pandas in Python to parse the resulting table. But if something like this already exists, I’d rather not reinvent the wheel. Thank you!

One Answer

If you want to run an all-vs-all comparison with BLAST, this can be helpful. You can create a BLAST database from your sequences and then search the same sequences against it.

makeblastdb -in 16S_sequences.fasta -dbtype nucl -out my_16S_sequences_db

blastn -db my_16S_sequences_db -query 16S_sequences.fasta -outfmt 6 \
       -out allvsall_16S.tsv -num_threads "$(nproc)"
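To get from the tabular output to a group-by-group table like the one in the question, the hits can be averaged per pair of groups. The following is only a minimal sketch using the Python standard library (the same logic could be written with pandas). It assumes the default 12-column `-outfmt 6` layout, and it assumes you can supply a dictionary mapping each sequence ID to its group; `seq_to_group`, `read_blast_tsv` and `group_identity_table` are hypothetical names introduced here. Note that the percent identity reported by BLAST refers to the local alignment of each hit, not the full-length sequences, and that pairs without any hit are simply absent from the averages.

```python
import csv
from collections import defaultdict

# Default -outfmt 6 columns:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
PIDENT_COL = 2
BITSCORE_COL = 11


def read_blast_tsv(path):
    """Read a BLAST tabular (-outfmt 6) file into a list of rows."""
    with open(path, newline="") as handle:
        return [row for row in csv.reader(handle, delimiter="\t") if row]


def group_identity_table(blast_rows, seq_to_group):
    """Return {(group_x, group_y): mean percent identity} over sequence pairs.

    seq_to_group is an assumed, user-supplied mapping: sequence ID -> group name.
    """
    # Keep one value per unordered sequence pair: the hit with the best bit score.
    best = {}
    for row in blast_rows:
        query, subject = row[0], row[1]
        if query == subject:
            continue  # skip self hits, which are always 100%
        pident = float(row[PIDENT_COL])
        bitscore = float(row[BITSCORE_COL])
        pair = tuple(sorted((query, subject)))
        if pair not in best or bitscore > best[pair][1]:
            best[pair] = (pident, bitscore)

    # Average the per-pair identities within each pair of groups.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (seq_a, seq_b), (pident, _) in best.items():
        key = tuple(sorted((seq_to_group[seq_a], seq_to_group[seq_b])))
        sums[key] += pident
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

With the file produced above, this could be called as `group_identity_table(read_blast_tsv("allvsall_16S.tsv"), seq_to_group)`. Each key of the result is an alphabetically sorted pair of group names, so `("A", "B")` holds the value for the Group A / Group B cell of the table.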

If you want a percent identity matrix based on pairwise alignments, you can try this approach: https://www.biostars.org/p/220154/
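To illustrate what a percent identity matrix is, here is a small sketch that computes one from sequences that have already been aligned to the same length (for example, rows taken from a multiple sequence alignment). It is not necessarily what the linked post does. The function names are hypothetical, and the definition of identity used here (matching positions divided by the columns where neither sequence has a gap) is one of several in common use, so the numbers may differ from those reported by alignment tools.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue  # ignore columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned):
    """Build a symmetric nested dict: matrix[name_a][name_b] -> percent identity.

    aligned is a dict mapping sequence name -> aligned sequence string.
    """
    names = list(aligned)
    matrix = {name: {name: 100.0} for name in names}
    for a, b in combinations(names, 2):
        pid = percent_identity(aligned[a], aligned[b])
        matrix[a][b] = pid
        matrix[b][a] = pid
    return matrix
```

The resulting matrix can then be averaged per group in the same way as BLAST hits would be, with the advantage that every pair of sequences has a value.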

Correct answer by zorbax on December 5, 2020
