TransWikia.com

Merge / reconcile several de novo transcriptome assemblies built with different k-mer sizes

Bioinformatics Asked by Pitagoras Alves on August 22, 2021

I am building a de novo transcriptome reference assembly for a eukaryotic organism for which I have a genome.

I’ve created several assemblies with rnaSPAdes using different k-mer sizes (19 to 69 in steps of 10). Now I would like to merge them into one final transcriptome.

How could I do that?

Is using a genome assembly reconciliation tool such as Metassembler a good idea?

2 Answers

A simple option would be to supply SPAdes with the contigs you trust via the --trusted-contigs option.
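As a rough sketch of what that could look like, the snippet below only builds the command line and does not run it. The read and contig file names are hypothetical placeholders, and whether --trusted-contigs has any effect in RNA mode is an assumption you should check against the SPAdes manual for your version.

```python
import shlex


def spades_trusted_contigs_cmd(reads_1, reads_2, trusted_fasta, out_dir):
    """Build (but do not run) a paired-end SPAdes command with trusted contigs."""
    return [
        "spades.py",
        "-1", reads_1,                       # forward reads
        "-2", reads_2,                       # reverse reads
        "--trusted-contigs", trusted_fasta,  # contigs you consider reliable
        "-o", out_dir,                       # output directory
    ]


# Hypothetical file names, for illustration only.
cmd = spades_trusted_contigs_cmd(
    "reads_1.fastq.gz",
    "reads_2.fastq.gz",
    "trusted_transcripts.fasta",
    "spades_trusted_out",
)
print(shlex.join(cmd))
```

Printing the command first makes it easy to inspect before handing it to subprocess.run or pasting it into a shell.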

There are tools that specifically address the problem of merging transcriptome assemblies. I am unsure whether transcriptome assembly differs enough from genome assembly for these tools to do better than the genome assembly mergers. Here is a partial list:

  1. transfuse
  2. this paper
  3. this paper
  4. this paper
  5. DRAP

This paper seems to have some more information on comparing tools, though it is not focused specifically on merging.

See also the SEQanswers thread on this topic, and this discussion too.

Answered by Maximilian Press on August 22, 2021

You can concatenate all your transcriptomes into a single file and then apply a clustering method that collapses very similar transcripts into a single representative. For that purpose, you may try CD-HIT-EST or MMseqs2. For each identity threshold you test, you can assess the resulting quality with BUSCO or by BLASTing against reference sequences.
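A minimal sketch of the concatenate-then-cluster idea, under some assumptions: the per-k file names, the k19/k29 labels, the three thresholds, and the helper names merge_assemblies, cdhit_word_size and cdhit_est_cmd are all made up for illustration. Headers are prefixed with the assembly label because different runs may reuse the same transcript IDs. The -n word sizes follow the ranges recommended in the CD-HIT user guide for cd-hit-est; double-check them against your installed version. The cd-hit-est commands are only printed, not executed.

```python
import shlex
import tempfile
from pathlib import Path


def merge_assemblies(fasta_by_label, merged_path):
    """Concatenate FASTA files, prefixing every header with its assembly label.

    Returns the number of sequences written.
    """
    n_records = 0
    with open(merged_path, "w") as out:
        for label, fasta_path in fasta_by_label.items():
            with open(fasta_path) as handle:
                for line in handle:
                    line = line.rstrip()
                    if not line:
                        continue  # skip blank lines
                    if line.startswith(">"):
                        # Keep IDs unique across assemblies.
                        out.write(f">{label}_{line[1:]}\n")
                        n_records += 1
                    else:
                        out.write(line + "\n")
    return n_records


def cdhit_word_size(identity):
    """Pick a cd-hit-est word size (-n) for a given identity threshold (-c)."""
    if identity >= 0.95:
        return 10
    if identity >= 0.90:
        return 8
    if identity >= 0.88:
        return 7
    if identity >= 0.85:
        return 6
    if identity >= 0.80:
        return 5
    return 4


def cdhit_est_cmd(merged_path, identity):
    """Build (but do not run) a cd-hit-est command for one identity threshold."""
    stem = Path(merged_path).with_suffix("")
    out_path = f"{stem}_c{int(round(identity * 100))}.fasta"
    return [
        "cd-hit-est",
        "-i", str(merged_path),
        "-o", out_path,
        "-c", f"{identity:.2f}",
        "-n", str(cdhit_word_size(identity)),
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        # Two toy "assemblies" standing in for the per-k rnaSPAdes outputs.
        (tmp / "k19.fasta").write_text(">NODE_1\nACGTACGT\n")
        (tmp / "k29.fasta").write_text(">NODE_1\nACGTACGTAA\n>NODE_2\nTTTTCCCC\n")
        merged = tmp / "merged.fasta"
        n = merge_assemblies(
            {"k19": tmp / "k19.fasta", "k29": tmp / "k29.fasta"}, merged
        )
        print(f"{n} transcripts written to {merged.name}")
        # One clustering command per identity threshold to compare afterwards.
        for identity in (0.90, 0.95, 0.98):
            print(shlex.join(cdhit_est_cmd(merged, identity)))
```

Each clustered output can then be scored separately (for example with BUSCO), so the thresholds can be compared on equal footing.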

Answered by thomas duge de bernonville on August 22, 2021
