AnswerBun.com

Will using smaller kmers help get larger contigs? If not, then what?

Bioinformatics Asked on April 25, 2021

I’ve been using MEGAHIT to assemble metagenomes, with particular focus on specific genomic areas.

Sometimes all I get is gene fragments or pathway fragments (e.g. if I know that genes A, B, C, D and E should be together, I only get A, B and C in one contig, and maybe D and E in another). That is understandable when the sequencing depth is low, and I assume that it's the best I can get.

However, in two of my metagenomes, the genes of interest seem to be abundant (depth of 100-300+ listed for the MEGAHIT contigs), and I still have the same problem. I assume that these 'breaks' between contigs are due to natural sequence variation, which breaks the assembly process. (Sequencing errors are also possible, but I used bbduk to clean my data prior to assembly.)

I'd like to have larger contigs because I'm curious about the gene order. Things I've tried that did not work: visualizing the assembly graphs with Bandage around BLAST hits, and starting with a smaller k-min (21 as opposed to 27).

So I guess I’m wondering if using even smaller k-mers might make a difference. Or if someone has another suggestion, I’m happy to listen! Thank you.

MEGAHIT settings used:

megahit -r input_files.fastq --num-cpu-threads 32 --min-contig-len 300 --presets meta-large -o output
megahit -r input_files.fastq --num-cpu-threads 32 --min-contig-len 1500 --presets meta-sensitive -o output

(meta-large starts from a minimum k-mer size of 27; meta-sensitive starts from 21)
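The hypothesis above, that strain-level variation rather than depth causes the breaks, can be illustrated with a toy de Bruijn graph. The following is a minimal sketch, not MEGAHIT's implementation: the function names (`successors`, `count_forks`, `with_snp`) and the random 300 bp sequences are made up for illustration, only the forward strand is considered, and real assemblers also merge simple bubbles and prune low-coverage paths, which this sketch ignores.

```python
import random
from collections import defaultdict


def successors(seqs, k):
    """Map each (k-1)-mer node to the set of (k-1)-mers that follow it."""
    succ = defaultdict(set)
    for seq in seqs:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            succ[kmer[:-1]].add(kmer[1:])
    return succ


def count_forks(seqs, k):
    """Count nodes with more than one successor (points where a path splits)."""
    return sum(1 for nxt in successors(seqs, k).values() if len(nxt) > 1)


def with_snp(seq, pos):
    """Return a copy of seq with the base at pos swapped for a different one."""
    alt = "A" if seq[pos] != "A" else "C"
    return seq[:pos] + alt + seq[pos + 1:]


# Toy data (hypothetical): one random 300 bp "strain" and a variant
# that differs from it by a single base at position 150.
rng = random.Random(0)
strain_a = "".join(rng.choice("ACGT") for _ in range(300))
strain_b = with_snp(strain_a, 150)

# Columns: k, forks with one strain only, forks with both strains mixed.
for k in (11, 21, 27, 31):
    print(k, count_forks([strain_a], k), count_forks([strain_a, strain_b], k))
```

In this toy setup the mixed-strain column stays at 1 or more for every k, because the single-base difference itself creates a fork no matter how small k is. Lowering k cannot remove that fork; it can only make unrelated regions more likely to share a (k-1)-mer, which adds forks of its own.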

One Answer

You can try s-aligner (free for 15 days), which usually produces considerably larger contigs and a considerably larger NG50 for metagenomic data containing viruses (including phages). metaSPAdes also produces larger contigs, although shorter than those from s-aligner.

Ref. s-aligner: a greedy algorithm for non-greedy de novo genome assembly

Answered by juanjo75es on April 25, 2021
