Difference between genome assembly and genome sequence alignment to a reference to find structural variants

Bioinformatics Asked by M4r1n4 on September 4, 2021

I’m trying to determine what the difference and benefits of genome assembly and genome sequence alignments are when trying to identify structural variants or transposons in populations.
I’ve been scouring the internet but have only really come across the difference between short vs long reads and de novo assembly vs reference-based.

My understanding is that there are 2 main comparative genomic methods for identifying structural variation within a population. The first is what the 1KGP and SGDP did: sequence the whole genome, align the reads to the reference genome, and end up with a BAM file.

The second is to assemble personal genomes and then compare or align the assemblies to each other and to the reference genome, for example using Lastz/LiftOver/ChainNets. Example: 10.1016/j.gene.2005.09.031

Thanks in advance.

One Answer

the first being what the 1KGP and SGDP did and sequence the whole genome, align the reads to the reference genome and end up with a BAM file.

If you have a well-defined reference genome (e.g. human, mouse etc.) and you are interested in population-level genetic variation, then this is the main approach. If you sequence a new human genome in the classical way (i.e. short reads, ~30X coverage etc.), de novo assembly is generally pointless and you can map reads to the reference more rapidly. The read mapping approach has the large advantage that when you get to the variant calling stage, you can use information about both the depth of sequencing and the base quality scores.
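To make the point about depth and base quality concrete, here is a minimal Python sketch of how the two can be combined at a single reference position. It is a toy illustration, not how any particular variant caller works: the function names, the `min_qual=20` cutoff and the example bases and qualities are all made up for this example. The only fixed fact it relies on is the Phred definition, where a quality Q corresponds to an error probability of 10^(-Q/10).

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to the probability that the base call is wrong."""
    return 10 ** (-q / 10)


def summarize_site(ref_base, bases, quals, min_qual=20):
    """Summarize one reference position from the read bases overlapping it.

    bases and quals are parallel sequences: one base call and one Phred
    score per read. min_qual is an assumed cutoff, not a standard value.
    """
    # Drop base calls that are too unreliable to count as evidence.
    kept = [b for b, q in zip(bases, quals) if q >= min_qual]
    depth = len(kept)
    alt_count = sum(1 for b in kept if b != ref_base)
    alt_fraction = alt_count / depth if depth else 0.0
    return {"depth": depth, "alt_count": alt_count, "alt_fraction": alt_fraction}


# Hypothetical pileup: eight reads cover a site whose reference base is A.
site = summarize_site("A", "AAGGAGAG", [38, 40, 35, 37, 12, 36, 39, 8])
print(site)
```

In this made-up pileup, two of the eight base calls fall below the quality cutoff and are ignored, leaving a depth of 6 with half of the reads supporting the non-reference base, which is what you would roughly expect at a heterozygous site. Once you only have an assembled consensus sequence, this per-read evidence is no longer available.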

The second is to assemble personal genomes and then compare or align the assemblies to each other and the reference genome or using the Lastz/LiftOver/ChainNets

This is the traditional comparative genomics approach used for non-model organisms and for conducting evolutionary rate comparisons at the cross-species level. You discard information on sequencing depth and quality scores, but you can compare many species' genomes at once genome-wide. You also worry less about sequencing errors etc. because you are not looking for relatively rare population level variation, but instead more common (depending on your species) variation between species.
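As a rough sketch of what comparing an assembly to a reference can look like for structural variants, the Python below scans consecutive aligned blocks and reports places where the unaligned gap differs in length between the two genomes. The block tuples are a simplified, hypothetical representation, not the actual Lastz or chain file format, and the function name and the 50 bp `min_size` threshold (a commonly used size cutoff for calling something a structural variant) are assumptions for illustration.

```python
def indels_from_blocks(blocks, min_size=50):
    """Report candidate insertions/deletions between consecutive aligned blocks.

    blocks: list of (ref_start, ref_end, qry_start, qry_end) tuples with
    half-open coordinates, sorted and collinear on both genomes.
    Returns (kind, ref_position, size) tuples relative to the reference.
    """
    calls = []
    for (_, r_end, _, q_end), (r_start, _, q_start, _) in zip(blocks, blocks[1:]):
        ref_gap = r_start - r_end   # unaligned reference bases between the blocks
        qry_gap = q_start - q_end   # unaligned assembly bases between the blocks
        diff = qry_gap - ref_gap
        if diff >= min_size:
            calls.append(("insertion", r_end, diff))   # extra sequence in the assembly
        elif -diff >= min_size:
            calls.append(("deletion", r_end, -diff))   # sequence missing from the assembly
    return calls


# Hypothetical alignment of one assembled contig against one reference chromosome.
blocks = [
    (0, 1000, 0, 1000),
    (1000, 2000, 1300, 2300),   # 300 extra bp in the assembly before this block
    (2500, 3000, 2300, 2800),   # 500 reference bp skipped before this block
    (3010, 4000, 2812, 3802),   # small gap difference, below the threshold
]
calls = indels_from_blocks(blocks)
print(calls)
```

Note that nothing here uses read depth or base qualities: each call rests entirely on the assembled sequence being correct at that location.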

Answered by Chris_Rands on September 4, 2021
