Interpretation of my coronavirus 2019-nCoV, Wuhan, China BLAST tree?

Bioinformatics Asked on December 27, 2021

This is the BLAST tree of the latest coronavirus out of China (from Wuhan Institute of Virology, China). It seems strange that there is so much divergence from all the other coronaviruses. Is this expected of new diseases?

Wuhan BLAST coronavirus tree

One Answer

The evolutionarily related group (clade) of betacoronaviruses you have identified shares 85% amino acid identity and includes SARS. I know this from the underlying tree, published on BioRxiv, of a broader group of betacoronaviruses, i.e. your data is a defined subset of the betacoronaviruses which all share a unique, single common ancestor.

Let's call this group the SARS clade.

You have performed a nucleotide BLAST and asked NCBI to produce a neighbour-joining (NJ) tree using 2019-nCoV as the reference. I can tell this from the colour coding. The scale bar at the bottom right-hand corner shows that the genetic distance involved is somewhat larger than the 15% divergence seen in the amino acid data. The scale bar represents the number of substitutions (mutations) per nucleotide site.
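To give a rough feel for what a distance in substitutions per site means, here is a minimal sketch that computes an uncorrected pairwise distance (p-distance) between two aligned sequences. The function name `p_distance` and the toy sequences are made up for illustration, and the distance model NCBI actually applied to the tree is not stated here, so treat this as a simplification rather than a reproduction of the scale bar.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    This is the uncorrected p-distance; it does not account for
    multiple substitutions at the same site.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # ignore gapped columns
        compared += 1
        if base_a != base_b:
            differences += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


# Hypothetical toy alignment (not real viral sequence data).
toy_a = "ATGGCTACGT"
toy_b = "ATGACTACGA"
print(p_distance(toy_a, toy_b))
```

The two toy sequences differ at 2 of 10 sites, so a branch of that length would span 0.2 on a substitutions-per-site scale bar.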

In summary, your tree is essentially a subset of the broader genetic diversity of the one published, but there is a rooting issue.

In your tree, the majority of sequences are from the 2002 SARS epidemic, and the virtually zero genetic distance between them is simply because it was a rapidly transmitted outbreak, leaving little time for mutations to accumulate. I didn't realize that SARS has two independent origins, both initially from bats. This is quite scary.

  • 2019-nCoV is an outgroup within the 'SARS clade', hence it appears on the other side of the tree, i.e. it shares a more distant common ancestor with the SARS sequences
  • However, 2019-nCoV is not the most divergent lineage within this subset of the betacoronaviruses, i.e. the SARS clade; that position belongs to the two bat viruses ZC45 and ZXC21. The program has likely made a rooting error (below).
  • Again, the BLAST search omitted the majority of the betacoronaviruses, for example MERS

Rooting issue: the reason I suggest there is a rooting error is that the BioRxiv tree, which used a broad sample of the betacoronaviruses, placed the bat strains ZC45 and ZXC21 as outgroups to the SARS clade, with 2019-nCoV immediately within it. By that definition 2019-nCoV is an 'ingroup' within the SARS clade, whereas in your tree it is an 'outgroup'. It is not a huge issue, but the location of the "root" is determined by the common ancestor above it, i.e. by more distant sequences (those with <85% identity), and in your tree those have been omitted.
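The effect of root placement can be sketched with a toy example: the same unrooted topology gives a different ingroup/outgroup reading depending on which branch carries the root. The topology, the leaf labels and the helper names (`TOY_TREE`, `subtree_newick`, `root_on_outgroup`) are assumptions made for illustration, loosely following the relationships described above; this is not the published tree and carries no branch lengths.

```python
# Unrooted toy tree as an adjacency list. Internal nodes are A, B, C, R.
# Hypothetical topology: MERS stands in for the distant (<85%) sequences.
TOY_TREE = {
    "ZC45": ["A"],
    "ZXC21": ["A"],
    "2019-nCoV": ["B"],
    "SARS_1": ["C"],
    "SARS_2": ["C"],
    "MERS": ["R"],
    "A": ["ZC45", "ZXC21", "R"],
    "B": ["2019-nCoV", "C", "R"],
    "C": ["SARS_1", "SARS_2", "B"],
    "R": ["MERS", "A", "B"],
}


def subtree_newick(tree, node, parent):
    """Newick string for the part of the tree hanging below `node`."""
    children = [n for n in tree[node] if n != parent]
    if not children:
        return node  # a leaf is written as its own name
    # Sort child strings so the output is deterministic.
    parts = sorted(subtree_newick(tree, child, node) for child in children)
    return "(" + ",".join(parts) + ")"


def root_on_outgroup(tree, outgroup):
    """Place the root on the branch leading to the leaf `outgroup`."""
    neighbours = tree[outgroup]
    if len(neighbours) != 1:
        raise ValueError("outgroup must be a leaf")
    rest = subtree_newick(tree, neighbours[0], outgroup)
    return "(" + outgroup + "," + rest + ");"


# Rooted on the distant outgroup: 2019-nCoV groups with the SARS sequences,
# and the bat viruses ZC45/ZXC21 branch off first.
print(root_on_outgroup(TOY_TREE, "MERS"))

# Rooted on 2019-nCoV instead: the same topology now shows
# 2019-nCoV as the outgroup to everything else.
print(root_on_outgroup(TOY_TREE, "2019-nCoV"))
```

Both printed trees describe the same set of relationships; only the root differs. Without a more distant sequence in the alignment, a program has no information to choose between such placements.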

Generally, I like the analysis otherwise, and it gives an insight into SARS that I hadn't previously been aware of.

Answered by M__ on December 27, 2021
