TransWikia.com

How do I create a VCF file of all known pathogenic mutations in a gene of interest?

Bioinformatics Asked by Nereus on June 8, 2021

I would like to create a list (probably in .vcf format) of all known pathogenic missense mutations in a human gene of interest, and then add other variants that could lead to the same pathogenic amino acid substitutions. I intend to search targeted sequencing data for my gene of interest for variants that would cause these particular substitutions.

I’m very new to bioinformatics, so I’m not familiar with the appropriate databases and tools, but this is a rough idea of how it could be done:

  1. extract all mutations listed in Ensembl within my gene of interest that are labelled with HGMD_MUTATION (is there an option to download a VCF file with all HGMD mutations within my gene? I’ve only managed to do so for individual mutations.)
  2. annotate the list with the reference and mutant amino acids for every mutation
  3. add additional mutations (which may not be present in HGMD) that could lead to the same substitution. E.g. I’m interested in the substitution of Proline (reference codon: CCG) by Leucine (TTR or CTN) at AA position 102, so I would like to list all hypothetical mutations that would create a leucine codon at this position, not only the one given in Ensembl (which is CCG -> CTG).

Could you point me towards the best tools to use for these purposes? Unfortunately I haven’t got an HGMD subscription. Many thanks!

One Answer

I'm not sure how to generate the additional mutations, but I would say that HGMD is not the way to find all the pathogenic variants. I would probably filter this table by either Clinical significance or Evidence->Phenotype association.
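For the additional mutations in step 3 of the question, one possible approach is to enumerate every codon that encodes the target amino acid and record which bases differ from the reference codon. The following is only a minimal sketch under some assumptions: it uses the standard genetic code, treats the codon as written on the coding strand, and the function name `codon_changes` is made up for illustration. Turning the codon offsets into genomic VCF records would still require the transcript's coordinates, reverse-complementing the alleles for genes on the reverse strand, and care with codons that span an exon boundary.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; amino acids are listed in TCAG order for
# codon positions 1, 2 and 3 (TTT, TTC, TTA, TTG, TCT, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_changes(ref_codon, target_aa):
    """List every codon encoding target_aa with its differences from ref_codon.

    Returns a list of (alt_codon, diffs), where diffs is a list of
    (offset_in_codon, ref_base, alt_base) tuples, sorted so that codons
    needing the fewest base changes come first.
    """
    results = []
    for codon, aa in CODON_TABLE.items():
        if aa != target_aa or codon == ref_codon:
            continue
        diffs = [
            (i, ref, alt)
            for i, (ref, alt) in enumerate(zip(ref_codon, codon))
            if ref != alt
        ]
        results.append((codon, diffs))
    return sorted(results, key=lambda item: (len(item[1]), item[0]))


# Example from the question: Pro (CCG) -> Leu at one codon position.
changes = codon_changes("CCG", "L")
for alt_codon, diffs in changes:
    described = ", ".join(f"pos {i + 1}: {ref}>{alt}" for i, ref, alt in diffs)
    print(f"CCG -> {alt_codon} ({len(diffs)} base change(s): {described})")
```

For CCG to leucine, only CTG is reachable by a single base change; the other five leucine codons need two or three changes within the codon, so in sequencing data they would appear as multi-nucleotide variants or as several adjacent single-nucleotide calls, depending on the variant caller.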

Answered by Emily_Ensembl on June 8, 2021
