AnswerBun.com

Obtaining HGDP data in FASTA format

Bioinformatics Asked on September 26, 2021

I need sample data from modern humans in FASTA format, just a few megabytes per individual. I currently use a script that downloads each CRAM file from ftp.1000genomes.ebi.ac.uk and then processes it to produce the FASTA file.
The problem is that CRAM files are large, slow to download, and slow to process. It takes days to get the samples.
Is there a better way to get these samples in FASTA format?

The script already uses samtools to retrieve only the part of the alignment file it needs, but that doesn't help much. The CRAM files are still gigabytes in size for the few megabytes of data that I need.

I have the same problem with data from the 1000 Genomes Project.
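
For context, the region-restricted retrieval described above might look roughly like the sketch below. It only builds and prints the two samtools command lines and does not run them. The URL, reference path, region, and output prefix are made-up placeholders, and the sketch assumes that a `.crai` index sits next to the remote CRAM file and that the matching reference FASTA is available locally.

```python
import shlex


def build_commands(cram_url, reference, region, out_prefix):
    """Return two samtools command lines as argument lists.

    1. `samtools view` pulls only `region` out of the (possibly remote)
       CRAM file and writes it as BAM. `-T` names the reference FASTA
       used to decode the CRAM records.
    2. `samtools fasta` converts the reads in that BAM file to FASTA;
       it writes to standard output, so the caller must redirect it.
    """
    bam_path = f"{out_prefix}.bam"
    view_cmd = [
        "samtools", "view",
        "-b",             # write BAM output
        "-T", reference,  # reference FASTA for CRAM decoding
        "-o", bam_path,   # output file
        cram_url,         # input CRAM (local path or URL)
        region,           # e.g. "chr20:1000000-2000000"
    ]
    fasta_cmd = ["samtools", "fasta", bam_path]
    return view_cmd, fasta_cmd


# Hypothetical values, for illustration only.
view_cmd, fasta_cmd = build_commands(
    "https://example.org/data/SAMPLE.cram",
    "reference.fa",
    "chr20:1000000-2000000",
    "SAMPLE_chr20",
)
print(shlex.join(view_cmd))
print(shlex.join(fasta_cmd) + " > SAMPLE_chr20.fa")
```

With an index available, samtools should only need to fetch the blocks that overlap the requested region, so the amount transferred depends mostly on the size of the region rather than the size of the whole file.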

One Answer

You can download HGDP data in FASTQ format here: https://www.internationalgenome.org/data-portal/data-collection/hgdp
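
Since the question asks for FASTA rather than FASTQ, a conversion step is still needed. The following is a minimal sketch of one way to do it in plain Python, keeping only the first few records so that just a small sample is written. It assumes the common four-lines-per-record FASTQ layout (no wrapped sequences); the function names and file paths are hypothetical.

```python
import gzip
from itertools import islice


def fastq_to_fasta(fastq_lines, max_records=None):
    """Yield FASTA lines from an iterable of FASTQ lines.

    Assumes each record is exactly four lines: header, sequence,
    separator, qualities. Stops after `max_records` records if given.
    """
    lines = iter(fastq_lines)
    count = 0
    while max_records is None or count < max_records:
        record = list(islice(lines, 4))
        if len(record) < 4:  # end of input (or a truncated record)
            break
        header = record[0].rstrip("\r\n")
        sequence = record[1].rstrip("\r\n")
        if not header.startswith("@"):
            raise ValueError(f"unexpected FASTQ header: {header!r}")
        yield ">" + header[1:]  # FASTA headers start with '>'
        yield sequence
        count += 1


def convert_file(fastq_gz_path, fasta_path, max_records=None):
    """Convert a gzipped FASTQ file to FASTA, optionally truncated."""
    with gzip.open(fastq_gz_path, "rt") as src, open(fasta_path, "w") as dst:
        for line in fastq_to_fasta(src, max_records):
            dst.write(line + "\n")


# Small in-memory demonstration.
demo = ["@read1 sample\n", "ACGT\n", "+\n", "IIII\n",
        "@read2\n", "GGCC\n", "+\n", "IIII\n"]
for line in fastq_to_fasta(demo, max_records=1):
    print(line)
```

Because `convert_file` stops reading after `max_records` records, only the beginning of a large FASTQ file has to be decompressed; for example, `convert_file("sample.fastq.gz", "sample.fa", max_records=10000)` would write the first 10,000 reads.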

Correct answer by Dan Bolser on September 26, 2021
