
Download multiple fastq files using fastq-dump

Bioinformatics Asked on August 22, 2021

I want to download the following fastq files at the same time, for use with Salmon:

 - SRR10611214
 - SRR10611215
 - SRR10611216
 - SRR10611217

Is there a way to do this using a bash for loop with fastq-dump or prefetch?

3 Answers

Make a list.txt file containing a single column of SRA run accessions to download.

Then (-S is short for --split-files, which writes paired reads to separate _1 and _2 files):

for i in $(cat list.txt); do echo "$i"; date; fasterq-dump -S "$i"; done
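Since the question also mentions prefetch, here is a minimal sketch of the same loop that fetches each run with prefetch before converting it. The function name download_accessions and the DRY_RUN switch are hypothetical, added only for illustration; it assumes prefetch and fasterq-dump from sra-tools are on your PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of sra-tools): for each accession in a
# one-column file, run prefetch and then fasterq-dump --split-files.
# Set DRY_RUN=1 to print the commands instead of running them.
download_accessions() {
  local list=$1 acc
  while IFS= read -r acc || [ -n "$acc" ]; do
    [ -z "$acc" ] && continue            # skip blank lines
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "prefetch $acc && fasterq-dump --split-files $acc"
    else
      echo "$acc"; date
      # Convert only if the download succeeded
      prefetch "$acc" && fasterq-dump --split-files "$acc"
    fi
  done < "$list"
}
```

Running DRY_RUN=1 download_accessions list.txt first shows what would be executed; dropping DRY_RUN=1 performs the downloads.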

A convenient way to build the list is to use NCBI's web interface to find SRA samples of interest, download the results and open them in Excel, then copy the single column containing the SRA run accessions and paste it into list.txt using a text editor such as vim.
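A list pasted from a spreadsheet can pick up a header row, Windows line endings, or repeated accessions. The sketch below cleans it up before downloading. The function name clean_accessions and the file name raw.txt are hypothetical, and the pattern assumes run accessions of the form SRR, ERR, or DRR followed by digits.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only lines that look like run accessions,
# strip carriage returns left by Windows tools, and drop duplicates.
clean_accessions() {
  tr -d '\r' < "$1" | grep -E '^[SED]RR[0-9]+$' | sort -u
}
```

For example, clean_accessions raw.txt > list.txt writes a sorted list with each accession appearing once.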

After downloading, it can be nice to include the "R" in the read suffix:

for i in *_1.fastq; do mv "$i" "${i%_1.fastq}_R1.fastq"; done

for i in *_2.fastq; do mv "$i" "${i%_2.fastq}_R2.fastq"; done

Then compress with pigz, a parallel implementation of gzip:

pigz *fastq
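The renaming step can also be wrapped in a small function that handles both mates and does nothing when no files match. This is only a sketch; rename_reads is a hypothetical name, and it assumes the _1.fastq / _2.fastq naming that fasterq-dump produces with split files.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename X_1.fastq -> X_R1.fastq and X_2.fastq -> X_R2.fastq
# in the given directory (default: current directory).
rename_reads() {
  local dir=${1:-.} n f
  for n in 1 2; do
    for f in "$dir"/*_"$n".fastq; do
      [ -e "$f" ] || continue            # the glob matched nothing
      mv -- "$f" "${f%_${n}.fastq}_R${n}.fastq"
    done
  done
}
```

Call it as rename_reads (current directory) or rename_reads path/to/fastq before compressing.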

If needed, fasterq-dump can be installed with conda as part of sra-tools:

conda install -c bioconda sra-tools

Answered by Stuber on August 22, 2021

You can use parallel.

parallel -j 3 fastq-dump {} ::: SRR10611214 SRR10611215 SRR10611216 SRR10611217

The -j option sets the maximum number of jobs to run in parallel, so in this case at most 3 accessions are handled at the same time.

How many jobs you can run in parallel depends on your machine.

You can also take a look at parallel-fastq-dump.
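If GNU parallel is not installed, a similar effect can be sketched with xargs -P, which is a different tool from the one used above (and a GNU/BSD extension rather than POSIX). The function name run_parallel is hypothetical; it reads accessions from standard input and runs the given command on each one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run "$@ <accession>" for every accession on stdin,
# with at most $1 processes running at the same time.
run_parallel() {
  local jobs=$1
  shift
  xargs -P "$jobs" -n 1 "$@"
}

# Example (not run here, needs sra-tools and a list.txt file):
#   run_parallel 3 fastq-dump < list.txt
```

Replacing the command with echo is a cheap way to check what would be run before starting real downloads.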

Answered by Mr_Z on August 22, 2021

Sample code is given in the Salmon documentation as follows:

#!/bin/bash
mkdir data
cd data
for i in `seq 25 40`; 
do 
  mkdir DRR0161${i}; 
  cd DRR0161${i}; 
  wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/DRR016/DRR0161${i}/DRR0161${i}_1.fastq.gz; 
  wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/DRR016/DRR0161${i}/DRR0161${i}_2.fastq.gz; 
  cd ..; 
done
cd .. 

This could be modified as follows for the accessions in the question (SRR10611214 to SRR10611217):

#!/bin/bash
mkdir data
cd data
for i in `seq 14 17`; 
do 
  wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR106/0${i}/SRR106112${i}/SRR106112${i}_1.fastq.gz; 
  wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR106/0${i}/SRR106112${i}/SRR106112${i}_2.fastq.gz; 
done
cd .. 

You can save the code as a shell script and run it from the Linux terminal, for example: bash download_fastq.sh
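The two scripts differ in the directory layout of the URLs: the 9-character DRR accessions sit directly under DRR016/, while the 11-character SRR accessions have an extra 0XX/ level built from their last two digits. A sketch that builds the URL from an accession could look like this. The function name ena_fastq_url is hypothetical, and the 10- and 12-character cases are an assumption based on ENA's usual layout rather than something shown above, so check them against a real URL before relying on them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the ENA FTP URL for one mate (1 or 2) of a run.
# The 9- and 11-character cases match the URLs in the scripts above;
# the 10- and 12-character cases are assumed from ENA's layout.
ena_fastq_url() {
  local acc=$1 mate=$2 sub
  case ${#acc} in
    9)  sub="" ;;
    10) sub="00${acc: -1}/" ;;
    11) sub="0${acc: -2}/" ;;
    12) sub="${acc: -3}/" ;;
    *)  echo "unsupported accession: $acc" >&2; return 1 ;;
  esac
  echo "ftp://ftp.sra.ebi.ac.uk/vol1/fastq/${acc:0:6}/${sub}${acc}/${acc}_${mate}.fastq.gz"
}

# Example (not run here): wget "$(ena_fastq_url SRR10611214 1)"
```

This removes the need to hard-code the seq range and lets the same loop read accessions from a list file.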

Answered by Balan on August 22, 2021
