SRA to FASTQ Conversion Script

This Bash script automates the process of converting SRA (Sequence Read Archive) files to FASTQ format using the fasterq-dump tool from the SRA Toolkit. It processes multiple SRA files based on a provided list of SRR IDs.

Features

Processes multiple SRA files in batch based on a list of SRR IDs
Checks for existing FASTQ files to avoid redundant processing
Creates separate output directories for each SRA accession
Uses fasterq-dump for efficient SRA to FASTQ conversion
Handles split-files output for paired-end sequencing data
Provides detailed error messages for troubleshooting

Prerequisites

Bash shell
SRA Toolkit installed with fasterq-dump accessible in the system PATH
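
Whether fasterq-dump is reachable can be checked before starting a long batch. The helper below is a minimal sketch, not part of the script itself; the function name require_tool is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not taken from the script): succeeds if the named
# tool can be found on PATH, otherwise prints a hint and returns 1.
require_tool() {
    local tool="$1"
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "Error: '$tool' not found in PATH. Is the SRA Toolkit installed?" >&2
        return 1
    fi
    return 0
}

# Typical use at the top of a script:
#   require_tool fasterq-dump || exit 1
```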

Usage

./script_name.sh <srr_list_file> <sra_file_path> <fastq_output_path>
<srr_list_file>: A text file containing SRR IDs, one per line
<sra_file_path>: Directory containing SRA files organized in subdirectories
<fastq_output_path>: Directory where FASTQ files will be saved
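
The three-argument interface above can be enforced with a short check like the following. This is a sketch of one possible approach, assuming only the argument count is validated; the function name check_args and the paths in the example call are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: verifies that exactly three arguments were given,
# matching <srr_list_file> <sra_file_path> <fastq_output_path>.
check_args() {
    if [ "$#" -ne 3 ]; then
        echo "Usage: $0 <srr_list_file> <sra_file_path> <fastq_output_path>" >&2
        return 1
    fi
    return 0
}

# Typical use at the top of a script:
#   check_args "$@" || exit 1
#
# Example invocation (paths are placeholders):
#   ./script_name.sh srr_list.txt /data/sra /data/fastq
```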

Input File Format

The srr_list_file should be a plain text file with one SRR ID per line, for example:

SRR1234567
SRR2345678
SRR3456789
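
Reading such a list line by line can be sketched as below. The function name read_srr_ids is hypothetical, and the handling of blank lines, stray whitespace, and Windows line endings is an assumption added for robustness, not something the script is documented to do.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints one cleaned SRR ID per line from a list file.
read_srr_ids() {
    local list_file="$1"
    local srr_id
    # "|| [ -n ... ]" keeps a final line that lacks a trailing newline.
    while IFS= read -r srr_id || [ -n "$srr_id" ]; do
        # Assumption: strip whitespace, including carriage returns.
        srr_id=$(printf '%s' "$srr_id" | tr -d '[:space:]')
        # Skip blank lines.
        [ -z "$srr_id" ] && continue
        printf '%s\n' "$srr_id"
    done < "$list_file"
}

# Example:
#   read_srr_ids srr_list.txt
```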

Directory Structure

Input SRA directory (sra_file_path) should have this structure:

sra_file_path/
├── SRR1234567/
│   └── SRR1234567.sra
├── SRR2345678/
│   └── SRR2345678.sra
└── ...

Output directory (fastq_output_path) will be structured as:

fastq_output_path/
├── SRR1234567/
│   ├── SRR1234567_1.fastq
│   └── SRR1234567_2.fastq (if paired-end)
├── SRR2345678/
│   ├── SRR2345678_1.fastq
│   └── SRR2345678_2.fastq (if paired-end)
└── ...
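
The layouts above map an SRR ID to its input and output paths in a predictable way. The helpers below are an illustrative sketch of that mapping; their names are hypothetical and the /data paths in the example are placeholders.

```bash
#!/usr/bin/env bash
# Hypothetical helpers that derive paths from the layouts shown above.

# Path of the SRA file: <sra_file_path>/<SRR>/<SRR>.sra
sra_file_for() {
    local sra_root="$1" srr_id="$2"
    printf '%s/%s/%s.sra\n' "$sra_root" "$srr_id" "$srr_id"
}

# Output subdirectory: <fastq_output_path>/<SRR>
fastq_dir_for() {
    local out_root="$1" srr_id="$2"
    printf '%s/%s\n' "$out_root" "$srr_id"
}

# Example:
#   sra_file_for /data/sra SRR1234567
#   fastq_dir_for /data/fastq SRR1234567
```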

How it works

The script checks for the correct number of input arguments.
It reads the SRR list file line by line.
For each SRR ID:
It checks whether the output subdirectory already contains files; if it does, the SRR ID is skipped to avoid redundant processing.
It finds the corresponding SRA directory and file.
If the SRA file exists, it runs fasterq-dump to convert it to FASTQ format.
The resulting FASTQ files are saved in the corresponding output subdirectory.
The script continues until all SRR IDs in the list are processed.
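
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the script's actual code: the function names convert_one and convert_all are hypothetical, and the exact fasterq-dump options the script passes are not shown in this README, so --split-files and --outdir are used here as a plausible choice.

```bash
#!/usr/bin/env bash
# Hypothetical per-ID step: skip, validate, then convert.
convert_one() {
    local srr_id="$1" sra_root="$2" out_root="$3"
    local sra_dir="$sra_root/$srr_id"
    local sra_file="$sra_dir/$srr_id.sra"
    local out_dir="$out_root/$srr_id"

    # Skip IDs whose output subdirectory already holds files.
    if [ -d "$out_dir" ] && [ -n "$(ls -A "$out_dir" 2>/dev/null)" ]; then
        echo "Skipping $srr_id: output already present in $out_dir"
        return 0
    fi
    if [ ! -d "$sra_dir" ]; then
        echo "Error: SRA directory not found: $sra_dir" >&2
        return 1
    fi
    if [ ! -f "$sra_file" ]; then
        echo "Error: SRA file not found: $sra_file" >&2
        return 1
    fi

    mkdir -p "$out_dir"
    # Assumed options; stdin is redirected so the tool cannot consume
    # the SRR list that the calling loop is reading.
    fasterq-dump "$sra_file" --split-files --outdir "$out_dir" < /dev/null
}

# Hypothetical driver: process every ID in the list, continuing on errors.
convert_all() {
    local list_file="$1" sra_root="$2" out_root="$3"
    local srr_id
    while IFS= read -r srr_id || [ -n "$srr_id" ]; do
        [ -z "$srr_id" ] && continue
        convert_one "$srr_id" "$sra_root" "$out_root" ||
            echo "Continuing after failure on $srr_id" >&2
    done < "$list_file"
}

# Example:
#   convert_all srr_list.txt /data/sra /data/fastq
```

Creating the output subdirectory only after the SRA file has been found keeps failed IDs from leaving empty directories behind.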

Error Handling

The script checks for the correct number of arguments.
It verifies if the SRA directory and file exist for each SRR ID.
It provides informative messages for missing directories or files.
It skips processing if FASTQ files already exist for an SRR ID.

Note

Ensure that your SRA files are organized in subdirectories named after their SRR IDs within the sra_file_path directory.

Repository: LauYuXuan/FASTQ_processing
