🌶️ sracha 🌶️

Fast SRA downloader and FASTQ converter, written in pure Rust.

Features

  • Parallel downloads -- chunked HTTP Range requests with multiple connections
  • Native VDB parsing -- pure Rust, zero C dependencies
  • Integrated pipeline -- download, convert, and compress in one command
  • cSRA decode -- reference-compressed / aligned SRA archives decoded in pure Rust
  • ENA fast path -- --prefer-ena pulls pre-computed FASTQ.gz from ENA when available
  • Project-level accessions -- pass a BioProject (PRJNA) or study (SRP) to download all runs
  • Accession lists -- batch download from a file with --accession-list
  • Parallel gzip or zstd -- pigz-style block compression via rayon
  • FASTA output -- drop quality scores with --fasta
  • SRA and SRA-lite -- full quality or simplified quality scores
  • Split modes -- split-3, split-files, split-spot, interleaved
  • Stdout streaming -- pipe interleaved FASTQ to downstream tools with -Z
  • Resumable downloads -- automatically resumes interrupted transfers
  • File validation -- verify SRA file integrity with sracha validate
  • VDB introspection -- sracha vdb inspects local .sra files (tables, columns, metadata, schema) as a pure-Rust replacement for vdb-dump
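To make the difference between FASTQ and FASTA output concrete, here is a minimal sketch of how one read could be rendered in each format. It is an illustration only, not sracha's actual code: the `Record` type and both function names are hypothetical, and the exact header line sracha writes is not specified here. The sketch relies only on the standard layouts: a FASTQ record is four lines (header, sequence, `+`, quality), and a FASTA record keeps the header and sequence and drops the quality string.

```rust
/// Hypothetical in-memory view of one read (not a sracha type).
pub struct Record<'a> {
    pub name: &'a str,
    pub sequence: &'a str,
    pub quality: &'a str,
}

/// Render a four-line FASTQ record: `@name`, sequence, `+`, quality.
pub fn format_fastq(rec: &Record) -> String {
    format!("@{}\n{}\n+\n{}\n", rec.name, rec.sequence, rec.quality)
}

/// Render a two-line FASTA record: `>name`, sequence.
/// The quality string is simply not emitted.
pub fn format_fasta(rec: &Record) -> String {
    format!(">{}\n{}\n", rec.name, rec.sequence)
}
```

Given a record with name `r1`, sequence `ACGT`, and quality `IIII`, `format_fastq` yields four lines and `format_fasta` yields two.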

How it works

sracha get runs the full pipeline in one command:

  1. Resolve -- looks up the accession via direct S3 URL (with SDL API fallback)
  2. Download -- fetches the .sra file with parallel chunked HTTP Range requests
  3. Parse -- reads the KAR archive and decodes VDB columns (READ, QUALITY, READ_LEN, NAME)
  4. Output -- formats FASTQ (or FASTA) records and compresses with parallel gzip/zstd
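The download step relies on HTTP Range requests, so a small sketch of the chunking arithmetic may help. The code below splits a file of known length into inclusive byte ranges and formats the matching `Range` header values. It is a sketch under stated assumptions, not sracha's implementation: `chunk_ranges`, `range_header`, and the chunk size are hypothetical, and a real downloader would also issue the requests concurrently and handle retries.

```rust
/// Split `total_len` bytes into inclusive `(start, end)` ranges of at most
/// `chunk_size` bytes each. Hypothetical helper, not part of sracha.
pub fn chunk_ranges(total_len: u64, chunk_size: u64) -> Vec<(u64, u64)> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let mut ranges = Vec::new();
    let mut start = 0u64;
    while start < total_len {
        // The last chunk may be shorter than `chunk_size`.
        let end = start.saturating_add(chunk_size).min(total_len) - 1;
        ranges.push((start, end));
        start = end + 1;
    }
    ranges
}

/// Format the value of an HTTP `Range` header for one chunk.
/// Byte ranges are inclusive on both ends, e.g. `bytes=0-499`.
pub fn range_header(start: u64, end: u64) -> String {
    format!("bytes={start}-{end}")
}
```

Each range can then be fetched on its own connection and written at its `start` offset in the output file. Resuming an interrupted transfer amounts to re-requesting only the ranges that have not been completed.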

Quick start

# Download, convert, and compress in one shot
sracha get SRR28588231

# Download all runs from a BioProject
sracha get PRJNA675068

# Batch download from an accession list
sracha get --accession-list SRR_Acc_List.txt

# Just download
sracha fetch SRR28588231

# Convert a local .sra file
sracha fastq SRR28588231.sra

# Show accession info
sracha info SRR28588231

# Validate a downloaded file
sracha validate SRR28588231.sra

See the Getting Started guide for more examples, or the CLI Reference for all options.

Installation

From binary releases

Download pre-built binaries from the releases page.

From source

Requires Rust 1.92+.

cargo install --git https://github.com/rnabioco/sracha-rs sracha

With Bioconda

pixi add bioconda::sracha

With containers

Because sracha is on Bioconda, BioContainers automatically publishes a Docker/Singularity image for every release — no local build required.

# Docker / Podman
docker run --rm quay.io/biocontainers/sracha:0.3.7--h54198d6_0 sracha --help

# Singularity / Apptainer
singularity run \
  https://depot.galaxyproject.org/singularity/sracha:0.3.7--h54198d6_0 sracha --help

The tags above are examples — check quay.io for the latest <version>--<build> tag and substitute it in.

In Nextflow, point a process at the image directly or let the conda directive resolve it:

process SRACHA_GET {
    container 'quay.io/biocontainers/sracha:0.3.7--h54198d6_0'
    // or: conda 'bioconda::sracha=0.3.7'
    // ...
}

Acknowledgments

sracha builds on the Sequence Read Archive, maintained by the National Center for Biotechnology Information at the National Library of Medicine. The SRA and its toolchain are public-domain software developed by U.S. government employees — our tax dollars at work. Special thanks to Kenneth Durbrow (@durbrow) and the SRA Toolkit team for building and maintaining the infrastructure that makes projects like this possible.

This project wouldn't exist without NCBI's open infrastructure: the VDB/KAR format, the SDL locate API, EUtils, and public S3 hosting of sequencing data. sracha aims to make it easier for the community to build on that foundation.