CLI reference
Global options
These options can be used with any subcommand.
| Option | Description |
|---|---|
| -v, --verbose | Increase log verbosity (-v, -vv, -vvv) |
| -q, --quiet | Suppress all output except errors |
| --version | Print version |
| -h, --help | Print help |
Accession types
sracha accepts three types of accessions:
| Type | Prefixes | Example | Description |
|---|---|---|---|
| Run | SRR, ERR, DRR | SRR2584863 | Single sequencing run (directly downloadable) |
| Study | SRP, ERP, DRP | SRP123456 | Study containing multiple runs |
| BioProject | PRJNA, PRJEB, PRJDB | PRJNA675068 | BioProject containing multiple runs |
Study and BioProject accessions are automatically resolved to their constituent run accessions via the NCBI EUtils API.
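As an illustration of the prefix table above, the following sketch classifies an accession string by its prefix. The helper name `classify_accession` is hypothetical, and the rule that the remainder must be all digits is an assumption made for this example; it is not a description of how sracha itself validates accessions.

```python
# Prefixes copied from the "Accession types" table above.
ACCESSION_PREFIXES = {
    "run": ("SRR", "ERR", "DRR"),
    "study": ("SRP", "ERP", "DRP"),
    "bioproject": ("PRJNA", "PRJEB", "PRJDB"),
}


def classify_accession(accession: str) -> str:
    """Return 'run', 'study', or 'bioproject' for a recognized accession.

    Hypothetical helper. Assumption: a valid accession is a known prefix
    followed by one or more digits.
    """
    for kind, prefixes in ACCESSION_PREFIXES.items():
        for prefix in prefixes:
            rest = accession[len(prefix):]
            if accession.startswith(prefix) and rest.isdigit():
                return kind
    raise ValueError(f"unsupported accession: {accession!r}")
```

A script could use a check like this to decide up front whether an accession will expand to multiple runs (study, BioProject) before handing it to sracha.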
Accession lists
The get, fetch, and info commands accept --accession-list to read
accessions from a file (one per line). Blank lines and lines starting with
# are skipped. This can be combined with positional arguments.
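The skipping rules above can be sketched in a few lines of Python, for example to preview what an accession list contains before passing it to sracha. The function name `read_accession_list` is hypothetical, and trimming surrounding whitespace is an assumption of this sketch rather than documented behavior.

```python
from typing import Iterable, List


def read_accession_list(lines: Iterable[str]) -> List[str]:
    """Return the accessions in an accession-list file, one per line.

    Blank lines and lines starting with '#' are skipped, as described above.
    Assumption: leading/trailing whitespace on a line is ignored.
    """
    accessions = []
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        accessions.append(entry)
    return accessions
```

With a file on disk, this would be called as `read_accession_list(open("accessions.txt"))`, where `accessions.txt` is a placeholder file name.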
sracha get
Download, convert, and compress SRA data in one shot.
Arguments
| Argument | Description |
|---|---|
| ACCESSION | One or more accessions (run, study, or BioProject) |
Options
Input / output
| Option | Default | Description |
|---|---|---|
| --accession-list <FILE> | | Read accessions from a file (one per line) |
| -O, --output-dir <DIR> | . | Output directory |
| --folder-per-accession | | Place each accession's outputs (FASTQ + metadata sidecar + temp SRA + completion marker + .sracha-progress + any --keep-sra artifact) inside its own <output-dir>/<accession>/ subdirectory. The shared sracha-stats.jsonl audit log stays at the top level |
| --format <FORMAT> | sra | Download format: sra (full quality) or sralite (simplified quality, smaller) |
| -f, --force | | Overwrite existing files |
Sequence output
| Option | Default | Description |
|---|---|---|
| --split <MODE> | split-3 | Split mode: split-3, split-files, split-spot, interleaved |
| --paired-suffix <STYLE> | numeric | Suffix style for paired/split FASTQ outputs: numeric (_1/_2, matches fasterq-dump and ENA) or r (_R1/_R2, matches Illumina BCL output) |
| --fasta | | Output FASTA instead of FASTQ (drops quality scores) |
| --seq-defline <TEMPLATE> | | Custom defline template, fasterq-dump --seq-defline syntax (see Custom deflines) |
| --min-read-len <N> | | Minimum read length filter |
| --include-technical | | Include technical reads (skipped by default) |
| --metadata <FORMAT> | | Write a <accession>.metadata.{tsv,json} sidecar alongside each FASTQ output, capturing BioSample/SAMN, Sample/SRS, BioProject, library strategy/source/selection/layout, instrument, experiment, study, scientific name, tax id, bases, and release dates from the EUtils RunInfo CSV. Values: tsv, json, both |
| -Z, --stdout | | Write to stdout (stream interleaved FASTQ, auto-delete temp SRA) |
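To make the --metadata naming pattern concrete, here is a small sketch that computes where the sidecar files would be expected for one accession. The helper `metadata_sidecars` is hypothetical, and the sketch assumes that "alongside each FASTQ output" means directly in the output directory, or in the per-accession subdirectory when --folder-per-accession is used.

```python
from pathlib import Path
from typing import List


def metadata_sidecars(output_dir: str, accession: str, fmt: str,
                      folder_per_accession: bool = False) -> List[Path]:
    """Expected <accession>.metadata.{tsv,json} paths for --metadata <fmt>.

    Hypothetical helper. Assumption: sidecars sit in the output directory,
    or in <output-dir>/<accession>/ with --folder-per-accession.
    """
    if fmt not in ("tsv", "json", "both"):
        raise ValueError(f"unknown metadata format: {fmt!r}")
    base = Path(output_dir)
    if folder_per_accession:
        base = base / accession
    extensions = ("tsv", "json") if fmt == "both" else (fmt,)
    return [base / f"{accession}.metadata.{ext}" for ext in extensions]
```

A pipeline step could use paths like these to check that the sidecars exist after a run.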
Compression
| Option | Default | Description |
|---|---|---|
| --no-gzip | | Disable gzip compression (compressed by default) |
| --gzip-level <N> | 1 | Gzip compression level (1-9) |
| --zstd | | Use zstd compression instead of gzip |
| --zstd-level <N> | 3 | Zstd compression level (1-22) |
Performance
| Option | Default | Description |
|---|---|---|
| -t, --threads <N> | 8 | Thread count for decode and compression |
| --connections <N> | 8 | HTTP connections per file |
Download behavior
| Option | Default | Description |
|---|---|---|
| --no-resume | | Disable download resume (re-download from scratch) |
| --dry-run | | Resolve accessions and print what would be downloaded, then exit without downloading or decoding |
| --dry-run-format <FORMAT> | tsv | Output format for --dry-run: tsv or json |
| -y, --yes | | Confirm project downloads and large downloads (>100 GiB) |
| --prefer-sdl | | Skip direct S3 and resolve via the SDL API |
| --prefer-ena | | Try ENA FASTQ mirrors first; fall back to the NCBI SRA path if ENA has no FASTQ for the accession or its output config is incompatible with the requested split/compression |
| --no-runinfo | | Skip EUtils RunInfo API call (derive read structure from VDB metadata) |
| --prefetch-depth <N> | 2 | Number of accessions to download ahead of the decoder. Larger values hide slow networks behind decode at the cost of one extra temp SRA file per step. Multi-accession get only |
| --keep-sra | | Keep the downloaded SRA file in the output directory instead of deleting it after decode |
| --no-progress | | Disable progress bar |
| --no-strict | | Downgrade strict-fatal data-integrity anomalies (quality length mismatch, invalid quality bytes, quality overruns, paired-spot violations) from hard failures to warnings. Strict is the default. Benign-fallback counters (SRA-lite all-zero quality blobs, truncated-spot recovery) stay informational either way |
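When sracha get is driven from a script, it can help to assemble the argument list in one place. The sketch below builds such a list using only options from the tables above; the function `build_get_command` and its parameter names are hypothetical, and it covers just a handful of the available options.

```python
from typing import List, Optional, Sequence

SPLIT_MODES = ("split-3", "split-files", "split-spot", "interleaved")


def build_get_command(accessions: Sequence[str], output_dir: str = ".",
                      split: str = "split-3", threads: Optional[int] = None,
                      zstd: bool = False, dry_run: bool = False) -> List[str]:
    """Build an argument list for `sracha get` (hypothetical helper)."""
    if not accessions:
        raise ValueError("at least one accession is required")
    if split not in SPLIT_MODES:
        raise ValueError(f"unknown split mode: {split!r}")
    cmd = ["sracha", "get", "--output-dir", str(output_dir), "--split", split]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    if zstd:
        cmd.append("--zstd")
    if dry_run:
        cmd.append("--dry-run")
    # Positional accessions go last, after all options.
    cmd += list(accessions)
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Starting with `dry_run=True` is a cautious way to see what would be downloaded before committing to a large transfer.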
sracha fetch
Download SRA files without conversion.
Arguments
| Argument | Description |
|---|---|
| ACCESSION | One or more accessions (run, study, or BioProject) |
Options
| Option | Default | Description |
|---|---|---|
| --accession-list <FILE> | | Read accessions from a file (one per line) |
| -O, --output-dir <DIR> | . | Output directory |
| --format <FORMAT> | sra | Download format: sra (full quality) or sralite (simplified quality, smaller) |
| --connections <N> | 8 | HTTP connections per file |
| --no-validate | | Skip MD5 verification after download (verification is on by default) |
| -f, --force | | Overwrite existing files |
| --no-resume | | Disable download resume (re-download from scratch) |
| -y, --yes | | Confirm project downloads and large downloads (>100 GiB) |
| --prefer-sdl | | Skip direct S3 and resolve via the SDL API |
| --prefer-ena | | Fetch pre-computed FASTQ.gz from ENA's mirror instead of the SRA binary. Falls back to the NCBI path when ENA has no FASTQ for an accession |
| --no-progress | | Disable progress bar |
sracha fastq
Convert SRA files to FASTQ (or FASTA).
Arguments
| Argument | Description |
|---|---|
| INPUT | Local .sra file path(s) (from sracha fetch) |
Options
Sequence output
| Option | Default | Description |
|---|---|---|
| --split <MODE> | split-3 | Split mode: split-3, split-files, split-spot, interleaved |
| --paired-suffix <STYLE> | numeric | Suffix style for paired/split FASTQ outputs: numeric (_1/_2, matches fasterq-dump and ENA) or r (_R1/_R2, matches Illumina BCL output) |
| --fasta | | Output FASTA instead of FASTQ (drops quality scores) |
| --seq-defline <TEMPLATE> | | Custom defline template, fasterq-dump --seq-defline syntax (see Custom deflines) |
| --min-read-len <N> | | Minimum read length filter |
| --include-technical | | Include technical reads (skipped by default) |
| -Z, --stdout | | Write to stdout (implies --no-progress) |
Compression
| Option | Default | Description |
|---|---|---|
| --no-gzip | | Disable gzip compression (compressed by default) |
| --gzip-level <N> | 1 | Gzip compression level (1-9) |
| --zstd | | Use zstd compression instead of gzip |
| --zstd-level <N> | 3 | Zstd compression level (1-22) |
Other
| Option | Default | Description |
|---|---|---|
| -t, --threads <N> | 8 | Thread count for decode and compression |
| -O, --output-dir <DIR> | . | Output directory |
| --folder-per-accession | | Place each accession's outputs inside its own <output-dir>/<accession>/ subdirectory |
| -f, --force | | Overwrite existing files |
| --no-progress | | Disable progress bar |
| --no-strict | | Downgrade strict-fatal data-integrity anomalies (quality length mismatch, invalid quality bytes, quality overruns, paired-spot violations) from hard failures to warnings. Strict is the default. Benign-fallback counters (SRA-lite all-zero quality blobs, truncated-spot recovery) stay informational either way |
sracha info
Show accession metadata, or inspect a local .sra file.
Arguments
| Argument | Description |
|---|---|
| ACCESSION_OR_PATH | Accession (run, study, or BioProject) or a local .sra file path |
Options
| Option | Default | Description |
|---|---|---|
| --accession-list <FILE> | | Read accessions from a file (one per line) |
| --prefer-ena | | Also fetch ENA's FASTQ filereport and show it alongside the NCBI info |
| --format <FMT> | table | Output format: table (human-readable), tsv, or csv |
For accessions, displays file sizes, available formats, download mirrors, and quality information. Study and BioProject accessions are resolved to runs first.
For local file paths, opens the KAR archive directly (no network) and
prints its table of contents, schema, and metadata. Paths starting with
~/ are expanded to $HOME.
--format tsv and --format csv emit a single header row followed by one
record per accession. Columns: accession, archive_type (SRA/cSRA),
layout (SINGLE/PAIRED/N-read), nreads, spots, size_bytes,
platform, md5. Missing fields are empty strings. Local .sra paths and
--prefer-ena are ignored with a stderr warning in these formats.
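The machine-readable formats lend themselves to scripting. The sketch below parses --format tsv output into dictionaries, using the column names listed above. The function `parse_info_tsv` is hypothetical, and treating nreads, spots, and size_bytes as integers (with empty strings mapped to None) is an assumption of this example.

```python
import csv
import io
from typing import Dict, List

# Column names as listed in the description above.
INFO_COLUMNS = ["accession", "archive_type", "layout", "nreads",
                "spots", "size_bytes", "platform", "md5"]
# Assumption: these columns hold integers when present.
INT_COLUMNS = ("nreads", "spots", "size_bytes")


def parse_info_tsv(text: str) -> List[Dict[str, object]]:
    """Parse `sracha info --format tsv` output (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    records = []
    for row in reader:
        record: Dict[str, object] = dict(row)
        for key in INT_COLUMNS:
            # Missing fields are empty strings; map them to None.
            record[key] = int(row[key]) if row[key] else None
        records.append(record)
    return records
```

The input text could come from `subprocess.run(["sracha", "info", "--format", "tsv", accession], capture_output=True, text=True).stdout`. Summing `size_bytes` over the records then gives a rough total download size, skipping any record where the value is None.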
sracha validate
Validate SRA file integrity by decoding all records and checking for errors.
Arguments
| Argument | Description |
|---|---|
| INPUT | SRA file(s) to validate |
Options
| Option | Default | Description |
|---|---|---|
| -t, --threads <N> | 8 | Thread count for decode |
| --no-progress | | Disable progress bar |
| --md5 <HASH> | | Expected MD5 hex; fail on mismatch. With multiple inputs every file must match |
| --offline | | Skip the SDL lookup for the expected MD5 (air-gapped use) |
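For context on the --md5 option, the following sketch shows one way to compute a file's MD5 hex digest independently with Python's standard library, for instance to cross-check a value recorded elsewhere. The function name `file_md5` and the chunk size are choices made for this example; it does not describe how sracha computes its checksum internally.

```python
import hashlib


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, reading it in chunks.

    Hypothetical helper; chunked reads keep memory use flat for large files.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned string is lowercase hexadecimal, 32 characters long.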
sracha vdb
Inspect the VDB structure of a local .sra file. Pure-Rust
replacement for vdb-dump — no network, no C FFI, no subprocess to
sra-tools.
sracha vdb info
Print summary metadata: schema, platform, table row counts, load timestamp, and formatter / loader / update software events.
| Option | Default | Description |
|---|---|---|
| --json | | Emit a single JSON object instead of human-readable text |
sracha vdb tables
List tables in the archive. Only meaningful for Database archives (cSRA / aligned); flat Tables print a note.
sracha vdb columns
List columns in a table.
| Option | Default | Description |
|---|---|---|
| -T, --table <NAME> | SEQUENCE (or first) | Table to inspect |
| -s, --stats | | Show row counts, blob counts, and first-blob header stats per column |
sracha vdb meta
Dump the metadata tree (schema / stats / LOAD / SOFTWARE nodes).
| Option | Default | Description |
|---|---|---|
| -T, --table <NAME> | SEQUENCE | Table whose metadata tree to walk |
| -P, --path <PATH> | | Restrict to a sub-path like STATS/TABLE or LOAD |
| -d, --depth <N> | | Limit recursion depth below the chosen sub-path |
| --db | | Walk the database-level tree (root md/cur) instead of a table tree |
sracha vdb schema
Print the embedded schema text.
sracha vdb id-range
Print the first row id and row count for a table/column.
| Option | Default | Description |
|---|---|---|
| -T, --table <NAME> | SEQUENCE (or first) | Table to inspect |
| -C, --column <NAME> | first alphabetically | Column to read |
Custom deflines
--seq-defline <TEMPLATE> (available on both sracha get and sracha fastq)
replaces the built-in FASTQ/FASTA header with a template, using the same
$-variable syntax as fasterq-dump's --seq-defline:
| Variable | Expands to |
|---|---|
| $ac | Run accession (e.g. SRR12345678) |
| $si | Spot id: 1-based row number |
| $ri | Read id: 1-based read number within the spot |
| $sn | Spot name from the NAME column (empty when the archive has none) |
| $rl | Read length in bases |
| $$ | A literal $ |
Notes:
- A leading @ (or > for FASTA) is optional; it is stripped, and sracha adds the correct record prefix automatically.
- The + (quality) line always mirrors the sequence defline. There is no separate --qual-defline.
- When a template is given, the split-spot / interleaved modes do not add their implicit <spot>/<mate> suffix; use $ri in the template to distinguish mates instead.
- $sg (spot-group) is not supported; a template using it is rejected at startup.
- Reference-compressed (cSRA) archives have no NAME column, so $sn expands to empty there.
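To illustrate how the variables and notes above fit together, here is a minimal Python sketch of template expansion. It is an illustration under stated assumptions, not sracha's implementation: the function `expand_defline` is hypothetical, and unrecognized `$` sequences are left untouched, which is an assumption of this sketch.

```python
import re

# Matches the documented variables, the literal "$$", and the unsupported "$sg".
_VARIABLE = re.compile(r"\$(\$|ac|si|ri|sn|rl|sg)")


def expand_defline(template: str, *, ac: str, si: int, ri: int,
                   sn: str, rl: int, fasta: bool = False) -> str:
    """Expand a --seq-defline style template for one read (hypothetical)."""
    # A leading record prefix is optional: strip it, then re-add the right one.
    if template[:1] in ("@", ">"):
        template = template[1:]
    # Reject $sg before doing any work, mirroring the "rejected at startup" note.
    if any(m.group(1) == "sg" for m in _VARIABLE.finditer(template)):
        raise ValueError("$sg (spot-group) is not supported")
    values = {"$": "$", "ac": ac, "si": str(si), "ri": str(ri),
              "sn": sn, "rl": str(rl)}
    body = _VARIABLE.sub(lambda m: values[m.group(1)], template)
    return (">" if fasta else "@") + body
```

For example, `expand_defline("$ac.$si/$ri length=$rl", ac="SRR12345678", si=1, ri=2, sn="", rl=150)` yields `@SRR12345678.1/2 length=150`. Including `$ri` in the template is what keeps mates distinguishable in split-spot and interleaved output.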
Coming from sra-tools
How common fasterq-dump / fastq-dump options map to sracha:
| sra-tools option | sracha equivalent | Notes |
|---|---|---|
| -O, --outdir <DIR> | -O, --output-dir <DIR> | Same |
| -e, --threads <N> | -t, --threads <N> | Decode + compression threads |
| --split-3 | --split split-3 | Default in both |
| --split-files | --split split-files | One file per read |
| --split-spot | --split split-spot | All reads of a spot in one file |
| --concatenate-reads / no split | --split interleaved | Single interleaved stream |
| -Z, --stdout | -Z, --stdout | Interleaved FASTQ to stdout |
| --fasta | --fasta | Drops quality |
| --seq-defline <FMT> | --seq-defline <FMT> | Common variables; see above |
| --include-technical | --include-technical | Technical reads skipped by default in both |
| -M, --min-read-len <N> | --min-read-len <N> | Same |
| -p, --progress | (on by default) | Disable with --no-progress |
| -t, --temp <DIR> | (automatic) | Temp files are managed internally |
| --qual-defline <FMT> | (unsupported) | The + line mirrors --seq-defline |
| --rowid-as-name | (unsupported) | Use --seq-defline with $si |
| gzip via external pipe | --gzip-level / --zstd | Compression is built in (gzip on by default) |
sracha-only conveniences with no fasterq-dump equivalent: parallel chunked
download, --prefer-ena, --metadata sidecars, --folder-per-accession,
--zstd, and direct accession resolution from a study/BioProject.
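For scripts that still carry fasterq-dump style arguments, the mapping table above can be turned into a small translation step. The sketch below is one possible approach: the function `translate_args` is hypothetical, it only understands space-separated `--option value` pairs (not `--option=value`), and anything it does not recognize is passed through unchanged.

```python
from typing import List, Sequence

# Split flags become values of sracha's --split option.
_SPLIT_FLAGS = {"--split-3": "split-3", "--split-files": "split-files",
                "--split-spot": "split-spot", "--concatenate-reads": "interleaved"}
# Options that take a value: sra-tools spelling -> sracha spelling.
_VALUE_OPTIONS = {"-O": "--output-dir", "--outdir": "--output-dir",
                  "-e": "--threads", "--threads": "--threads",
                  "-M": "--min-read-len", "--min-read-len": "--min-read-len",
                  "--seq-defline": "--seq-defline"}
_DROPPED_FLAGS = {"-p", "--progress"}       # progress is on by default
_DROPPED_VALUE_OPTIONS = {"-t", "--temp"}   # temp files are managed internally
_UNSUPPORTED = {"--qual-defline", "--rowid-as-name"}


def translate_args(args: Sequence[str]) -> List[str]:
    """Translate fasterq-dump style arguments to sracha style (hypothetical)."""
    out: List[str] = []
    it = iter(args)
    for arg in it:
        if arg in _UNSUPPORTED:
            raise ValueError(f"{arg} has no sracha equivalent")
        if arg in _DROPPED_FLAGS:
            continue
        if arg in _DROPPED_VALUE_OPTIONS:
            next(it, None)  # discard the option's value as well
            continue
        if arg in _SPLIT_FLAGS:
            out += ["--split", _SPLIT_FLAGS[arg]]
        elif arg in _VALUE_OPTIONS:
            value = next(it, None)
            if value is None:
                raise ValueError(f"{arg} requires a value")
            out += [_VALUE_OPTIONS[arg], value]
        else:
            # Shared flags (--fasta, -Z, ...) and inputs pass through unchanged.
            out.append(arg)
    return out
```

Note the one genuinely ambiguous short option: `-t` means the temp directory in fasterq-dump but the thread count in sracha, so the sketch drops the former and emits the long `--threads` form for the latter.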