CLI reference

Global options

These options can be used with any subcommand.

Option	Description
`-v, --verbose`	Increase log verbosity (`-v`, `-vv`, `-vvv`)
`-q, --quiet`	Suppress all output except errors
`--version`	Print version
`-h, --help`	Print help

Accession types

sracha accepts three types of accessions:

Type	Prefixes	Example	Description
Run	SRR, ERR, DRR	`SRR2584863`	Single sequencing run (directly downloadable)
Study	SRP, ERP, DRP	`SRP123456`	Study containing multiple runs
BioProject	PRJNA, PRJEB, PRJDB	`PRJNA675068`	BioProject containing multiple runs

Study and BioProject accessions are automatically resolved to their constituent run accessions via the NCBI EUtils API.
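
As a rough illustration of the prefix table above, the following sketch classifies an accession by its prefix. `accession_type` is a hypothetical helper written for this example and is not part of sracha; it only mirrors the documented prefixes and does no validation of the digits that follow.

```shell
# Classify an accession by prefix, following the table above.
# accession_type is a hypothetical helper, not a sracha command.
accession_type() {
  case "$1" in
    SRR*|ERR*|DRR*)       echo run ;;
    SRP*|ERP*|DRP*)       echo study ;;
    PRJNA*|PRJEB*|PRJDB*) echo bioproject ;;
    *)                    echo unknown ;;
  esac
}

accession_type SRR2584863    # prints: run
accession_type PRJNA675068   # prints: bioproject
```

A check like this can be useful for sorting a mixed list before deciding whether `-y` is needed for project-level downloads.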

Accession lists

The `get`, `fetch`, and `info` commands accept `--accession-list` to read accessions from a file (one per line). Blank lines and lines starting with `#` are skipped. The option can be combined with positional arguments.

# From a file
sracha get --accession-list SRR_Acc_List.txt

# Mixed: positional + file
sracha get SRR9999999 --accession-list more_accessions.txt
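
To preview which lines of a list file would count as accessions, the skipping rule described above can be approximated with `grep`. This is a sketch, not sracha's own parser: `list_accessions` is a hypothetical helper, and it assumes that "blank" includes whitespace-only lines.

```shell
# list_accessions is a hypothetical helper, not part of sracha.
# Assumption: "blank" covers empty and whitespace-only lines; sracha's
# exact rule may differ.
list_accessions() {
  grep -v -e '^#' -e '^[[:space:]]*$' "$1"
}

# Build a small demo list with a comment and a blank line.
demo=$(mktemp)
cat > "$demo" <<'EOF'
# batch 1
SRR2584863

SRR14724462
EOF

list_accessions "$demo"   # prints the two accessions, one per line
rm -f "$demo"
```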

sracha get

Download, convert, and compress SRA data in one shot.

`sracha get [OPTIONS] [ACCESSION]...`

Arguments

Argument	Description
`ACCESSION`	One or more accessions (run, study, or BioProject)

Options

Input / output

Option	Default	Description
`--accession-list <FILE>`		Read accessions from a file (one per line)
`-O, --output-dir <DIR>`	`.`	Output directory
`--folder-per-accession`		Place each accession's outputs (FASTQ + metadata sidecar + temp SRA + completion marker + `.sracha-progress` + any `--keep-sra` artifact) inside its own `<output-dir>/<accession>/` subdirectory. The shared `sracha-stats.jsonl` audit log stays at the top level
`--format <FORMAT>`	`sra`	Download format: `sra` (full quality) or `sralite` (simplified quality, smaller)
`-f, --force`		Overwrite existing files

Sequence output

Option	Default	Description
`--split <MODE>`	`split-3`	Split mode: `split-3`, `split-files`, `split-spot`, `interleaved`
`--paired-suffix <STYLE>`	`numeric`	Suffix style for paired/split FASTQ outputs: `numeric` (`_1`/`_2`, matches fasterq-dump and ENA) or `r` (`_R1`/`_R2`, matches Illumina BCL output)
`--fasta`		Output FASTA instead of FASTQ (drops quality scores)
`--seq-defline <TEMPLATE>`		Custom defline template, fasterq-dump `--seq-defline` syntax (see Custom deflines)
`--min-read-len <N>`		Minimum read length filter
`--include-technical`		Include technical reads (skipped by default)
`--metadata <FORMAT>`		Write a `<accession>.metadata.{tsv,json}` sidecar alongside each FASTQ output, capturing BioSample/SAMN, Sample/SRS, BioProject, library strategy/source/selection/layout, instrument, experiment, study, scientific name, tax id, bases, and release dates from the EUtils RunInfo CSV. Values: `tsv`, `json`, `both`
`-Z, --stdout`		Write to stdout (stream interleaved FASTQ, auto-delete temp SRA)

Compression

Option	Default	Description
`--no-gzip`		Disable gzip compression (compressed by default)
`--gzip-level <N>`	`1`	Gzip compression level (1-9)
`--zstd`		Use zstd compression instead of gzip
`--zstd-level <N>`	`3`	Zstd compression level (1-22)

Performance

Option	Default	Description
`-t, --threads <N>`	`8`	Thread count for decode and compression
`--connections <N>`	`8`	HTTP connections per file

Download behavior

Option	Default	Description
`--no-resume`		Disable download resume (re-download from scratch)
`--dry-run`		Resolve accessions and print what would be downloaded, then exit without downloading or decoding
`--dry-run-format <FORMAT>`	`tsv`	Output format for `--dry-run`: `tsv` or `json`
`-y, --yes`		Confirm project downloads and large downloads (>100 GiB)
`--prefer-sdl`		Skip direct S3 and resolve via the SDL API
`--prefer-ena`		Try ENA FASTQ mirrors first; fall back to the NCBI SRA path if ENA has no FASTQ for the accession or its output config is incompatible with the requested split/compression
`--no-runinfo`		Skip EUtils RunInfo API call (derive read structure from VDB metadata)
`--prefetch-depth <N>`	`2`	Number of accessions to download ahead of the decoder. Larger values hide slow networks behind decode at the cost of one extra temp SRA file per step. Multi-accession `get` only
`--keep-sra`		Keep the downloaded SRA file in the output directory instead of deleting it after decode
`--no-progress`		Disable progress bar
`--no-strict`		Downgrade strict-fatal data-integrity anomalies (quality length mismatch, invalid quality bytes, quality overruns, paired-spot violations) from hard failures to warnings. Strict is the default. Benign-fallback counters (SRA-lite all-zero quality blobs, truncated-spot recovery) stay informational either way

sracha fetch

Download SRA files without conversion.

`sracha fetch [OPTIONS] [ACCESSION]...`

Arguments

Argument	Description
`ACCESSION`	One or more accessions (run, study, or BioProject)

Options

Option	Default	Description
`--accession-list <FILE>`		Read accessions from a file (one per line)
`-O, --output-dir <DIR>`	`.`	Output directory
`--format <FORMAT>`	`sra`	Download format: `sra` (full quality) or `sralite` (simplified quality, smaller)
`--connections <N>`	`8`	HTTP connections per file
`--no-validate`		Skip MD5 verification after download (verification is on by default)
`-f, --force`		Overwrite existing files
`--no-resume`		Disable download resume (re-download from scratch)
`-y, --yes`		Confirm project downloads and large downloads (>100 GiB)
`--prefer-sdl`		Skip direct S3 and resolve via the SDL API
`--prefer-ena`		Fetch pre-computed FASTQ.gz from ENA's mirror instead of the SRA binary. Falls back to the NCBI path when ENA has no FASTQ for an accession
`--no-progress`		Disable progress bar

sracha fastq

Convert SRA files to FASTQ (or FASTA).

`sracha fastq [OPTIONS] <INPUT>...`

Arguments

Argument	Description
`INPUT`	Local `.sra` file path(s) (from `sracha fetch`)

Options

Sequence output

Option	Default	Description
`--split <MODE>`	`split-3`	Split mode: `split-3`, `split-files`, `split-spot`, `interleaved`
`--paired-suffix <STYLE>`	`numeric`	Suffix style for paired/split FASTQ outputs: `numeric` (`_1`/`_2`, matches fasterq-dump and ENA) or `r` (`_R1`/`_R2`, matches Illumina BCL output)
`--fasta`		Output FASTA instead of FASTQ (drops quality scores)
`--seq-defline <TEMPLATE>`		Custom defline template, fasterq-dump `--seq-defline` syntax (see Custom deflines)
`--min-read-len <N>`		Minimum read length filter
`--include-technical`		Include technical reads (skipped by default)
`-Z, --stdout`		Write to stdout (implies `--no-progress`)

Compression

Option	Default	Description
`--no-gzip`		Disable gzip compression (compressed by default)
`--gzip-level <N>`	`1`	Gzip compression level (1-9)
`--zstd`		Use zstd compression instead of gzip
`--zstd-level <N>`	`3`	Zstd compression level (1-22)

Other

Option	Default	Description
`-t, --threads <N>`	`8`	Thread count for decode and compression
`-O, --output-dir <DIR>`	`.`	Output directory
`--folder-per-accession`		Place each accession's outputs inside its own `<output-dir>/<accession>/` subdirectory
`-f, --force`		Overwrite existing files
`--no-progress`		Disable progress bar
`--no-strict`		Downgrade strict-fatal data-integrity anomalies (quality length mismatch, invalid quality bytes, quality overruns, paired-spot violations) from hard failures to warnings. Strict is the default. Benign-fallback counters (SRA-lite all-zero quality blobs, truncated-spot recovery) stay informational either way

sracha info

Show accession metadata, or inspect a local `.sra` file.

`sracha info [OPTIONS] [ACCESSION_OR_PATH]...`

Arguments

Argument	Description
`ACCESSION_OR_PATH`	Accession (run, study, or BioProject) or a local `.sra` file path

Options

Option	Default	Description
`--accession-list <FILE>`		Read accessions from a file (one per line)
`--prefer-ena`		Also fetch ENA's FASTQ filereport and show it alongside the NCBI info
`--format <FMT>`	`table`	Output format: `table` (human-readable), `tsv`, or `csv`

For accessions, displays file sizes, available formats, download mirrors, and quality information. Study and BioProject accessions are resolved to runs first.

For local file paths, opens the KAR archive directly (no network) and prints its table of contents, schema, and metadata. Paths starting with `~/` are expanded to `$HOME`.

`--format tsv` and `--format csv` emit a single header row followed by one record per accession. Columns: `accession`, `archive_type` (SRA/cSRA), `layout` (SINGLE/PAIRED/N-read), `nreads`, `spots`, `size_bytes`, `platform`, `md5`. Missing fields are empty strings. In these formats, local `.sra` paths and `--prefer-ena` are ignored, with a warning printed to stderr.

$ sracha info --format tsv SRR2584863 SRR14724462
accession   archive_type    layout  nreads  spots   size_bytes  platform    md5
SRR2584863  SRA PAIRED  2   1553259 302057279   ILLUMINA    c486ca786ca83ec3cef04b7e32e1aa08
SRR14724462 SRA PAIRED  2   41135235    4057553143  ILLUMINA    cfb98d8db26ad9ad28c501a4115f0cc5
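
Because the TSV form has a fixed header, it is easy to post-process. The sketch below sums the `size_bytes` column to estimate total download size before fetching. `total_size_bytes` is a hypothetical helper, and the sketch assumes the output is tab-delimited; it looks the column up by header name and skips empty values, since missing fields are empty strings.

```shell
# Sum the size_bytes column of TSV read from stdin.
# total_size_bytes is a hypothetical helper, not part of sracha.
total_size_bytes() {
  awk -F '\t' '
    # Header row: find the size_bytes column by name.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "size_bytes") col = i; next }
    # Data rows: add non-empty values.
    col && $col != "" { sum += $col }
    END { printf "%.0f\n", sum }
  '
}

# Demo with the two sizes from the example output above.
printf 'accession\tsize_bytes\nSRR2584863\t302057279\nSRR14724462\t4057553143\n' | total_size_bytes
# prints: 4359610422
```

In practice the input would come from a pipeline such as `sracha info --format tsv SRR2584863 SRR14724462 | total_size_bytes`.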

sracha validate

Validate SRA file integrity by decoding all records and checking for errors.

`sracha validate [OPTIONS] <INPUT>...`

Arguments

Argument	Description
`INPUT`	SRA file(s) to validate

Options

Option	Default	Description
`-t, --threads <N>`	`8`	Thread count for decode
`--no-progress`		Disable progress bar
`--md5 <HASH>`		Expected MD5 hex; fail on mismatch. With multiple inputs every file must match
`--offline`		Skip the SDL lookup for the expected MD5 (air-gapped use)

sracha vdb

Inspect the VDB structure of a local `.sra` file. This is a pure-Rust replacement for `vdb-dump`: no network access, no C FFI, and no subprocess calls to sra-tools.

`sracha vdb <SUBCOMMAND> <FILE> [OPTIONS]`

sracha vdb info

Print summary metadata: schema, platform, table row counts, load timestamp, and formatter / loader / update software events.

`sracha vdb info <FILE> [--json]`

Option	Default	Description
`--json`		Emit a single JSON object instead of human-readable text

sracha vdb tables

List tables in the archive. Only meaningful for Database archives (cSRA / aligned); flat Tables print a note.

`sracha vdb tables <FILE>`

sracha vdb columns

List columns in a table.

`sracha vdb columns <FILE> [-T TABLE] [-s]`

Option	Default	Description
`-T, --table <NAME>`	`SEQUENCE` (or first)	Table to inspect
`-s, --stats`		Show row counts, blob counts, and first-blob header stats per column

sracha vdb meta

Dump the metadata tree (schema / stats / LOAD / SOFTWARE nodes).

`sracha vdb meta <FILE> [-T TABLE] [-P PATH] [-d DEPTH] [--db]`

Option	Default	Description
`-T, --table <NAME>`	`SEQUENCE`	Table whose metadata tree to walk
`-P, --path <PATH>`		Restrict to a sub-path like `STATS/TABLE` or `LOAD`
`-d, --depth <N>`		Limit recursion depth below the chosen sub-path
`--db`		Walk the database-level tree (root `md/cur`) instead of a table tree

sracha vdb schema

Print the embedded schema text.

`sracha vdb schema <FILE>`

sracha vdb id-range

Print the first row id and row count for a table/column.

`sracha vdb id-range <FILE> [-T TABLE] [-C COLUMN]`

Option	Default	Description
`-T, --table <NAME>`	`SEQUENCE` (or first)	Table to inspect
`-C, --column <NAME>`	first alphabetically	Column to read

Custom deflines

`--seq-defline <TEMPLATE>` (available on both `sracha get` and `sracha fastq`) replaces the built-in FASTQ/FASTA header with a template, using the same `$`-variable syntax as fasterq-dump's `--seq-defline`:

Variable	Expands to
`$ac`	Run accession (e.g. `SRR12345678`)
`$si`	Spot id — 1-based row number
`$ri`	Read id — 1-based read number within the spot
`$sn`	Spot name from the NAME column (empty when the archive has none)
`$rl`	Read length in bases
`$$`	A literal `$`

# Match a fasterq-dump pipeline that used --seq-defline '@$ac.$si.$ri'
sracha get --seq-defline '@$ac.$si.$ri' --split split-files SRR12345678
#  -> @SRR12345678.1.1
#     @SRR12345678.1.2   (mate 2 of spot 1)

Notes:

- A leading `@` (or `>` for FASTA) is optional: it is stripped, and sracha adds the correct record prefix automatically.
- The `+` (quality) line always mirrors the sequence defline. There is no separate `--qual-defline`.
- When a template is given, the `split-spot`/`interleaved` modes do not add their implicit `<spot>/<mate>` suffix. Use `$ri` in the template to distinguish mates instead.
- `$sg` (spot-group) is not supported; a template using it is rejected at startup. Reference-compressed (cSRA) archives have no NAME column, so `$sn` expands to empty there.
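
The substitution rules in the variable table can be sketched as a small left-to-right scanner. This is an illustration of the documented behavior, not sracha's implementation: `expand_defline` is a hypothetical helper, it returns the header text without the record prefix, and it passes unrecognized `$` sequences through unchanged (sracha may instead reject them, as it does for `$sg`).

```shell
# expand_defline TEMPLATE ACCESSION SPOT_ID READ_ID SPOT_NAME READ_LEN
# Hypothetical helper illustrating the variable table; not part of sracha.
expand_defline() {
  tpl=$1
  out=
  # Strip one optional leading record prefix (@ or >).
  case "$tpl" in '@'*|'>'*) tpl=${tpl#?} ;; esac
  # Consume the template from the left, one token at a time.
  while [ -n "$tpl" ]; do
    case "$tpl" in
      '$$'*)  out="$out\$"; tpl=${tpl#??} ;;
      '$ac'*) out="$out$2"; tpl=${tpl#???} ;;
      '$si'*) out="$out$3"; tpl=${tpl#???} ;;
      '$ri'*) out="$out$4"; tpl=${tpl#???} ;;
      '$sn'*) out="$out$5"; tpl=${tpl#???} ;;
      '$rl'*) out="$out$6"; tpl=${tpl#???} ;;
      *)      # Copy any other character through unchanged.
              rest=${tpl#?}
              first=${tpl%"$rest"}
              out=$out$first
              tpl=$rest ;;
    esac
  done
  printf '%s\n' "$out"
}

expand_defline '@$ac.$si.$ri' SRR12345678 1 2 '' 151
# prints: SRR12345678.1.2
```

Scanning left to right means `$$ac` yields a literal `$ac` rather than the accession, which a naive chain of global replacements would get wrong.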

Coming from sra-tools

How common fasterq-dump / fastq-dump options map to sracha:

sra-tools option	sracha equivalent	Notes
`-O, --outdir <DIR>`	`-O, --output-dir <DIR>`	Same
`-e, --threads <N>`	`-t, --threads <N>`	Decode + compression threads
`--split-3`	`--split split-3`	Default in both
`--split-files`	`--split split-files`	One file per read
`--split-spot`	`--split split-spot`	All reads of a spot in one file
`--concatenate-reads` / no split	`--split interleaved`	Single interleaved stream
`-Z, --stdout`	`-Z, --stdout`	Interleaved FASTQ to stdout
`--fasta`	`--fasta`	Drops quality
`--seq-defline <FMT>`	`--seq-defline <FMT>`	Common variables; see above
`--include-technical`	`--include-technical`	Technical reads skipped by default in both
`-M, --min-read-len <N>`	`--min-read-len <N>`	Same
`-p, --progress`	(on by default)	Disable with `--no-progress`
`-t, --temp <DIR>`	(automatic)	Temp files are managed internally
`--qual-defline <FMT>`	(unsupported)	The `+` line mirrors `--seq-defline`
`--rowid-as-name`	(unsupported)	Use `--seq-defline` with `$si`
gzip via external pipe	`--gzip-level` / `--zstd`	Compression is built in (gzip on by default)

sracha-only conveniences with no fasterq-dump equivalent: parallel chunked download, `--prefer-ena`, `--metadata` sidecars, `--folder-per-accession`, `--zstd`, and direct accession resolution from a study/BioProject.