Quick Start

Run the pipeline with test data to verify your installation.

Prerequisites

Complete the Installation guide first:

Bash
pixi install
pixi run setup
pixi run dl-test-data

Dry Run

Preview what the pipeline will do without executing:

Bash
pixi run dry-run

Expected output:

Text Only
Building DAG of jobs...
Job stats:
job                      count
---------------------  -------
all                          1
bwa_align                    2
classify_charging            2
get_cca_trna                 2
get_cca_trna_cpm             2
merge_pods                   2
rebasecall                   2
transfer_bam_tags            2
...
total                       XX

Run Test Pipeline

Execute the pipeline locally with 4 cores:

Bash
pixi run test

This processes two test samples and takes approximately 15-30 minutes, depending on your hardware. A GPU is required.

GPU Required

The rebasecall and classify_charging rules require GPU access. Ensure CUDA is available.
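A quick pre-flight check can save a failed run. The sketch below is not part of the pipeline: check_gpu is a hypothetical helper, and it assumes an NVIDIA driver installation that provides nvidia-smi. A visible GPU is only a proxy for a working CUDA setup, so treat a passing result as a first sanity check rather than a guarantee.

```shell
# Hypothetical helper (not part of the pipeline): report whether an NVIDIA GPU
# is visible to the current shell. Assumes the driver ships nvidia-smi.
check_gpu() {
    # nvidia-smi -L lists the GPUs the driver can see; it fails if there are none.
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
        echo "GPU visible"
    else
        echo "no GPU visible"
        return 1
    fi
}

# Usage (run before starting the test pipeline):
#   check_gpu && pixi run test
```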

Expected Output

After completion, outputs are written to .tests/outputs/. The tree below shows the files for sample1:

Text Only
.tests/outputs/
├── pod5/
│   └── sample1/
│       └── sample1.pod5              # Merged POD5
├── bam/
│   ├── rebasecall/sample1/
│   │   └── sample1.rbc.bam           # Basecalled BAM
│   ├── aln/sample1/
│   │   └── sample1.aln.bam           # Aligned BAM
│   ├── charging/sample1/
│   │   └── sample1.charging.bam      # Remora classification
│   └── final/sample1/
│       └── sample1.bam               # Final BAM with CL/CM/PT tags
├── fq/
│   └── sample1/
│       └── sample1.fq.gz             # Extracted FASTQ
└── summary/
    ├── tables/sample1/
    │   ├── sample1.charging_prob.tsv.gz  # Per-read charging
    │   ├── sample1.charging.cpm.tsv.gz   # CPM counts
    │   ├── sample1.bcerror.tsv.gz        # Base errors
    │   └── sample1.align_stats.tsv.gz    # Alignment stats
    └── modkit/sample1/
        ├── sample1.pileup.bed.gz         # Modification pileup
        └── sample1.mod_calls.tsv.gz      # Per-read mods
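The tree above can be turned into a simple completeness check. The following is a minimal sketch: check_outputs is a hypothetical helper, and its path list is copied from the tree above, so it would need updating if the pipeline layout differs.

```shell
# Hypothetical helper (not part of the pipeline): check that every file shown
# in the tree above exists and is non-empty for one sample.
# Arguments: $1 = output root, $2 = sample name.
check_outputs() {
    root="$1"; sample="$2"; missing=0; total=0
    for rel in \
        "pod5/${sample}/${sample}.pod5" \
        "bam/rebasecall/${sample}/${sample}.rbc.bam" \
        "bam/aln/${sample}/${sample}.aln.bam" \
        "bam/charging/${sample}/${sample}.charging.bam" \
        "bam/final/${sample}/${sample}.bam" \
        "fq/${sample}/${sample}.fq.gz" \
        "summary/tables/${sample}/${sample}.charging_prob.tsv.gz" \
        "summary/tables/${sample}/${sample}.charging.cpm.tsv.gz" \
        "summary/tables/${sample}/${sample}.bcerror.tsv.gz" \
        "summary/tables/${sample}/${sample}.align_stats.tsv.gz" \
        "summary/modkit/${sample}/${sample}.pileup.bed.gz" \
        "summary/modkit/${sample}/${sample}.mod_calls.tsv.gz"
    do
        total=$((total + 1))
        # -s is true only if the file exists and has a size greater than zero.
        if [ ! -s "${root}/${rel}" ]; then
            echo "MISSING: ${root}/${rel}"
            missing=$((missing + 1))
        fi
    done
    echo "${missing} missing of ${total} expected files"
    # Return success only when nothing is missing.
    [ "$missing" -eq 0 ]
}

# Usage:
#   check_outputs .tests/outputs sample1
```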

Inspect Key Outputs

Charging CPM Table

View per-tRNA charging counts:

Bash
zcat .tests/outputs/summary/tables/sample1/sample1.charging.cpm.tsv.gz | head

Output columns:

Column	Description
tRNA	tRNA reference name
counts_charged	Number of charged reads
counts_uncharged	Number of uncharged reads
cpm_charged	Charged counts per million
cpm_uncharged	Uncharged counts per million
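As an illustration of working with these columns, the sketch below derives a charged fraction per tRNA from the two count columns. charged_fraction is a hypothetical helper, and it assumes the table is tab-separated with a header row that uses the column names listed above.

```shell
# Hypothetical helper (not part of the pipeline): read the CPM table on stdin
# and print counts_charged / (counts_charged + counts_uncharged) per tRNA.
# Assumes a tab-separated table whose first row holds the column names.
charged_fraction() {
    awk -F'\t' '
        NR == 1 {
            # Locate columns by header name rather than by position.
            for (i = 1; i <= NF; i++) col[$i] = i
            print "tRNA\tfraction_charged"
            next
        }
        {
            charged = $(col["counts_charged"])
            total = charged + $(col["counts_uncharged"])
            if (total > 0) printf "%s\t%.3f\n", $(col["tRNA"]), charged / total
            else printf "%s\tNA\n", $(col["tRNA"])   # no reads: fraction undefined
        }
    '
}

# Usage:
#   zcat .tests/outputs/summary/tables/sample1/sample1.charging.cpm.tsv.gz | charged_fraction
```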

Per-Read Charging Probabilities

View individual read classifications:

Bash
zcat .tests/outputs/summary/tables/sample1/sample1.charging_prob.tsv.gz | head

Output columns:

Column	Description
read_id	Nanopore read ID
tRNA	Aligned tRNA reference
charging_likelihood	ML score (0-255; ≥200 = charged)
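The ≥200 cutoff can be applied directly to this table to tally reads. The sketch below is illustrative only: count_charged is a hypothetical helper, and it assumes a tab-separated table with a header row that contains a charging_likelihood column.

```shell
# Hypothetical helper (not part of the pipeline): read the per-read table on
# stdin and count reads at or above a charging_likelihood threshold.
# Argument: $1 = threshold (defaults to 200, the cutoff stated above).
count_charged() {
    threshold="${1:-200}"
    awk -F'\t' -v thr="$threshold" '
        # Header row: map column names to field numbers, then skip it.
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        {
            if ($(col["charging_likelihood"]) + 0 >= thr) charged++
            else uncharged++
        }
        END { printf "charged\t%d\nuncharged\t%d\n", charged, uncharged }
    '
}

# Usage:
#   zcat .tests/outputs/summary/tables/sample1/sample1.charging_prob.tsv.gz | count_charged
```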

Final BAM Tags

The final BAM stores the charging classification in auxiliary tags. To print them for the first read:

Bash
samtools view .tests/outputs/bam/final/sample1/sample1.bam | head -1 | tr '\t' '\n' | grep -E "^(CL|CM|PT):"

CL:B:C - Charging likelihood (ML tag renamed to avoid conflict)
CM:Z - Charging model metadata (MM tag renamed)
PT:Z - Adapter positions (5' and 3' adapter boundaries)

Run on Cluster

For cluster execution with LSF:

Bash
pixi run test-lsf

Or manually with a profile:

Bash
pixi run snakemake --profile cluster/lsf --configfile=config/config-test.yml

See LSF Setup for cluster configuration.

Cleanup

Remove test outputs:

Bash
rm -rf .tests/outputs

Next Steps

First Analysis - Use your own data
Configuration - Customize parameters
Output Files - Detailed output documentation