First Analysis

This guide walks through analyzing your own aa-tRNA-seq data.

Prerequisites

  • Completed Installation
  • POD5 files from your Oxford Nanopore sequencing run
  • Reference tRNA sequences with adapters (or use the provided S. cerevisiae reference)

Step 1: Organize Your Data

The pipeline expects POD5 files organized by sequencing run:

Text Only
/path/to/your/data/
├── run1/
│   ├── pod5_pass/     # Passed reads (required)
│   │   ├── file1.pod5
│   │   └── file2.pod5
│   └── pod5_fail/     # Failed reads (optional)
│       └── file3.pod5
└── run2/
    └── pod5/          # Alternative directory name
        └── file4.pod5

The pipeline searches for POD5 files in pod5_pass/, pod5_fail/, or pod5/ subdirectories.
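Before running anything, it can help to confirm that each run directory contains POD5 files where the pipeline looks for them. The following is a minimal sketch, not part of the pipeline: `count_pod5` is a hypothetical helper name, and it assumes the files sit directly inside the three subdirectories named above rather than in nested folders.

```shell
# Hypothetical helper (not part of the pipeline): count POD5 files that sit
# directly inside the subdirectories the pipeline is described as searching.
count_pod5() {
    run_dir=$1
    total=0
    for sub in pod5_pass pod5_fail pod5; do
        if [ -d "$run_dir/$sub" ]; then
            # -maxdepth 1 is an assumption: only files directly in the subdirectory
            n=$(find "$run_dir/$sub" -maxdepth 1 -type f -name '*.pod5' | wc -l)
            total=$((total + n))
        fi
    done
    echo "$total"
}
```

For example, `count_pod5 /path/to/your/data/run1` prints the number of POD5 files found for that run; a result of 0 suggests the directory layout needs fixing before you continue.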

Step 2: Create Sample File

Create a tab-separated sample file with one line per run: the sample ID, a tab, then the path to the run directory:

Text Only
sample1 /path/to/your/data/run1
sample2 /path/to/your/data/run2
sample1 /path/to/your/data/run1_replicate

Multiple Runs per Sample

You can list the same sample ID multiple times with different run paths. POD5 files will be merged before processing.

For multiplexed samples with barcodes, use YAML format instead. See Sample Files.
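Because the sample file must be tab-separated, a quick format check can catch lines where spaces were typed instead of a tab. This is a minimal sketch under stated assumptions: `check_sample_tsv` is a hypothetical helper, and it assumes the file has no header, no comments, and no blank lines, which may not match everything the pipeline accepts.

```shell
# Hypothetical helper: report lines that do not have exactly two non-empty
# tab-separated fields (sample ID, run path). Prints nothing and returns 0
# when every line passes; returns 1 otherwise.
check_sample_tsv() {
    awk -F '\t' 'NF != 2 || $1 == "" || $2 == "" {
        printf "line %d: expected <sample_id><TAB><run_path>\n", NR
        bad = 1
    }
    END { exit bad }' "$1"
}
```

Running `check_sample_tsv config/samples-myproject.tsv` prints one message per malformed line, so an empty result means the two-column layout is intact.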

Step 3: Create Configuration

Create a configuration file for your analysis:

YAML
# Sample file path
samples: config/samples-myproject.tsv

# Output directory
output_directory: "results/myproject"

# Optional: Override base configuration
# fasta: "path/to/custom/reference.fa"
# remora_cca_classifier: "path/to/custom/model.pt"

The configuration inherits defaults from config/config-base.yml. Override any parameter as needed.
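To see which values your project configuration sets explicitly, a small text lookup can be enough. The sketch below is an illustration only: `config_value` is a hypothetical helper that does a naive line match on flat `key: value` entries and is not a YAML parser. An empty result only means the key is not set in that file, so the default from config/config-base.yml would presumably apply.

```shell
# Hypothetical helper: print the value of a top-level "key: value" line from a
# flat YAML file, stripping optional surrounding double quotes. Commented-out
# lines (starting with "#") are not matched. Naive text match, not a parser.
config_value() {
    sed -n "s/^$1:[[:space:]]*//p" "$2" | sed 's/^"\(.*\)"$/\1/' | head -n 1
}
```

For example, `config_value samples config/config-myproject.yml` prints the sample file path, which you can then pass to `ls` to confirm the file exists.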

Step 4: Dry Run

Preview the pipeline execution:

Bash
pixi run snakemake -n --configfile=config/config-myproject.yml

Verify:

  • All samples are detected
  • Expected number of jobs
  • No errors in DAG construction
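The first check in the list above can be partly automated by saving the dry-run output to a file and searching it for each sample ID. This is a rough sketch: `missing_samples` is a hypothetical helper, and it assumes sample IDs appear somewhere in the dry-run text (for example in output paths). It uses substring matching, so `sample1` would also match `sample10`.

```shell
# Hypothetical helper: list sample IDs from column 1 of the sample file that
# never appear in a saved dry-run log. Prints nothing if every ID is found.
missing_samples() {
    samples_tsv=$1
    dryrun_log=$2
    cut -f 1 "$samples_tsv" | sort -u | while IFS= read -r id; do
        # -F treats the ID as a fixed string, not a regular expression
        grep -qF -- "$id" "$dryrun_log" || echo "$id"
    done
}
```

One way to use it is to redirect the dry run from the command above into a file such as `dryrun.log` (an illustrative name), then run `missing_samples config/samples-myproject.tsv dryrun.log`.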

Step 5: Run Pipeline

Local Execution

For small datasets or testing:

Bash
pixi run snakemake --cores 12 --configfile=config/config-myproject.yml

Adjust --cores based on your system.
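If you are unsure what value to pass, one option is to derive it from the number of online CPUs. The sketch below is only a suggestion: `suggest_cores` is a hypothetical helper, and leaving one CPU free is an arbitrary choice, not a pipeline recommendation.

```shell
# Hypothetical helper: suggest a --cores value that leaves one CPU free,
# never going below 1. Falls back to 1 if the CPU count cannot be read.
suggest_cores() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    if [ "$n" -gt 1 ]; then
        echo $((n - 1))
    else
        echo 1
    fi
}
```

The result can be substituted directly, for example `--cores "$(suggest_cores)"` in the local execution command.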

For production runs, use cluster execution with the profile that matches your scheduler:

Bash
# For LSF
pixi run snakemake --profile cluster/lsf --configfile=config/config-myproject.yml

# For SLURM
pixi run snakemake --profile cluster/slurm --configfile=config/config-myproject.yml

See Cluster Setup for detailed cluster configuration.

Background Execution

For long-running jobs, run in background:

Bash
nohup pixi run snakemake --profile cluster/lsf \
    --configfile=config/config-myproject.yml \
    > pipeline.log 2>&1 &

Monitor progress:

Bash
tail -f pipeline.log
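Once the job is in the background, it is easy to lose track of whether it is still alive. The shell stores the PID of the most recent background command in `$!`, so you can save it to a file and check it later. The sketch below assumes you write that PID yourself, for example with `echo $! > pipeline.pid` immediately after the `nohup` command; `is_running` and `pipeline.pid` are hypothetical names.

```shell
# Hypothetical helper: report whether the process recorded in a PID file is
# still running. `kill -0` sends no signal; it only tests that the PID exists
# and that you are allowed to signal it.
is_running() {
    pid_file=$1
    [ -f "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null
}
```

For example, `is_running pipeline.pid && echo "still running"` prints the message only while the recorded process exists.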

Step 6: Monitor Progress

Check Job Status

Bash
# For LSF
bjobs -u $USER

# For SLURM
squeue -u $USER
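If you work on more than one cluster, a script can pick the right status command by checking which scheduler client is installed. This is a minimal sketch: `detect_scheduler` is a hypothetical helper, and it assumes that having `bjobs` or `squeue` on your `PATH` reflects the scheduler you actually submit to.

```shell
# Hypothetical helper: guess the scheduler from which client command is on
# PATH. Prints "lsf", "slurm", or "none".
detect_scheduler() {
    if command -v bjobs >/dev/null 2>&1; then
        echo lsf
    elif command -v squeue >/dev/null 2>&1; then
        echo slurm
    else
        echo none
    fi
}
```

The printed name matches the profile directories used in this guide (cluster/lsf and cluster/slurm), so the same value could also select the `--profile` argument when it is not `none`.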

View Snakemake Progress

Bash
# Summarize the workflow's output files and their status
pixi run snakemake --profile cluster/lsf \
    --configfile=config/config-myproject.yml \
    --summary

Check Logs

Rule-specific logs are in the output directory:

Bash
ls results/myproject/logs/
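With many rules and samples, scanning every log by hand gets slow. The sketch below narrows the search to logs that mention an error. `logs_with_errors` is a hypothetical helper, and matching the word "error" is only a heuristic: some tools report failures with different wording, and some print the word in harmless messages.

```shell
# Hypothetical helper: list log files under a directory that contain the word
# "error" (case-insensitive). Prints nothing if none match.
logs_with_errors() {
    grep -rli -- 'error' "$1" 2>/dev/null | sort
}
```

For example, `logs_with_errors results/myproject/logs` prints the paths of candidate logs, which you can then open individually.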

Step 7: Verify Outputs

After completion, check your outputs:

Bash
ls -la results/myproject/summary/tables/

Key Files to Check

  1. Charging CPM - Per-tRNA charging quantification:

    Bash
    zcat results/myproject/summary/tables/sample1/sample1.charging.cpm.tsv.gz | column -t | head
    

  2. Alignment Stats - Read counts through pipeline:

    Bash
    zcat results/myproject/summary/tables/sample1/sample1.align_stats.tsv.gz | column -t
    

  3. Final BAM - Verify charging tags:

    Bash
    samtools view results/myproject/bam/final/sample1/sample1.bam | head -1
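The three checks above can be preceded by a quick existence test, so that a missing file is reported before you try to read it. This is a minimal sketch: `check_outputs` is a hypothetical helper, and the paths simply follow the layout shown in this guide, so adjust them if your output directory is organized differently.

```shell
# Hypothetical helper: confirm that the charging table, alignment stats, and
# final BAM exist and are non-empty for one sample. Returns 1 if any is
# missing or empty.
check_outputs() {
    out_dir=$1
    sample=$2
    status=0
    for f in \
        "$out_dir/summary/tables/$sample/$sample.charging.cpm.tsv.gz" \
        "$out_dir/summary/tables/$sample/$sample.align_stats.tsv.gz" \
        "$out_dir/bam/final/$sample/$sample.bam"; do
        if [ -s "$f" ]; then
            echo "ok       $f"
        else
            echo "missing  $f"
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_outputs results/myproject sample1` prints one status line per file and returns a non-zero exit code if anything is missing.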
    

Troubleshooting

No POD5 Files Found

Text Only
Error: No POD5 files found for sample 'sample1'

Solution: Verify POD5 files exist in pod5_pass/, pod5_fail/, or pod5/ subdirectories.

GPU Memory Error

Text Only
CUDA out of memory

Solution: Reduce the batch size or run one GPU job at a time. Adjust the ngpu resource in your cluster profile.

Rule Failed

Text Only
Error in rule rebasecall

Solution: Check rule-specific log:

Bash
cat results/myproject/logs/rebasecall/sample1.log

See Troubleshooting for more solutions.

Next Steps