First Analysis¶
This guide walks through analyzing your own aa-tRNA-seq data.
Prerequisites¶
- Completed Installation
- POD5 files from your Oxford Nanopore sequencing run
- Reference tRNA sequences with adapters (or use the provided S. cerevisiae reference)
Step 1: Organize Your Data¶
The pipeline expects POD5 files organized by sequencing run:
The pipeline searches for POD5 files in pod5_pass/, pod5_fail/, or pod5/ subdirectories.
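Before going further, it can help to confirm that a run directory actually contains POD5 files in one of those subdirectories. The following is a minimal sketch, not part of the pipeline: `count_pod5` is a hypothetical helper, and it assumes files carry a `.pod5` extension and may sit at any depth below the subdirectory. The demo directory and file names are made up.

```bash
# Count POD5 files in the subdirectories the pipeline searches.
# Assumption: files end in .pod5 and may be nested below the subdirectory.
count_pod5() {
  local run_dir="$1" total=0 sub n
  for sub in pod5_pass pod5_fail pod5; do
    if [ -d "$run_dir/$sub" ]; then
      n=$(find "$run_dir/$sub" -type f -name '*.pod5' | wc -l)
      total=$((total + n))
    fi
  done
  echo "$total"
}

# Demo on a throwaway run directory (hypothetical file names).
demo_run=$(mktemp -d)
mkdir -p "$demo_run/pod5_pass" "$demo_run/pod5_fail"
touch "$demo_run/pod5_pass/a.pod5" "$demo_run/pod5_pass/b.pod5" "$demo_run/pod5_fail/c.pod5"
pod5_total=$(count_pod5 "$demo_run")
echo "POD5 files found: $pod5_total"
```

Point `count_pod5` at each of your real run directories; a count of 0 means the pipeline would have nothing to process for that run.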
Step 2: Create Sample File¶
Create a tab-separated sample file listing your samples:
Multiple Runs per Sample
You can list the same sample ID multiple times with different run paths. POD5 files will be merged before processing.
For multiplexed samples with barcodes, use YAML format instead. See Sample Files.
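As a rough sanity check on a sample file, the sketch below writes a small example and summarizes how many run paths each sample ID has. It assumes a two-column layout (sample ID, then run directory) with no header line; that layout, the sample IDs, and the paths are illustrative assumptions, so check Sample Files for the authoritative format.

```bash
samples_file=$(mktemp)

# Assumed layout: sample ID <TAB> run directory, no header line.
# sample1 is listed twice, so its two runs would be merged.
printf 'sample1\t/data/run1\nsample1\t/data/run2\nsample2\t/data/run3\n' > "$samples_file"

# Report rows that do not have exactly two tab-separated fields.
awk -F'\t' 'BEGIN { bad = 0 }
  NF != 2 { printf "line %d: expected 2 fields, got %d\n", NR, NF; bad = 1 }
  END { exit bad }' "$samples_file" || echo "sample file has malformed rows"

# Count run paths per sample ID.
runs_per_sample=$(awk -F'\t' '{ n[$1]++ } END { for (s in n) print s "\t" n[s] }' "$samples_file" | sort)
echo "$runs_per_sample"
```

A sample with a count above 1 is one whose POD5 files come from several runs.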
Step 3: Create Configuration¶
Create a configuration file for your analysis:
The configuration inherits defaults from config/config-base.yml. Override any parameter as needed.
Step 4: Dry Run¶
Preview the pipeline execution:
Verify:
- All samples are detected
- Expected number of jobs
- No errors in DAG construction
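A dry run along these lines can be wrapped in a small helper. This is a sketch under assumptions: `preview_pipeline` is a hypothetical function and `config/myproject.yml` is a placeholder path. The `-n` (`--dry-run`) and `--configfile` options are standard Snakemake options. The helper only prints the command when Snakemake or the config file is not available.

```bash
preview_pipeline() {
  # $1: path to the configuration file created in Step 3
  local config_file="$1"
  if ! command -v snakemake >/dev/null 2>&1 || [ ! -f "$config_file" ]; then
    echo "Skipped: would run 'snakemake -n --configfile $config_file'"
    return 0
  fi
  # -n (--dry-run) lists the planned jobs without executing them.
  snakemake -n --configfile "$config_file"
}

# Hypothetical config path; replace it with your own file.
preview_msg=$(preview_pipeline "config/myproject.yml")
echo "$preview_msg"
```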
Step 5: Run Pipeline¶
Local Execution¶
For small datasets or testing:
Adjust --cores based on your system.
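One way to pick a value for `--cores` is to detect the CPU count and leave one core free for the rest of the system. The sketch below only prints the resulting command; the config path is a placeholder, and leaving one core free is a suggestion rather than a pipeline requirement.

```bash
# Detect available CPU cores; nproc is GNU coreutils, getconf is the fallback.
total_cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# Leave one core free for the system, but never go below 1.
cores=$(( total_cores > 1 ? total_cores - 1 : 1 ))

# Print the command rather than running it (hypothetical config path).
echo "snakemake --cores $cores --configfile config/myproject.yml"
```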
Cluster Execution (Recommended)¶
For production runs, use cluster execution:
See Cluster Setup for detailed cluster configuration.
Background Execution¶
For long-running jobs, run the pipeline in the background:
Monitor progress:
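A common pattern for background execution is `nohup` with output redirected to a log file, which you can then follow with `tail`. The sketch below illustrates that pattern with a stand-in command so it runs anywhere. The use of `nohup` and the log file location are assumptions here, not necessarily what the pipeline documentation prescribes.

```bash
log_file=$(mktemp)

# Stand-in for the real pipeline command so the sketch runs anywhere;
# in practice this would be the snakemake invocation from Step 5.
nohup sh -c 'echo "pipeline started"; sleep 1; echo "pipeline finished"' \
  > "$log_file" 2>&1 &
bg_pid=$!
echo "Started background job with PID $bg_pid, logging to $log_file"

# For a real run, follow the log live with: tail -f "$log_file"
# Here we simply wait for the stand-in job and show the last line.
wait "$bg_pid"
tail -n 1 "$log_file"
```

Because the job is detached with `nohup`, it keeps running after you close the terminal session, and the log file remains the place to check on it.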
Step 6: Monitor Progress¶
Check Job Status¶
View Snakemake Progress¶
Check Logs¶
Rule-specific logs are in the output directory:
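When many rules have run, searching the logs for error messages is often quicker than opening them one by one. The sketch below is illustrative only: `find_failed_logs` is a hypothetical helper, the directory layout is mocked in a temporary directory, and matching on the word "error" is a heuristic that may miss failures or flag harmless lines.

```bash
find_failed_logs() {
  # Print log files under the given directory that mention "error" (any case).
  grep -rli 'error' "$1" 2>/dev/null | sort
}

# Demo with mock logs; point the function at your real log directory instead.
demo_logs=$(mktemp -d)
mkdir -p "$demo_logs/rule_a" "$demo_logs/rule_b"
echo "finished successfully" > "$demo_logs/rule_a/sample1.log"
echo "RuntimeError: mock failure" > "$demo_logs/rule_b/sample1.log"

failed_logs=$(find_failed_logs "$demo_logs")
echo "$failed_logs"
```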
Step 7: Verify Outputs¶
After completion, check your outputs:
Key Files to Check¶
- Charging CPM - Per-tRNA charging quantification:
  `zcat results/myproject/summary/tables/sample1/sample1.charging.cpm.tsv.gz | column -t | head`
- Alignment Stats - Read counts through pipeline:
  `zcat results/myproject/summary/tables/sample1/sample1.align_stats.tsv.gz | column -t`
- Final BAM - Verify charging tags:
  `samtools view results/myproject/bam/final/sample1/sample1.bam | head -1`
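To check several samples at once, the three key files can be tested for existence in a loop. The sketch below is a hypothetical helper, `check_outputs`, that uses the path patterns from the commands above and reports any file that is missing or empty. The demo runs against mock files in a temporary directory rather than real results.

```bash
check_outputs() {
  # $1: output directory (e.g. results/myproject), $2: sample ID
  local out_dir="$1" sample="$2" missing=0 f
  for f in \
    "summary/tables/$sample/$sample.charging.cpm.tsv.gz" \
    "summary/tables/$sample/$sample.align_stats.tsv.gz" \
    "bam/final/$sample/$sample.bam"
  do
    if [ -s "$out_dir/$f" ]; then
      echo "ok       $f"
    else
      echo "MISSING  $f"
      missing=$((missing + 1))
    fi
  done
  # Exit status is the number of missing or empty files (0 means all present).
  return "$missing"
}

# Demo with mock files in a throwaway directory.
demo_out=$(mktemp -d)
mkdir -p "$demo_out/summary/tables/sample1" "$demo_out/bam/final/sample1"
for name in charging.cpm.tsv.gz align_stats.tsv.gz; do
  echo "mock" > "$demo_out/summary/tables/sample1/sample1.$name"
done
echo "mock" > "$demo_out/bam/final/sample1/sample1.bam"
check_outputs "$demo_out" sample1
```

For a real run, call it as `check_outputs results/myproject sample1` for each sample ID in your sample file.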
Troubleshooting¶
No POD5 Files Found¶
Solution: Verify POD5 files exist in pod5_pass/, pod5_fail/, or pod5/ subdirectories.
GPU Memory Error¶
Solution: Reduce the batch size or run one GPU job at a time. Adjust the ngpu resource in the cluster profile.
Rule Failed¶
Solution: Check rule-specific log:
See Troubleshooting for more solutions.
Next Steps¶
- Output Files - Understand all output files
- Configuration - Customize parameters
- Workflow Overview - Understand the pipeline stages