Debugging¶

Techniques for debugging pipeline issues.

Debugging Workflow¶

flowchart TD
    A[Rule Failed] --> B[Check Log File]
    B --> C{Clear Error Message?}
    C -->|Yes| D[Fix Specific Issue]
    C -->|No| E[Run Rule Manually]
    E --> F{Reproduce Error?}
    F -->|Yes| G[Debug Interactively]
    F -->|No| H[Check Environment]

Log Files¶

Location¶

Rule logs are stored in the output directory:

Text Only
{output_directory}/logs/{rule_name}/{sample}

Viewing Logs¶

Bash
# View specific rule log
cat results/logs/rebasecall/sample1

# View last 50 lines
tail -50 results/logs/classify_charging/sample1

# Search for errors
grep -i error results/logs/*/*
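To triage many rule logs at once, a helper along these lines can rank logs by how many lines look like errors. This is a minimal sketch: `scan_rule_logs` is a hypothetical name, it assumes the `{output_directory}/logs/{rule_name}/{sample}` layout described above, and the default pattern is only a guess at what failures look like in your logs.

```bash
# Hypothetical helper: list rule logs containing a pattern, most matches first.
# Assumes logs live at <output_directory>/logs/<rule_name>/<sample>.
scan_rule_logs() {
    srl_outdir="${1:-results}"
    srl_pattern="${2:-error|exception|traceback}"
    # grep -c prints "path:count" per file; keep files with count > 0
    # and print "count<TAB>path", sorted by count in descending order.
    grep -riEc -- "$srl_pattern" "$srl_outdir/logs" 2>/dev/null \
        | awk -F: '$NF > 0 { print $NF "\t" substr($0, 1, length($0) - length($NF) - 1) }' \
        | sort -rn
}
```

For example, `scan_rule_logs results` ranks every log under `results/logs`, and `scan_rule_logs results 'killed|oom'` searches for a different pattern.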

Cluster Job Logs¶

LSF

Bash
# Find job output files
ls -la *.out *.err

# View job output
cat <job_id>.out

# Get job history
bhist -l <job_id>

SLURM

Bash
# Find job output
ls -la slurm-*.out

# View job output
cat slurm-<job_id>.out

Dry Run Analysis¶

Check DAG¶

Preview what will run:

Bash
pixi run snakemake -n --configfile=config/config.yml

Print Commands¶

See the actual shell commands:

Bash
pixi run snakemake -n -p --configfile=config/config.yml

Generate DAG Image¶

Visualize the workflow:

Bash
pixi run dag
# Creates dag.svg

Running Rules Manually¶

Identify the Command¶

Get the shell command for a rule:

Bash
pixi run snakemake -n -p <rule_name> --configfile=config/config.yml

Run in Interactive Shell¶

Start a pixi shell:

Bash
pixi shell

Then run commands manually:

Bash
# Example: test bwa alignment
bwa mem -C -t 4 -W 13 -k 6 -T 20 -x ont2d \
    resources/ref/sacCer3-mature-tRNAs-dual-adapt-v2.fa \
    results/fq/sample1.fq.gz \
    | samtools view -Sb - \
    > test.bam

Test Python Scripts¶

Bash
pixi shell

# Test a script directly
python workflow/scripts/get_charging_table.py --help

python workflow/scripts/get_charging_table.py \
    --tag CL \
    results/bam/final/sample1.bam \
    test_output.tsv.gz

Snakemake Debugging¶

Summary Report¶

Get status of all files:

Bash
pixi run snakemake --summary --configfile=config/config.yml

Detailed Report¶

Generate execution report:

Bash
pixi run snakemake --report report.html --configfile=config/config.yml

Reason for Execution¶

See why rules will run:

Bash
pixi run snakemake -n --reason --configfile=config/config.yml

Force Re-run¶

Force a specific rule:

Bash
pixi run snakemake --forcerun <rule> --configfile=config/config.yml

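Force a complete re-run. Note that `--forceall` re-executes every rule the target depends on, regardless of existing outputs, not only the rules downstream of `<rule>`: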

Bash
pixi run snakemake --forcerun <rule> --forceall --configfile=config/config.yml

Common Debug Scenarios¶

Input File Missing¶

Symptom:

Text Only
MissingInputException

Debug:

Check if prerequisite completed:
Bash
ls -la results/path/to/expected/input

Check DAG for dependencies:

Bash
pixi run snakemake --dag <target> | dot -Tpng > debug.png
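When a rule lists several inputs, checking them one by one with `ls` gets tedious. The sketch below reports each path as missing, empty, or present; `check_inputs` is a hypothetical helper, and the paths you pass are whatever the failing rule expects.

```bash
# Hypothetical helper: report which expected input files are missing or empty.
# Returns non-zero if any path is missing or has zero size.
check_inputs() {
    ci_status=0
    for ci_path in "$@"; do
        if [ ! -e "$ci_path" ]; then
            echo "MISSING  $ci_path"
            ci_status=1
        elif [ ! -s "$ci_path" ]; then
            echo "EMPTY    $ci_path"
            ci_status=1
        else
            echo "OK       $ci_path"
        fi
    done
    return "$ci_status"
}
```

Usage might look like `check_inputs results/fq/sample1.fq.gz results/bam/final/sample1.bam`. A zero-size check is only a first pass: a compressed file can be non-empty on disk and still contain no records.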

Rule Produces Empty Output¶

Debug:

Check the input file:

Bash
samtools view results/bam/input.bam | head

Run command manually with verbose output
Check filtering parameters

Memory Issues¶

Debug:

Monitor memory during execution:

Bash
# On local machine
watch -n 1 free -h

# Check cluster job
bjobs -l <job_id> | grep -i mem

Profile the rule:
Bash
/usr/bin/time -v <command>
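The verbose report from `/usr/bin/time -v` is long, and the line that usually matters for memory debugging is the peak resident set size. A small filter can pull it out and convert it to MiB. This is a sketch that assumes GNU time (the `-v` flag and the `Maximum resident set size (kbytes)` label are GNU-specific); `peak_rss_mib` is a hypothetical name.

```bash
# Hypothetical helper: read `/usr/bin/time -v` output on stdin and
# print the peak resident set size in MiB (GNU time reports it in KiB).
peak_rss_mib() {
    awk -F': ' '/Maximum resident set size \(kbytes\)/ { printf "%.1f\n", $2 / 1024 }'
}
```

GNU time writes its report to stderr, so one way to use this is `/usr/bin/time -v <command> 2> time.log` followed by `peak_rss_mib < time.log`. Comparing the result with the memory requested for the rule shows whether the limit is plausible.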

GPU Issues¶

Debug:

Check GPU availability:
Bash
nvidia-smi

Test CUDA:

Bash
python -c "import torch; print(torch.cuda.is_available())"

Check CUDA_VISIBLE_DEVICES:
Bash
echo $CUDA_VISIBLE_DEVICES
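A bare `echo` prints a blank line both when the variable is unset and when it is set to an empty string, yet the two cases behave differently: unset normally leaves all GPUs visible to CUDA, while an empty value hides them all. The sketch below distinguishes the cases; `describe_cuda_visible_devices` is a hypothetical helper.

```bash
# Hypothetical helper: explain the current state of CUDA_VISIBLE_DEVICES.
describe_cuda_visible_devices() {
    # ${VAR+set} expands to "set" only when VAR is defined (even if empty).
    if [ -z "${CUDA_VISIBLE_DEVICES+set}" ]; then
        echo "CUDA_VISIBLE_DEVICES is unset: all GPUs are visible to CUDA"
    elif [ -z "$CUDA_VISIBLE_DEVICES" ]; then
        echo "CUDA_VISIBLE_DEVICES is empty: no GPUs are visible to CUDA"
    else
        echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
    fi
}
```

On a cluster, run it inside the job itself (for example from an interactive session on a GPU node), since schedulers often set this variable per job.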

Environment Issues¶

Check Installed Packages¶

Bash
pixi list

Check Tool Versions¶

Bash
# Dorado
resources/tools/dorado/*/bin/dorado --version

# Modkit (managed by pixi)
pixi run modkit --version

# Snakemake
pixi run snakemake --version
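Before comparing versions, it can help to confirm that each tool resolves on `PATH` at all, and from where. The following is a minimal sketch; `check_tools` is a hypothetical helper, and the tool names you pass are up to you.

```bash
# Hypothetical helper: show where each tool resolves on PATH, or flag it as missing.
# Returns non-zero if any tool is missing.
check_tools() {
    ct_missing=0
    for ct_tool in "$@"; do
        if ct_path=$(command -v "$ct_tool" 2>/dev/null); then
            echo "found    $ct_tool -> $ct_path"
        else
            echo "missing  $ct_tool"
            ct_missing=1
        fi
    done
    return "$ct_missing"
}
```

Run inside `pixi shell`, a call such as `check_tools samtools bwa pod5 modkit` should resolve to paths under the pixi environment; a path elsewhere suggests a system installation is shadowing it.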

Recreate Environment¶

Bash
# Clean and reinstall
rm -rf .pixi
pixi install

File Inspection¶

BAM Files¶

Bash
# View header
samtools view -H results/bam/final/sample1.bam

# View first few reads
samtools view results/bam/final/sample1.bam | head

# Check specific tags
samtools view results/bam/final/sample1.bam | head -1 | tr '\t' '\n' | grep "^CL:"

# Get statistics
samtools flagstat results/bam/final/sample1.bam
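Checking a tag on the first read only shows that one record has it. To see how many reads carry the tag, the SAM text stream can be counted with `awk`. This is a sketch: `count_sam_tag` is a hypothetical helper that reads SAM records on stdin and relies on optional fields starting at column 12 in `TAG:TYPE:VALUE` form.

```bash
# Hypothetical helper: read SAM records on stdin and report how many carry a tag.
count_sam_tag() {
    awk -F'\t' -v tag="$1" '
        /^@/ { next }    # skip header lines, if present
        {
            total++
            # optional fields start at column 12 and look like TAG:TYPE:VALUE
            for (i = 12; i <= NF; i++) {
                if (index($i, tag ":") == 1) { tagged++; break }
            }
        }
        END { printf "%d of %d reads have %s\n", tagged, total, tag }
    '
}
```

For example: `samtools view results/bam/final/sample1.bam | count_sam_tag CL`. If far fewer reads are tagged than expected, the step that writes the tag is a good place to look next.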

POD5 Files¶

Bash
# Summary
pod5 inspect summary results/pod5/sample1/sample1.pod5

# Read info
pod5 inspect reads results/pod5/sample1/sample1.pod5 | head

TSV Files¶

Bash
# View compressed TSV
zcat results/summary/tables/sample1/sample1.charging.cpm.tsv.gz | head | column -t

# Count lines
zcat results/summary/tables/sample1/sample1.charging_prob.tsv.gz | wc -l
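A line count alone does not reveal truncated or ragged rows. The sketch below checks that every row has the same number of tab-separated columns as the header; `check_tsv_gz` is a hypothetical helper and assumes the first line is a header.

```bash
# Hypothetical helper: summarize a gzipped TSV and flag rows whose column
# count differs from the first line. Exits non-zero if there are no data
# rows or any ragged rows.
check_tsv_gz() {
    gzip -dc -- "$1" | awk -F'\t' '
        NR == 1 { ncol = NF }
        NF != ncol { bad++ }
        END {
            printf "rows=%d cols=%d ragged=%d\n", NR, ncol, bad
            exit (NR < 2 || bad > 0)
        }
    '
}
```

Here `rows` counts the header line too, so `rows=1` means the table has a header but no data.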

Getting Help¶

Create a Minimal Example¶

When reporting issues:

Identify the failing rule

Create minimal reproduction:

Bash
pixi run snakemake <specific_output> -n -p --configfile=config/config-test.yml

Include:
Error message
Log file contents
Config used
Environment info
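Gathering these pieces by hand is error-prone, so a small script can bundle them into one file. This is only a sketch: `collect_debug_info` and the default report name `debug_report.txt` are hypothetical, and the log and config paths are whatever applies to your failing run.

```bash
# Hypothetical helper: bundle system info, a log tail, and the config into one report.
collect_debug_info() {
    cdi_log="$1"
    cdi_config="$2"
    cdi_report="${3:-debug_report.txt}"
    if [ ! -r "$cdi_log" ] || [ ! -r "$cdi_config" ]; then
        echo "usage: collect_debug_info <log_file> <config_file> [report_file]" >&2
        return 1
    fi
    {
        echo "## Environment"
        uname -a
        echo
        echo "## Log Output (last 50 lines of $cdi_log)"
        tail -n 50 "$cdi_log"
        echo
        echo "## Config ($cdi_config)"
        cat "$cdi_config"
    } > "$cdi_report"
    echo "Wrote $cdi_report"
}
```

For example: `collect_debug_info results/logs/rebasecall/sample1 config/config-test.yml`. Review the report before posting it, since configs and logs can contain private paths or sample names.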

File an Issue¶

GitHub Issues

Include:

Markdown
## Environment
- OS: [e.g., CentOS 7]
- Snakemake version: [pixi run snakemake --version]
- Execution mode: [local/LSF/SLURM]

## Error
[Error message]

## Steps to Reproduce
1. [Step 1]
2. [Step 2]

## Log Output
[Paste relevant log output]

## Config
```yaml
[Relevant config]
```