# GPU Configuration
Configure GPU resources for the aa-tRNA-seq pipeline.
## GPU Requirements

Two rules require GPU access:

| Rule | Purpose | GPU Usage |
|---|---|---|
| `rebasecall` | Dorado basecalling | CUDA neural network inference |
| `classify_charging` | Remora classification | PyTorch model inference |
Both rules benefit significantly from GPU acceleration. CPU-only execution is possible but substantially slower.
## GPU Resource Flow

```mermaid
flowchart LR
    subgraph "GPU Rules"
        A[rebasecall<br/>Dorado] --> B[classify_charging<br/>Remora]
    end
    subgraph "Resources"
        C[POD5 Signal Data]
        D[CUDA GPU]
    end
    C --> A
    D --> A
    D --> B
```
## Cluster Configuration

### LSF GPU Settings

GPU resources for LSF are defined in `cluster/lsf/config.yaml`.
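A sketch of representative values, assuming a Snakemake LSF profile with `set-resources` entries; the queue name is illustrative, and the `lsf_extra` string matches the exclusive-access example later on this page:

```yaml
# Illustrative values only; adapt to your site's queues and GPUs
set-resources:
  - rebasecall:lsf_queue="gpu"
  - rebasecall:lsf_extra="-gpu num=1:j_exclusive=yes"
  - classify_charging:lsf_queue="gpu"
  - classify_charging:lsf_extra="-gpu num=1:j_exclusive=yes"
resources:
  - ngpu=4
```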
### SLURM GPU Settings

The SLURM profile carries equivalent GPU settings.
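A comparable sketch for SLURM; `slurm_partition` and `slurm_extra` follow the standard Snakemake SLURM resource names, and the partition and `gres` strings are placeholders:

```yaml
# Illustrative values only; adapt to your site's partitions and GPUs
set-resources:
  - rebasecall:slurm_partition="gpu"
  - rebasecall:slurm_extra="--gres=gpu:1"
  - classify_charging:slurm_partition="gpu"
  - classify_charging:slurm_extra="--gres=gpu:1"
resources:
  - ngpu=4
```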
## Configuration Options

### GPU Concurrency Limit

Control how many GPU jobs run simultaneously:
```yaml
resources:
  - ngpu=4
```
Set this to match your available GPUs or queue limits.
CUDA Toolkit Version¶
The pipeline installs PyTorch with CUDA 12.4 support by default. To use a different CUDA version, set the CUDA_VERSION environment variable before activating the environment:
| Bash | |
|---|---|
1 2 3 4 5 6 7 8 9 10 11 | |
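A minimal sketch of selecting a different wheel tag; `cu121` is only an example choice:

```bash
# Choose the PyTorch CUDA wheel tag before activating the environment.
export CUDA_VERSION=cu121
echo "Requested PyTorch build: ${CUDA_VERSION}"
```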
Available CUDA wheel tags: `cu118`, `cu121`, `cu124`, `cpu`.

!!! tip "Check your CUDA version"

    Run `nvidia-smi` to see the CUDA version your driver supports. Choose a PyTorch CUDA build that matches or is lower than your driver's version.
### Exclusive GPU Access

Request exclusive GPU access to avoid memory conflicts:

```yaml
set-resources:
  - rebasecall:lsf_extra="-gpu num=1:j_exclusive=yes"
```
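For SLURM, a comparable setting can be sketched with sbatch's `--exclusive` flag (the `slurm_extra` key assumes a standard Snakemake SLURM profile):

```yaml
set-resources:
  - rebasecall:slurm_extra="--gres=gpu:1 --exclusive"
```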
### GPU Type Selection

If your cluster has multiple GPU types:
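A sketch of pinning a GPU model on either scheduler; the model strings (`NVIDIAA100_SXM4_80GB`, `a100`) are placeholders for whatever your site actually reports:

```yaml
# LSF: request a specific model via gmodel (placeholder name)
set-resources:
  - rebasecall:lsf_extra="-gpu num=1:gmodel=NVIDIAA100_SXM4_80GB"
```

```yaml
# SLURM: request a typed gres (placeholder name)
set-resources:
  - rebasecall:slurm_extra="--gres=gpu:a100:1"
```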
## Local GPU Execution

### CUDA_VISIBLE_DEVICES

The pipeline respects `CUDA_VISIBLE_DEVICES`:
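A minimal sketch of restricting the pipeline to one device before launching it:

```bash
# Expose only the first GPU; Dorado and Remora will see a single device.
# Use e.g. "0,1" to expose two GPUs.
export CUDA_VISIBLE_DEVICES=0
echo "Visible GPUs: ${CUDA_VISIBLE_DEVICES}"
```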
### Limit GPU Jobs Locally
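The `ngpu` resource from the concurrency limit above also applies to local runs; a sketch (the `--cores` value is illustrative):

```bash
# Permit at most one GPU rule at a time on this machine
snakemake --cores 8 --resources ngpu=1
```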
## Memory Requirements

GPU rules also require significant system memory:

| Rule | GPU Memory | System Memory |
|---|---|---|
| `rebasecall` | ~8-16 GB | 24 GB |
| `classify_charging` | ~4-8 GB | 24 GB |
## Performance Considerations

### Dorado (rebasecall)

- Processes POD5 signal data through a neural network
- Throughput: ~100-500 reads/second depending on GPU
- Benefits from newer GPU architectures (Ampere, Ada Lovelace)

### Remora (classify_charging)

- Analyzes signal at the CCA 3' end
- Lower throughput than Dorado
- Memory usage depends on batch size
## Troubleshooting

### CUDA Out of Memory

Symptom:

```text
CUDA out of memory
```

Solutions:

- Ensure exclusive GPU access:

    ```yaml
    set-resources:
      - rebasecall:lsf_extra="-gpu num=1:j_exclusive=yes"
    ```

- Reduce concurrent GPU jobs:

    ```yaml
    resources:
      - ngpu=4  # Reduce from default
    ```

- Check for other GPU processes:

    ```bash
    nvidia-smi
    ```
### GPU Not Detected

Symptom: jobs fail because no CUDA device is available.

Solutions:

- Verify CUDA installation:

    ```bash
    nvidia-smi
    ```

- Check `CUDA_VISIBLE_DEVICES`:

    ```bash
    echo $CUDA_VISIBLE_DEVICES
    ```

- Verify the job is on a GPU node:

    ```bash
    # LSF
    bjobs -l <job_id> | grep -i gpu

    # SLURM
    scontrol show job <job_id> | grep -i gres
    ```
### Wrong GPU Type

Symptom: Job runs on an incompatible GPU.

Solution: specify the GPU type explicitly in your cluster profile, as described under GPU Type Selection above.
### Jobs Waiting for GPU

Symptom: GPU jobs pending indefinitely.

Solutions:

- Check GPU queue status:

    ```bash
    # LSF
    bqueues -l gpu

    # SLURM
    sinfo -p gpu
    ```

- Reduce concurrent GPU jobs:

    ```yaml
    resources:
      - ngpu=2
    ```

- Check fair-share limits with your admin.
## GPU Monitoring

### NVIDIA SMI

Monitor GPU usage during execution:
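These are standard `nvidia-smi` invocations, not pipeline-specific:

```bash
# Refresh utilization and memory once per second
watch -n 1 nvidia-smi

# Stream compact per-device utilization statistics
nvidia-smi dmon
```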
### Check Running GPU Jobs
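Plausible scheduler queries, assuming the GPU queue/partition is named `gpu` at your site:

```bash
# LSF
bjobs -q gpu

# SLURM
squeue -p gpu
```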
## CPU Fallback

If GPUs are unavailable, Dorado can run on CPU (much slower):
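One plausible invocation, using Dorado's `-x/--device` option; the model and file names are placeholders:

```bash
dorado basecaller -x cpu <model> reads.pod5 > calls.bam
```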
!!! warning "Performance Impact"

    CPU-only basecalling is 10-100x slower than GPU. Not recommended for production use.
## Next Steps
- LSF Setup - LSF cluster configuration
- SLURM Setup - SLURM cluster configuration