Demultiplexing
Guide for using WarpDemuX barcode demultiplexing with pooled/multiplexed samples.
Overview
WarpDemuX enables barcode demultiplexing for pooled Nano-tRNAseq sequencing runs. Multiple samples can be sequenced together and separated computationally based on barcode signal patterns.
flowchart LR
A[Pooled POD5<br/>4 barcoded samples] --> B[WarpDemuX]
B --> C[Sample A<br/>barcode03]
B --> D[Sample B<br/>barcode04]
B --> E[Sample C<br/>barcode05]
B --> F[Sample D<br/>barcode07]
When to Use Demultiplexing
Use WarpDemuX when:
- Multiple samples were pooled in a single sequencing run
- Samples were prepared with WarpDemuX barcodes
- Using the Nano-tRNAseq protocol
Do not use when:
- Samples were sequenced individually (1 sample per run)
- Using Thomas splint adapter (incompatible)
- Barcodes were not used during library prep
Setup
1. Install WarpDemuX
| Bash | |
|---|---|
1 | |
This installs WarpDemuX along with other pipeline tools (dorado, remora).
2. Create YAML Sample File
Create a sample file in YAML format (required for demultiplexing):
| YAML | |
|---|---|
1 2 3 4 5 6 7 8 9 | |
3. Enable in Configuration
Create a config file with demux enabled:
| YAML | |
|---|---|
1 2 3 4 5 6 7 8 | |
Barcode Kits
WarpDemuX provides adapter-based barcode demultiplexing for Oxford Nanopore direct RNA sequencing. This pipeline uses tRNA-specific WarpDemuX models trained for the Nano-tRNAseq protocol.
Naming Convention
Model names follow the format: WDX[n_barcodes][alt_set]_tRNA_rna004_v1_0
- WDX: WarpDemuX prefix
- [n_barcodes]: number of barcodes in the set (e.g., 4)
- [alt_set]: optional letter for alternative adapter sets (e.g., b)
- _tRNA_: indicates tRNA-specific model
- rna004_v1_0: ONT RNA004 chemistry version
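The naming convention can be checked mechanically. The following is a minimal sketch, assuming only the two tRNA model names listed on this page; the regular expression, the function name `parse_model_name`, and the returned keys are illustrative and not part of WarpDemuX.

```python
import re

# Pattern for names like WDX4_tRNA_rna004_v1_0 or WDX4b_tRNA_rna004_v1_0.
# The group names are illustrative, not WarpDemuX identifiers.
MODEL_NAME = re.compile(
    r"^WDX(?P<n_barcodes>\d+)(?P<alt_set>[a-z]?)_tRNA_(?P<chemistry>rna004_v1_0)$"
)


def parse_model_name(name: str) -> dict:
    """Split a tRNA model name into the components described above."""
    match = MODEL_NAME.match(name)
    if match is None:
        raise ValueError(f"not a tRNA WarpDemuX model name: {name!r}")
    return {
        "n_barcodes": int(match["n_barcodes"]),
        "alt_set": match["alt_set"] or None,  # None when no alternative set letter
        "chemistry": match["chemistry"],
    }
```

A check like this could catch a mistyped kit name before the pipeline starts, although the authoritative list of models is whatever the installed WarpDemuX version ships.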
Available Kits
| Kit | # Barcodes | Barcode IDs | Notes |
|---|---|---|---|
| WDX4_tRNA_rna004_v1_0 | 4 | barcode03, barcode04, barcode05, barcode07 | Recommended, +3-7% recovery |
| WDX4b_tRNA_rna004_v1_0 | 4 | barcode04, barcode05, barcode07, barcode11 | Alternative adapter set |
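Because the two kits share some barcode IDs but not all, it can help to confirm that every sample's barcode belongs to the configured kit. This is a sketch under assumptions: the kit contents are copied from the table above, while `invalid_barcodes` and the `{sample_name: barcode_id}` input shape are hypothetical and do not reflect the pipeline's own validation.

```python
# Barcode IDs per kit, copied from the table above.
KIT_BARCODES = {
    "WDX4_tRNA_rna004_v1_0": ["barcode03", "barcode04", "barcode05", "barcode07"],
    "WDX4b_tRNA_rna004_v1_0": ["barcode04", "barcode05", "barcode07", "barcode11"],
}


def invalid_barcodes(kit: str, assigned: dict) -> dict:
    """Return the {sample: barcode} entries whose barcode is not in the kit.

    `assigned` is a hypothetical {sample_name: barcode_id} mapping.
    """
    if kit not in KIT_BARCODES:
        raise KeyError(f"unknown barcode kit: {kit}")
    valid = set(KIT_BARCODES[kit])
    return {sample: bc for sample, bc in assigned.items() if bc not in valid}
```

An empty result means every assignment is consistent with the kit; a non-empty one points at the samples to fix in the sample file.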
Standard RNA004 Models
WarpDemuX also offers standard RNA004 models (WDX4, WDX6, WDX10) for mRNA and other direct RNA applications. See the WarpDemuX README for details. This pipeline requires the _tRNA_ variants.
Protocol Compatibility
WarpDemuX-tRNA models are developed specifically for the Nano-tRNAseq protocol. They do NOT work with data using the Thomas splint adapter.
Sample File Format
YAML Structure
| YAML | |
|---|---|
1 2 3 4 5 | |
Multiple Runs
| YAML | |
|---|---|
1 2 3 4 5 6 7 8 9 10 11 12 | |
Mixed Runs (Demux + Direct)
| YAML | |
|---|---|
1 2 3 4 5 6 7 8 9 10 11 | |
Dual Barcoding (WDX + EDX)
The pipeline supports dual barcoding — combining WDX (5' signal-based) and EDX (3' adapter sequence-based) barcodes for two-axis demultiplexing.
- WDX (WarpDemuX): 5' signal barcode predicted from the raw nanopore signal by WarpDemuX. This is the primary demultiplexing barcode used to split POD5 reads into samples.
- EDX: 3' adapter sequence variant (e.g., edx01, edx02). Different adapter sequences at the 3' end identify which adapter was used during library prep. EDX filtering happens early (right after basecalling, before alignment), so downstream rules only process matching reads.
When to Use Dual Barcoding
Use dual barcoding when samples are multiplexed with both WDX adapters at the 5' end and different EDX adapter sequences at the 3' end. This enables true two-axis demultiplexing: WDX splits reads at the POD5 level, then EDX splits both FASTQ and POD5 before alignment based on 3' adapter identity. An optional concordance analysis can verify agreement between the two axes.
Dict Format for Samples
When using dual barcoding, specify sample values as a dict with wdx and edx keys instead of a plain barcode string. The edx value must match a name from adapters.three_prime in the config (e.g., edx01, edx02):
| YAML | |
|---|---|
1 2 3 4 5 6 7 8 9 10 11 12 | |
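The two accepted sample value forms (plain barcode string, or dict with wdx and edx keys) can be normalized as sketched below. This is an illustration only: `normalize_sample` is a hypothetical helper, and treating edx as optional inside the dict is an assumption rather than documented pipeline behavior.

```python
def normalize_sample(value, three_prime_adapters):
    """Return (wdx, edx) for a sample value in either supported form.

    value: a plain barcode string, or a dict with `wdx` and `edx` keys.
    three_prime_adapters: adapter names taken from adapters.three_prime.
    """
    if isinstance(value, str):
        return value, None  # plain barcode string: WDX only
    if not isinstance(value, dict) or "wdx" not in value:
        raise ValueError(f"expected a barcode string or a dict with 'wdx': {value!r}")
    edx = value.get("edx")  # assumption: a dict without edx is treated as WDX only
    if edx is not None and edx not in three_prime_adapters:
        raise ValueError(f"edx {edx!r} is not listed in adapters.three_prime")
    return value["wdx"], edx
```

For example, `normalize_sample({"wdx": "barcode03", "edx": "edx01"}, ["edx01", "edx02"])` yields the pair `("barcode03", "edx01")`, while an edx name missing from the adapter list raises an error.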
EDX Early Splitting
When a sample has an edx assignment, the pipeline detects 3' adapter identity on the unaligned BAM right after basecalling, then splits both FASTQ and POD5 by adapter before alignment. This avoids redundant processing when two samples share a WDX barcode but have different EDX adapters.
The EDX splitting flow:
| Text Only | |
|---|---|
1 2 3 | |
Reads with no detected 3' adapter get "none" in the adapter detection TSV and are excluded from all samples. For samples without an edx assignment, the pipeline flow is unchanged.
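The splitting logic described in this section can be sketched in memory as follows. This is a simplified illustration, not the pipeline's implementation: `split_by_edx` and both input shapes are hypothetical, and the real rules operate on the adapter detection TSV, FASTQ, and POD5 files.

```python
def split_by_edx(detected, sample_edx):
    """Assign reads to samples that share one WDX barcode but differ in EDX.

    detected: {read_id: adapter_name}, with "none" for undetected adapters.
    sample_edx: {sample_name: expected_edx_adapter} (hypothetical names).
    """
    by_adapter = {edx: sample for sample, edx in sample_edx.items()}
    if len(by_adapter) != len(sample_edx):
        raise ValueError("two samples share the same EDX adapter")
    split = {sample: [] for sample in sample_edx}
    for read_id, adapter in detected.items():
        # Lookup fails for "none" and for adapters no sample expects,
        # so those reads are excluded from every sample.
        sample = by_adapter.get(adapter)
        if sample is not None:
            split[sample].append(read_id)
    return split
```

Each read lands in at most one sample, which is why splitting before alignment avoids processing the same reads twice.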
EDX Concordance Output (QC)
When samples have EDX assignments and edx.enabled: true, the edx_concordance rule produces a QC concordance table at summary/edx/edx_concordance.tsv.gz. This table shows how reads assigned to each WDX sample distribute across EDX adapter identities, useful for verifying demultiplexing accuracy. The concordance is computed from the pre-alignment adapter detection TSVs (which contain ALL reads), not from final BAMs.
Output columns:
| Column | Description |
|---|---|
| sample | WDX sample name |
| edx_adapter | 3' adapter identity detected (e.g., edx01, edx02, none) |
| n_reads | Number of reads with this adapter |
| pct | Percentage of the sample's reads with this adapter |
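To make the column definitions concrete, here is a minimal sketch of how such a table could be derived from per-read (sample, adapter) pairs. It is not the code in workflow/scripts/edx_concordance.py; the function name, the input shape, and rounding pct to two decimals are assumptions.

```python
from collections import Counter


def concordance(assignments):
    """Build concordance rows from (sample, edx_adapter) pairs, one per read.

    Returns dicts with the documented columns: sample, edx_adapter,
    n_reads, pct. Rounding pct to two decimals is an arbitrary choice.
    """
    pair_counts = Counter(assignments)
    sample_totals = Counter()
    for (sample, _adapter), n in pair_counts.items():
        sample_totals[sample] += n
    rows = []
    for (sample, adapter), n in sorted(pair_counts.items()):
        rows.append({
            "sample": sample,
            "edx_adapter": adapter,
            "n_reads": n,
            "pct": round(100.0 * n / sample_totals[sample], 2),
        })
    return rows
```

Within one sample the pct values sum to 100 (up to rounding), so a large share under an unexpected adapter or under none stands out immediately.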
Enable EDX concordance
EDX concordance output requires edx.enabled: true in the pipeline config. The rule runs automatically when enabled and at least one sample has an edx assignment.
Debugging unmatched reads
The full adapter detection TSV at demux/edx/{sample}/{sample}.edx_adapters.tsv.gz records every read's adapter assignment including "none", useful for debugging.
Pipeline Flow
With demultiplexing enabled, the pipeline adds these steps before standard processing:
flowchart TB
subgraph Input
A[Pooled POD5 files]
end
subgraph Demux[Demultiplexing Steps]
B[warpdemux<br/>Predict barcodes]
C[parse_warpdemux<br/>Create mapping]
D[extract_sample_reads<br/>Filter by barcode]
E[split_pod5<br/>Split per sample]
end
subgraph Standard[Standard Pipeline]
F[rebasecall]
G[bwa_align]
H[classify_charging]
I[...]
end
A --> B --> C --> D --> E --> F --> G --> H --> I
Demux Rules
warpdemux
Runs WarpDemuX barcode prediction directly on raw POD5 files.
| Property | Value |
|---|---|
| Input | Raw POD5 files from run directory |
| Output | demux/warpdemux_output/{run_id}/ |
| Threads | Configurable (default: 8) |
parse_warpdemux
Parses WarpDemuX predictions to create barcode mapping.
| Property | Value |
|---|---|
| Input | WarpDemuX output directory |
| Output | demux/read_ids/{run_id}/barcode_mapping.tsv.gz |
Output format:
| Column | Description |
|---|---|
| read_id | Nanopore read identifier |
| predicted_barcode | Assigned barcode (e.g., "barcode03") |
extract_sample_reads
Extracts read IDs for a specific sample's barcode.
| Property | Value |
|---|---|
| Input | Barcode mapping file |
| Output | demux/read_ids/{sample}.txt |
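The extraction step amounts to filtering the barcode mapping by one barcode. The sketch below assumes the mapping is a gzipped TSV with a header row naming the two documented columns (read_id, predicted_barcode) and that the read ID list holds one ID per line; both function names are hypothetical and this is not the rule's actual code.

```python
import csv
import gzip


def read_ids_for_barcode(mapping_path, barcode):
    """Return read IDs assigned to `barcode` in a barcode mapping file.

    Assumes a gzipped TSV with a header row containing the columns
    read_id and predicted_barcode.
    """
    with gzip.open(mapping_path, "rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return [r["read_id"] for r in reader if r["predicted_barcode"] == barcode]


def write_read_ids(read_ids, out_path):
    """Write one read ID per line (assumed layout of the read ID list)."""
    with open(out_path, "w") as out:
        out.writelines(f"{read_id}\n" for read_id in read_ids)
```

Running the first function for a sample's barcode and passing the result to the second would produce a list in the shape the next rule consumes, under the stated assumptions.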
split_pod5
Filters raw POD5 files by sample using read ID list.
| Property | Value |
|---|---|
| Input | Raw POD5 files from run, read ID list |
| Output | demux/pod5/{sample}.pod5 |
detect_edx_adapters
Detects 3' adapter identity per read on the unaligned BAM (before alignment). Produces a gzipped TSV mapping each read_id to its best-matching 3' adapter name. Only runs for samples with an edx assignment.
| Property | Value |
|---|---|
| Input | Rebasecalled uBAM |
| Output | demux/edx/{sample}/{sample}.edx_adapters.tsv.gz |
| Script | workflow/scripts/detect_3p_adapters.py |
extract_edx_read_ids
Extracts read IDs matching the sample's expected EDX adapter from the detection TSV.
| Property | Value |
|---|---|
| Input | Adapter detection TSV |
| Output | demux/edx/{sample}/{sample}.edx_read_ids.txt |
filter_fastq_by_edx
Extracts FASTQ for reads matching the sample's EDX adapter from the uBAM.
| Property | Value |
|---|---|
| Input | Rebasecalled uBAM + read IDs |
| Output | demux/edx/fq/{sample}/{sample}.fq.gz |
filter_pod5_by_edx
Filters POD5 to keep only reads matching the sample's EDX adapter.
| Property | Value |
|---|---|
| Input | WDX-split (or merged) POD5 + read IDs |
| Output | demux/edx/pod5/{sample}/{sample}.pod5 |
edx_concordance
Builds a concordance table of WDX sample assignment vs EDX (3' adapter) identity. Uses pre-alignment adapter detection TSVs (which contain ALL reads) rather than final BAMs. Only runs when edx.enabled: true and samples have EDX assignments.
| Property | Value |
|---|---|
| Input | Adapter detection TSVs for all EDX-assigned samples |
| Output | summary/edx/edx_concordance.tsv.gz |
| Script | workflow/scripts/edx_concordance.py |
Running
Dry Run
| Bash | |
|---|---|
1 | |
Execute
| Bash | |
|---|---|
1 2 3 4 5 | |
Output Structure
With demultiplexing, outputs include:
| Text Only | |
|---|---|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 | |
Configuration Options
| YAML | |
|---|---|
1 2 3 4 5 | |
| Option | Description | Default |
|---|---|---|
| enabled | Enable demultiplexing | false |
| barcode_kit | Default barcode kit | WDX4_tRNA_rna004_v1_0 |
| save_boundaries | Save demux boundaries | true |
| threads | WarpDemuX threads | 8 |
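The defaults in the table behave like a base layer that user settings override. A minimal sketch of that overlay follows; `DEMUX_DEFAULTS` mirrors the table, whereas `resolve_demux_options` and the rejection of unknown keys are illustrative choices, not confirmed pipeline behavior.

```python
# Defaults copied from the options table.
DEMUX_DEFAULTS = {
    "enabled": False,
    "barcode_kit": "WDX4_tRNA_rna004_v1_0",
    "save_boundaries": True,
    "threads": 8,
}


def resolve_demux_options(user_options=None):
    """Overlay user-supplied options on the defaults, rejecting unknown keys."""
    user_options = user_options or {}
    unknown = set(user_options) - set(DEMUX_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown demux option(s): {sorted(unknown)}")
    return {**DEMUX_DEFAULTS, **user_options}
```

Setting only `enabled` therefore leaves the kit, boundary saving, and thread count at their documented defaults.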
Troubleshooting
No Reads for Sample
If a sample has zero reads after demux:
- Check barcode assignment in sample file
- Verify barcode kit matches library prep
- Check demux_summary.tsv.gz for barcode distribution
| Bash | |
|---|---|
1 | |
WarpDemuX Fails
Common issues:
- Memory: Increase mem_mb in cluster profile for warpdemux rule
- Model not found: Verify barcode_kit name is correct
- Incompatible data: WarpDemuX-tRNA only works with Nano-tRNAseq protocol
Unbalanced Barcodes
If barcode distribution is very unbalanced:
- Check library prep QC
- Review loading concentrations
- Consider if samples have different RNA amounts
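One way to quantify imbalance is to compare each barcode's share of reads with an even split. The sketch below works from a plain list of per-read barcode calls; both function names and the `min_share` threshold are hypothetical, and the layout of demux_summary.tsv.gz is not assumed here.

```python
from collections import Counter


def barcode_fractions(predicted_barcodes):
    """Return {barcode: fraction of reads} for a list of per-read barcodes."""
    counts = Counter(predicted_barcodes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {barcode: n / total for barcode, n in counts.items()}


def underrepresented(fractions, expected_barcodes, min_share=0.5):
    """List expected barcodes that fall below min_share of an even split.

    min_share is an arbitrary illustrative threshold, not a pipeline setting.
    """
    even = 1.0 / len(expected_barcodes)
    return [b for b in expected_barcodes if fractions.get(b, 0.0) < min_share * even]
```

With four barcodes and the default threshold, any barcode holding less than 12.5% of reads is flagged, including a barcode with no reads at all.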
Best Practices
- Verify barcode distribution before running full pipeline:

  ```bash
  pixi run snakemake demux/read_ids/{run_id}/demux_summary.tsv.gz \
    --configfile=config/config-demux.yml
  ```

- Use recommended kit (WDX4_tRNA_rna004_v1_0) for best recovery
- Check sample file format carefully: YAML indentation matters
- Monitor memory: WarpDemuX can require 32GB+ for large runs