Overview
This vignette demonstrates clover’s analysis workflow using E.
coli tRNA sequencing data from a T4 phage infection time-course
experiment. The test dataset includes 6 samples: 3 uninfected controls
(ctl) and 3 T4-infected (inf) wild-type E.
coli at 15 minutes post-infection, with 3 biological replicates per
condition.
Loading data with create_clover()
The simplest way to load results from the aa-tRNA-seq-pipeline
is with create_clover(), which reads a pipeline
configuration file and assembles a
SummarizedExperiment.
config_path <- clover_example("ecoli/config.yaml")
sample_info <- data.frame(
sample_id = c(
"wt-15-ctl-01",
"wt-15-ctl-02",
"wt-15-ctl-03",
"wt-15-inf-01",
"wt-15-inf-02",
"wt-15-inf-03"
),
condition = rep(c("ctl", "inf"), each = 3),
replicate = rep(1:3, 2)
)
se <- create_clover(
config_path,
types = c("charging", "bcerror", "odds_ratios"),
sample_info = sample_info
)
se
#> class: SummarizedExperiment
#> dim: 190 6
#> metadata(5): config charging bcerror odds_ratios fasta
#> assays(1): counts
#> rownames(190): host-tRNA-Ala-GGC-1-1 host-tRNA-Ala-GGC-1-1-uncharged
#> ... phage-tRNA-Thr-TGT phage-tRNA-Thr-TGT-uncharged
#> rowData names(2): ref seq_length
#> colnames(6): wt-15-ctl-01 wt-15-ctl-02 ... wt-15-inf-02 wt-15-inf-03
#> colData names(3): sample_id condition replicateThe SE object contains:
- assay “counts”: abundance count matrix (charged + uncharged reads)
- colData: sample metadata with condition labels
- metadata: list with raw charging data, bcerror, odds ratios, FASTA reference, and pipeline config
assay(se, "counts")[1:5, ]
#> wt-15-ctl-01 wt-15-ctl-02 wt-15-ctl-03
#> host-tRNA-Ala-GGC-1-1 298 397 113
#> host-tRNA-Ala-GGC-1-1-uncharged 4393 5014 883
#> host-tRNA-Ala-GGC-1-2 1625 1506 265
#> host-tRNA-Ala-GGC-1-2-uncharged 4391 5130 862
#> host-tRNA-Ala-TGC-1-1 518 610 160
#> wt-15-inf-01 wt-15-inf-02 wt-15-inf-03
#> host-tRNA-Ala-GGC-1-1 189 347 1253
#> host-tRNA-Ala-GGC-1-1-uncharged 3640 7133 12837
#> host-tRNA-Ala-GGC-1-2 1299 1857 4409
#> host-tRNA-Ala-GGC-1-2-uncharged 3637 7411 12592
#> host-tRNA-Ala-TGC-1-1 233 553 1706
colData(se)
#> DataFrame with 6 rows and 3 columns
#> sample_id condition replicate
#> <character> <character> <integer>
#> wt-15-ctl-01 wt-15-ctl-01 ctl 1
#> wt-15-ctl-02 wt-15-ctl-02 ctl 2
#> wt-15-ctl-03 wt-15-ctl-03 ctl 3
#> wt-15-inf-01 wt-15-inf-01 inf 1
#> wt-15-inf-02 wt-15-inf-02 inf 2
#> wt-15-inf-03 wt-15-inf-03 inf 3Differential tRNA abundance
We can test for differential tRNA abundance between infected and uninfected conditions using the DESeq2 wrappers.
counts <- assay(se, "counts")
coldata <- as.data.frame(colData(se))
dds <- run_deseq(counts, coldata, design = ~condition)
#> estimating size factors
#> estimating dispersions
#> gene-wise dispersion estimates
#> mean-dispersion relationship
#> final dispersion estimates
#> fitting model and testing
res <- tidy_deseq_results(dds, contrast = c("condition", "inf", "ctl"))
plot_volcano(res)
Volcano plot of differential tRNA abundance (inf vs ctl).
Differential charging analysis
A unique feature of nanopore tRNA-seq is the ability to measure charging levels. We can test whether the ratio of charged to uncharged reads changes upon infection.
charging <- metadata(se)$charging
charging$condition <- ifelse(grepl("ctl", charging$sample_id), "ctl", "inf")
ratio_diff <- compute_charging_diffs(
charging,
numerator = "inf",
denominator = "ctl",
min_count = 50,
n_top = 20
)
plot_charging_diffs(ratio_diff)
Change in charging ratio (infected - control) per tRNA.
Abundance versus charging
We can visualize the relationship between abundance changes and charging ratio changes on a single scatter plot. This highlights tRNAs where expression and aminoacylation are coordinately or discordantly affected.
plot_abundance_charging(res, ratio_diff)
Abundance change versus charging ratio change per tRNA.
Base-calling error profiles
Base-calling error rates reflect RNA modifications that cause the nanopore basecaller to misidentify bases. Comparing error profiles between conditions can reveal modification changes.
# Load preprocessed bcerror data comparing wt vs TruB-del (tb) for 10 tRNAs
bcerror_rds <- readRDS(clover_example("ecoli/bcerror_summary.rds"))
bcerror_charged <- bcerror_rds$bcerror_charged
bcerror_summary <- bcerror_rds$bcerror_summary
# Plot error profiles for a few tRNAs
plot_trnas <- c(
"host-tRNA-Asn-GTT-1-1",
"host-tRNA-Glu-TTC-1-1",
"host-tRNA-Phe-GAA-1-1"
)
plot_bcerror_profile(bcerror_summary, refs = plot_trnas)
Base-calling error profiles for selected tRNAs.
Error rate difference heatmap
# Compute delta (wt - tb); positive = higher error in wt (modification present)
bcerror_delta <- compute_bcerror_delta(bcerror_summary, delta = wt - tb)
# Add Sprinzl coordinates and MODOMICS modification annotations
sprinzl <- read_sprinzl_coords(
clover_example("sprinzl/ecoliK12_global_coords.tsv.gz")
)
trna_fasta <- clover_example("ecoli/trna_only.fa.gz")
mods <- modomics_mods(trna_fasta, organism = "Escherichia coli")
heatmap_data <- prep_mod_heatmap(
bcerror_delta,
sprinzl_coords = sprinzl,
mods = mods
)
plot_mod_heatmap(
heatmap_data,
value_col = "delta",
ref_col = "trna_label",
cluster = FALSE,
color_limits = c(-0.1, 0.1),
highlight_col = "has_mod"
)
Difference in mean base-calling error (wt - TruB-del) across all tRNAs and positions.
Modification annotations from MODOMICS
The MODOMICS database
catalogs known RNA modifications. We already used
modomics_mods() above to annotate the heatmap with known
modification positions. This function maps MODOMICS entries onto your
reference sequences using pairwise alignment. Data for common organisms
is bundled with the package (see modomics_organisms()), so
no internet connection is needed.
mods
#> # A tibble: 698 × 4
#> ref pos mod_full mod1
#> <chr> <int> <chr> <chr>
#> 1 host-tRNA-Ala-TGC-1-1 17 dihydrouridine D
#> 2 host-tRNA-Ala-TGC-1-1 34 uridine 5-oxyacetic acid cmo5U
#> 3 host-tRNA-Ala-TGC-1-1 46 7-methylguanosine m7G
#> 4 host-tRNA-Ala-TGC-1-1 54 5-methyluridine m5U
#> 5 host-tRNA-Ala-TGC-1-1 55 pseudouridine Y
#> 6 host-tRNA-Ala-GGC-1-1 17 dihydrouridine D
#> 7 host-tRNA-Ala-GGC-1-1 46 7-methylguanosine m7G
#> 8 host-tRNA-Ala-GGC-1-1 54 5-methyluridine m5U
#> 9 host-tRNA-Ala-GGC-1-1 55 pseudouridine Y
#> 10 host-tRNA-Arg-ACG-1-1 8 4-thiouridine s4U
#> # ℹ 688 more rowsWe can overlay known modifications onto the base-calling error profiles. Positions with known modifications often correspond to elevated error rates, since the basecaller misidentifies modified nucleotides.
plot_bcerror_profile(bcerror_summary, refs = plot_trnas, mods = mods)
Base-calling error profiles with known modification positions marked.
tRNA secondary structure visualization
clover includes pre-computed tRNA cloverleaf SVGs for several model organisms. You can overlay modification annotations, circle outlines, and text color changes onto these structures.
# List available organisms and tRNAs
structure_organisms()
#> [1] "Escherichia coli" "Homo sapiens"
#> [3] "Saccharomyces cerevisiae"
head(structure_trnas("Escherichia coli"))
#> [1] "tRNA-Ala-GGC" "tRNA-Ala-TGC" "tRNA-Arg-ACG" "tRNA-Arg-CCG" "tRNA-Arg-CCT"
#> [6] "tRNA-Arg-TCT"Base structure
The simplest call renders the bare cloverleaf with base-pair lines and nucleotide letters.
svg_path <- plot_tRNA_structure("tRNA-Glu-TTC", "Escherichia coli")Annotated structure with modifications, outlines, and text colors
We can highlight the anticodon with filled circles, mark the discriminator base with an outline, and change the text color of specific positions.
# Known modifications for this tRNA
mods_glu_struct <- mods |>
filter(ref == "host-tRNA-Glu-TTC-1-1")
# Look up sequence positions for Sprinzl anticodon (34-36) and
# discriminator base (73)
glu_coords <- sprinzl |>
filter(trna_id == "tRNA-Glu-UUC-1-1")
ac_pos <- glu_coords |>
filter(sprinzl_label %in% c("34", "35", "36")) |>
pull(pos)
disc_pos <- glu_coords |>
filter(sprinzl_label == "73") |>
pull(pos)
# Anticodon positions as filled circles
anticodon <- tibble::tibble(
pos = ac_pos,
mod1 = rep("anticodon", 3)
)
# Combine MODOMICS mods with anticodon highlight
all_mods <- bind_rows(mods_glu_struct, anticodon)
# Discriminator base as an outline
disc_outline <- tibble::tibble(pos = disc_pos, group = "discriminator")
# Change discriminator text to red
disc_text <- tibble::tibble(pos = disc_pos, color = "#E41A1C")
svg_path <- plot_tRNA_structure(
"tRNA-Glu-TTC",
"Escherichia coli",
modifications = all_mods,
mod_palette = c(default_mod_palette(), anticodon = "#4DAF4A"),
outlines = disc_outline,
outline_palette = c(discriminator = "#E41A1C"),
text_colors = disc_text
)Long variable arm tRNAs
Leu and Ser tRNAs have a 4-stem cloverleaf with a variable arm stem, which is properly rendered from the tRNAscan-SE covariance model alignment.
svg_path <- plot_tRNA_structure("tRNA-Leu-CAA", "Escherichia coli")Modification co-occurrence on structure
We can visualize pairwise modification co-occurrence directly on the tRNA cloverleaf. Arcs connect position pairs with significant odds ratios: blue arcs indicate mutual exclusivity (negative log OR) and vermillion arcs indicate co-occurrence (positive log OR). Stroke width encodes the magnitude.
or_data <- metadata(se)$odds_ratios
# Filter OR data for one tRNA and one sample, clean, and filter for
# significance. filter_linkages() returns a tibble ready for plotting.
linkages_glu <- or_data |>
filter(
ref == "host-tRNA-Glu-TTC-1-1",
sample_id == "wt-15-ctl-01"
) |>
clean_odds_ratios() |>
filter_linkages()
svg_path <- plot_tRNA_structure(
"tRNA-Glu-TTC",
"Escherichia coli",
modifications = mods_glu_struct,
linkages = linkages_glu
)Next steps
For isodecoder-level modification rewiring analysis — including odds
ratio aggregation, ratio of odds ratios, dimensionality reduction, and
network visualization — see
vignette("rewiring", package = "clover").
Session info
sessionInfo()
#> R version 4.5.2 (2025-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
#> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
#> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
#> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] tidyr_1.3.2 dplyr_1.2.0
#> [3] SummarizedExperiment_1.40.0 Biobase_2.70.0
#> [5] GenomicRanges_1.62.1 Seqinfo_1.0.0
#> [7] IRanges_2.44.0 S4Vectors_0.48.0
#> [9] BiocGenerics_0.56.0 generics_0.1.4
#> [11] MatrixGenerics_1.22.0 matrixStats_1.5.0
#> [13] clover_0.0.0.9000
#>
#> loaded via a namespace (and not attached):
#> [1] tidyselect_1.2.1 farver_2.1.2 Biostrings_2.78.0
#> [4] S7_0.2.1 fastmap_1.2.0 digest_0.6.39
#> [7] lifecycle_1.0.5 pwalign_1.6.0 magrittr_2.0.4
#> [10] compiler_4.5.2 rlang_1.1.7 sass_0.4.10
#> [13] tools_4.5.2 utf8_1.2.6 yaml_2.3.12
#> [16] gt_1.3.0 knitr_1.51 S4Arrays_1.10.1
#> [19] labeling_0.4.3 htmlwidgets_1.6.4 bit_4.6.0
#> [22] DelayedArray_0.36.0 xml2_1.5.2 RColorBrewer_1.1-3
#> [25] abind_1.4-8 BiocParallel_1.44.0 withr_3.0.2
#> [28] purrr_1.2.1 desc_1.4.3 grid_4.5.2
#> [31] ggplot2_4.0.2 scales_1.4.0 cli_3.6.5
#> [34] rmarkdown_2.30 crayon_1.5.3 ragg_1.5.0
#> [37] tzdb_0.5.0 commonmark_2.0.0 cachem_1.1.0
#> [40] stringr_1.6.0 parallel_4.5.2 XVector_0.50.0
#> [43] vctrs_0.7.1 Matrix_1.7-4 jsonlite_2.0.0
#> [46] litedown_0.9 hms_1.1.4 bit64_4.6.0-1
#> [49] ggrepel_0.9.6 systemfonts_1.3.1 locfit_1.5-9.12
#> [52] jquerylib_0.1.4 glue_1.8.0 reactR_0.6.1
#> [55] pkgdown_2.2.0 codetools_0.2-20 ggtext_0.1.2
#> [58] cowplot_1.2.0 stringi_1.8.7 gtable_0.3.6
#> [61] tibble_3.3.1 pillar_1.11.1 htmltools_0.5.9
#> [64] reactable_0.4.5 R6_2.6.1 textshaping_1.0.4
#> [67] vroom_1.7.0 evaluate_1.0.5 lattice_0.22-7
#> [70] markdown_2.0 readr_2.2.0 gridtext_0.1.6
#> [73] bslib_0.10.0 Rcpp_1.1.1 SparseArray_1.10.8
#> [76] DESeq2_1.50.2 xfun_0.56 fs_1.6.6
#> [79] forcats_1.0.1 pkgconfig_2.0.3