Long-read sequencing platforms & applications

Jay Hesselberth

RNA Bioscience Initiative | CU Anschutz

2024-10-21

Short & long read DNA sequencing

See the forest for the trees

We have entered a new era (Swifties unite) wherein routine application of long-read sequencing is revealing new principles of gene regulation.

Comparing data from short- and long-read experiments

Things that are invisible to short-read sequencing

Genomic features greater than 500 bp in size (i.e., longer than a typical Illumina fragment):

  • Patterns of pre-mRNA splicing (which exons / introns in a molecule)
  • Structural genomic variants
  • Chromatin structure across a single locus.

The analyses we have examined in class rely on patterns that emerge from many small DNA or RNA fragments.

Long-read experiments examine single molecules and can capture patterns that span an entire DNA or RNA molecule.
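A minimal sketch of why read length matters here: a read can only report on a feature that it fully spans. The coordinates, read lengths, and the `spans` helper below are hypothetical values chosen for illustration, not data from any experiment.

```python
def spans(read, feature):
    """Return True if the read interval fully contains the feature interval."""
    read_start, read_end = read
    feat_start, feat_end = feature
    return read_start <= feat_start and feat_end <= read_end


# Hypothetical 3 kb structural variant at positions 10,000-13,000.
variant = (10_000, 13_000)

short_read = (10_900, 11_050)  # 150 bp read that lands inside the variant
long_read = (5_000, 30_000)    # 25 kb read that covers the whole locus

short_spans = spans(short_read, variant)
long_spans = spans(long_read, variant)
print(short_spans, long_spans)
```

The short read sees only a piece of the variant, so the full feature has to be inferred from many fragments; the long read observes it on a single molecule.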

Major platforms - Pacific Biosciences

  • >25 kb reads
  • Low error rates achieved by “circular consensus”
  • Only DNA (RNA used to be available 😢)
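The intuition behind circular consensus can be sketched with a simple majority vote: the same molecule is read several times, and a base is only miscalled if most passes get it wrong. This is a toy model, not the actual consensus algorithm. It assumes errors are independent between passes and that every wrong pass counts against the consensus, and the per-pass error rate is a hypothetical value.

```python
from math import comb


def consensus_error(p, n_passes):
    """Probability that a majority of n_passes passes are wrong at one base.

    Simplifying assumptions: each pass is wrong with probability p,
    passes are independent, and n_passes is odd (no ties).
    """
    k_min = n_passes // 2 + 1  # smallest number of wrong passes that forms a majority
    return sum(
        comb(n_passes, k) * p**k * (1 - p) ** (n_passes - k)
        for k in range(k_min, n_passes + 1)
    )


per_pass_error = 0.10  # hypothetical per-pass error rate, for illustration only
for n in (1, 3, 5, 9):
    print(n, consensus_error(per_pass_error, n))
```

Under these assumptions the consensus error falls quickly as passes are added, which is why repeated reads of one circular molecule can give an accurate long read.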

Major platforms - Oxford Nanopore

  • >150 kb reads
  • Higher error rates (0.1-1%)
  • Detection of modified bases (5mC, RNA modifications)
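Error rates are often reported as Phred quality scores, Q = -10 * log10(error rate). As a quick sketch, the 0.1-1% range quoted above can be converted like this (the `phred` helper name is just for illustration):

```python
import math


def phred(error_rate):
    """Convert a per-base error probability to a Phred quality score."""
    return -10 * math.log10(error_rate)


# The two ends of the error-rate range quoted in the slide.
for rate in (0.001, 0.01):
    print(f"{rate:.1%} error -> Q{phred(rate):.0f}")
```

So an error rate of 0.1% corresponds to about Q30, and 1% to about Q20.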

DNA case study – T2T

The T2T (telomere-to-telomere) consortium used long-read sequencing to assemble the first complete sequence of a human genome, including telomeric and centromeric repeats.

DNA case study - Fiber-seq

Stergachis et al. (2020) Science

RNA case study – splicing patterns

RNA case study – co-transcriptional communication

Applications beyond long-read nucleic acid sequencing

We might be able to sequence proteins soon. Why is this cool?