DNA methylation, primarily 5-methylcytosine (5mC) at CpG dinucleotides, is the most extensively studied epigenetic mark. It regulates gene expression, genomic imprinting, X-chromosome inactivation, and transposable element silencing. For decades, bisulfite conversion followed by short-read sequencing (WGBS) or array hybridization (EPIC/450K) has been the standard approach for methylation profiling. These methods have fundamental limitations: short reads (150–300 bp) cannot resolve which allele a methylation call belongs to, cannot span repetitive regions, and cannot simultaneously detect structural variants, while bisulfite chemistry cannot distinguish 5mC from 5-hydroxymethylcytosine (5hmC) without additional chemical steps.
Long-read DNA methylation sequencing addresses these limitations at the level of individual molecules. Oxford Nanopore Technologies (ONT) and PacBio Single-Molecule Real-Time (SMRT) sequencing detect modified bases directly from native DNA, with no bisulfite conversion, PCR amplification, or chemical damage. Because individual reads span tens of kilobases (ONT reads can exceed 100 kb; PacBio HiFi delivers 15–25 kb), every CpG call is traceable to a specific DNA molecule, a specific haplotype, and, when heterozygous variants are present, a specific parental allele.
At CD Genomics, we offer long-read methylation sequencing on both ONT PromethION and PacBio Revio platforms, with end-to-end support from HMW DNA extraction through haplotype-resolved methylation analysis.
Key Highlights:
DNA methylation at CpG dinucleotides is a fundamental epigenetic mechanism controlling gene expression, genomic imprinting, X-chromosome inactivation, and transposon silencing. For decades, methylation analysis has relied on bisulfite conversion — a chemical treatment that degrades DNA, cannot distinguish 5-methylcytosine (5mC) from 5-hydroxymethylcytosine (5hmC), and yields short reads (150–300 bp) incapable of phasing methylation to individual alleles or spanning repetitive genomic regions.
Long-read sequencing on Oxford Nanopore (ONT) and PacBio platforms eliminates these constraints by detecting modified bases directly from native DNA molecules that routinely span tens of kilobases. This page outlines our end-to-end long-read methylation sequencing service — from HMW DNA requirements and platform selection strategy through bioinformatics deliverables, haplotype-resolved methylation analysis, and published case evidence demonstrating the value of single-molecule, allele-resolved methylation detection.
Both ONT and PacBio long-read platforms detect DNA methylation directly from native double-stranded DNA, without the chemical conversion step that has defined methylation analysis since the 1990s. This matters because bisulfite treatment converts unmethylated cytosine to uracil under harsh acidic conditions that degrade DNA, introduce GC bias, and erase the distinction between 5mC and 5hmC — both read as "C" after bisulfite unless additional chemical steps (oxBS, TAB-seq) are applied.
Native detection avoids all of these artifacts. Both platforms share a common principle — modified bases are identified by their physical signature on individual DNA molecules — but differ in how that signature is read: ONT measures ionic current disruption as DNA passes through a protein pore, while PacBio measures altered polymerase kinetics during synthesis in a zero-mode waveguide. The illustration below compares both detection principles at the molecular level.
In Nanopore sequencing, a DNA molecule passes through a protein nanopore embedded in a membrane. An ionic current flows through the pore; as each base (or modified base) transits the narrowest constriction, it disrupts the current in a characteristic way. 5-Methylcytosine and 5-hydroxymethylcytosine alter the current signal differently from unmodified cytosine — and from each other.
A neural network basecaller (Dorado with the 5mC/5hmC model, or Remora/Megalodon for earlier workflows) analyzes the raw current trace in ~5-mer to ~9-mer context windows, assigning each base a modification probability. The output is a standard BAM file with MM/ML tags specifying per-base modification calls and their confidence scores. Tools like Modkit convert these to methylation frequency bedGraph tracks and phased methylation haplotypes.
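To make the MM/ML encoding concrete, here is a minimal Python sketch that decodes the two tags for a single read. It follows the SAM tag convention (MM lists how many occurrences of the canonical base to skip before each call; ML holds one 8-bit probability per call), but it is a simplified illustration rather than a replacement for Modkit or a BAM library. The function name `parse_mm_ml` and the toy read are hypothetical, and the sketch assumes the sequence is in its original sequenced orientation, with forward-strand entries and a single modification code per entry.

```python
from typing import Dict, List, Tuple


def parse_mm_ml(seq: str, mm: str, ml: List[int]) -> Dict[str, List[Tuple[int, float]]]:
    """Decode MM/ML tags into {mod_code: [(0-based read position, probability), ...]}.

    Simplifying assumptions: '+' strand entries only, one modification
    code per MM entry (e.g. "C+m"), and seq in original read orientation.
    """
    calls: Dict[str, List[Tuple[int, float]]] = {}
    ml_index = 0
    for entry in filter(None, mm.split(";")):
        header, *skips = entry.split(",")
        base, strand = header[0], header[1]
        code = header[2:].rstrip("?.")  # drop the optional '?' or '.' flag
        if strand != "+" or len(code) != 1:
            raise ValueError(f"unsupported MM entry: {header}")
        # Read positions of every occurrence of the canonical base
        base_positions = [i for i, b in enumerate(seq) if b == base]
        cursor = 0
        for skip in map(int, skips):
            cursor += skip  # skip this many occurrences without a call
            position = base_positions[cursor]
            # Each ML byte N represents the probability interval [N/256, (N+1)/256)
            probability = (ml[ml_index] + 0.5) / 256
            calls.setdefault(code, []).append((position, probability))
            ml_index += 1
            cursor += 1
    return calls


# Toy read with CpG cytosines at positions 1, 4 and 7 (position 6 is a non-CpG C)
example = parse_mm_ml("ACGTCGCCG", "C+m?,0,0,1;", [230, 12, 200])
for pos, prob in example["m"]:
    print(pos, round(prob, 3))
```

In practice the tags would be read from the BAM with a library and the calls thresholded or aggregated per CpG; the sketch only shows how the delta-encoded positions and probabilities fit together.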
Key advantages of ONT methylation detection:
PacBio SMRT sequencing detects modified bases through polymerase kinetics. During sequencing, a DNA polymerase incorporates fluorescently labeled nucleotides into a complementary strand in a zero-mode waveguide (ZMW). The time interval between successive incorporations — the inter-pulse duration (IPD) — slows measurably when the polymerase encounters a modified base. 5mC and 5hmC each leave a characteristic IPD and pulse-width (PW) signature.
The PacBio Revio system combines circular consensus sequencing (CCS) with kinetic analysis: the same molecule is read multiple times (typically 10–15 passes for HiFi), producing both a high-accuracy consensus sequence (Q30+) and aggregated kinetic information that reveals modification status. The pb-cpg-tools pipeline calls 5mC at CpG sites genome-wide using a 400-bp context window. In 2025, a CUHK team introduced the HK2 model enabling 5hmC discrimination from PacBio HiFi kinetics.
Key advantages of PacBio HiFi methylation detection:
Both platforms produce accurate, haplotype-resolved methylation calls. The choice between them depends on your specific question — read length, modification type, coverage budget, and whether sequence variant accuracy or throughput is the priority.
| Criterion | Oxford Nanopore (PromethION) | PacBio HiFi (Revio) |
|---|---|---|
| Read length | 50–200+ kb (routine); up to 4 Mb demonstrated | 15–25 kb (HiFi CCS) |
| Base accuracy (sequence) | Q20+ (99%+) with latest chemistry and models | Q30+ (99.9%+) HiFi consensus |
| 5mC detection | Yes — raw current signal (Dorado 5mC model) | Yes — IPD kinetics (pb-cpg-tools) |
| 5hmC detection | Yes — discriminated from 5mC by current signal | Yes — HK2 model (2025); previously challenging |
| 6mA detection | Yes — directly from signal | Yes — via Fiber-seq (EcoGII labeling) or SMRT kinetics |
| 4mC detection | Yes | Yes — prokaryotic contexts |
| Coverage recommendation (human WGS) | 15–30× for DMR detection; 5× sufficient for population-scale methylation surveys | 15–30× for DMR detection + SV/SNV integration |
| Targeted enrichment | Adaptive sampling (ReadUntil) — software-defined, no extra library prep | Probe-based capture (myBaits) or amplicon-based |
| Throughput per flow cell | PromethION: ~150–200 Gb per flow cell (~1.5–2 human genomes at 30×, assuming a ~3.1 Gb genome) | Revio: ~90–120 Gb per SMRT Cell 25M (~2–3 human genomes at 15×) |
| Cost per sample (human WGS) | $800–1,500 at 30× | $1,500–3,000 at 15–30× |
| Best for | Ultra-long phasing; imprinting and centromeric studies; targeted adaptive sampling; population surveys at moderate coverage; labs needing 5hmC discrimination on the same platform | High-accuracy variant + methylation integration; de novo assembly + methylation; labs requiring gold-standard SNV calling alongside methylation; prokaryotic methylome analysis |
Selection Strategy:
The defining advantage of long-read methylation sequencing is the ability to assign every methylation call to a specific DNA molecule — and therefore to a specific haplotype or parental allele. Short-read bisulfite sequencing produces a population-average methylation level at each CpG (beta value = M/(M+U)), but cannot tell you whether 50% methylation means both alleles are half-methylated or one allele is fully methylated and the other is fully unmethylated. Biologically, these are entirely different regulatory states.
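The ambiguity of a 50% beta value can be shown with a few lines of Python. This is a toy sketch with made-up read counts at a single CpG; the function names and the `(haplotype, is_methylated)` read representation are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

Read = Tuple[int, bool]  # (haplotype 1 or 2, is the CpG methylated on this read?)


def bulk_beta(reads: List[Read]) -> float:
    """Population-average methylation: beta = M / (M + U)."""
    methylated = sum(1 for _, meth in reads if meth)
    return methylated / len(reads)


def per_haplotype_beta(reads: List[Read]) -> Dict[int, float]:
    """The same beta value, computed separately for each haplotype."""
    result = {}
    for hap in (1, 2):
        hap_reads = [r for r in reads if r[0] == hap]
        result[hap] = bulk_beta(hap_reads)
    return result


# Scenario A: one allele fully methylated, the other fully unmethylated
allele_specific = [(1, True)] * 10 + [(2, False)] * 10
# Scenario B: both alleles half-methylated
mixed = [(1, True)] * 5 + [(1, False)] * 5 + [(2, True)] * 5 + [(2, False)] * 5

print(bulk_beta(allele_specific), per_haplotype_beta(allele_specific))
print(bulk_beta(mixed), per_haplotype_beta(mixed))
```

Both scenarios give the same bulk beta of 0.5, yet the per-haplotype values differ (1.0 and 0.0 versus 0.5 and 0.5), which is the distinction that read-level haplotype assignment recovers.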
Long reads that span heterozygous variants (SNVs or indels) carry both the variant allele and the methylation status of every CpG on that same molecule. Phasing tools such as MethPhaser (ONT), HiPhase (PacBio), or WhatsHap use these variant-phased reads to sort methylation calls into two haplotypes, producing a genome-wide haplotype-resolved methylome.
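The idea of sorting methylation calls by haplotype can be sketched as follows. This is a deliberately simplified majority-vote illustration, not how the phasing tools named above work internally; the SNV positions, alleles, read dictionaries, and function names are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical phased heterozygous SNVs: position -> (haplotype 1 allele, haplotype 2 allele)
PHASED_SNVS: Dict[int, Tuple[str, str]] = {1000: ("A", "G"), 5200: ("C", "T")}


def assign_haplotype(read_alleles: Dict[int, str],
                     phased: Dict[int, Tuple[str, str]] = PHASED_SNVS) -> Optional[int]:
    """Return 1 or 2 by majority vote over the SNVs a read covers, or None if undecided."""
    votes: Counter = Counter()
    for pos, allele in read_alleles.items():
        if pos in phased:
            hap1_allele, hap2_allele = phased[pos]
            if allele == hap1_allele:
                votes[1] += 1
            elif allele == hap2_allele:
                votes[2] += 1
    if votes[1] == votes[2]:
        return None  # no informative SNV on this read, or a tie
    return 1 if votes[1] > votes[2] else 2


def haplotype_methylation(reads: List[dict]) -> Dict[int, Dict[int, float]]:
    """Aggregate the per-CpG methylated fraction separately for each haplotype."""
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # hap -> CpG -> [meth, total]
    for read in reads:
        hap = assign_haplotype(read["alleles"])
        if hap is None:
            continue  # unphased reads are left out of the haplotype tracks
        for cpg_pos, is_meth in read["cpgs"].items():
            counts[hap][cpg_pos][0] += int(is_meth)
            counts[hap][cpg_pos][1] += 1
    return {hap: {pos: m / t for pos, (m, t) in sites.items()}
            for hap, sites in counts.items()}


toy_reads = [
    {"alleles": {1000: "A", 5200: "C"}, "cpgs": {3000: True, 3050: True}},
    {"alleles": {1000: "A"}, "cpgs": {3000: True, 3050: False}},
    {"alleles": {1000: "G", 5200: "T"}, "cpgs": {3000: False, 3050: False}},
    {"alleles": {5200: "T"}, "cpgs": {3000: False, 3050: False}},
    {"alleles": {}, "cpgs": {3000: True}},  # covers no heterozygous SNV
]
phased_methylation = haplotype_methylation(toy_reads)
print(phased_methylation)
```

The last toy read carries a methylation call but no informative variant, so it cannot be assigned; this is why read length and heterozygous variant density together determine how much of the methylome can be phased.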
The biological insights enabled by haplotype-resolved methylation include:
Beyond allele-level phasing, long reads reveal co-methylation patterns — whether adjacent CpGs on the same molecule tend to be methylated together or independently. Metrics such as single-molecule methylation entropy, Proportion of Discordant Reads (PDR), and Methylation Haplotype Load (MHL) quantify these patterns, which reflect chromatin state heterogeneity within a cell population. A 2023 Communications Biology study by Magi et al. demonstrated that stochastic co-methylation disruption at CpG-poor regions — not genetic mutation — drives chemotherapy resistance in relapsed AML, an insight inaccessible to short-read methylation analysis.
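Two of these metrics are simple enough to sketch directly. The Python below assumes each read is summarized as a string of `1` (methylated) and `0` (unmethylated) over the same window of CpGs, treats a read as discordant when it mixes both states, and normalizes Shannon entropy of the observed patterns by the number of CpGs. The four-CpG minimum, the function names, and the toy reads are assumptions for illustration; MHL is omitted because it needs a longer definition.

```python
import math
from collections import Counter
from typing import List


def pdr(reads: List[str], min_cpgs: int = 4) -> float:
    """Proportion of Discordant Reads among reads covering at least min_cpgs CpGs."""
    eligible = [r for r in reads if len(r) >= min_cpgs]
    if not eligible:
        raise ValueError("no read covers enough CpGs")
    discordant = sum(1 for r in eligible if len(set(r)) > 1)
    return discordant / len(eligible)


def methylation_entropy(reads: List[str]) -> float:
    """Shannon entropy (bits) of read-level patterns, divided by the CpG count."""
    n_cpgs = len(reads[0])
    if any(len(r) != n_cpgs for r in reads):
        raise ValueError("all reads must cover the same CpG window")
    total = len(reads)
    entropy = -sum((c / total) * math.log2(c / total) for c in Counter(reads).values())
    return entropy / n_cpgs


# Both toy loci are 50% methylated in bulk, but organized very differently
ordered = ["1111"] * 5 + ["0000"] * 5
disordered = ["1010", "0110", "1100", "0011", "1001",
              "0101", "1110", "0001", "1011", "0100"]

print(pdr(ordered), round(methylation_entropy(ordered), 3))
print(pdr(disordered), round(methylation_entropy(disordered), 3))
```

The ordered locus has no discordant reads and low entropy, while the disordered locus has the same average methylation but every read is discordant. Averaged short-read data would report both as 50%.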
Our long-read methylation sequencing service follows a standardized workflow with QC checkpoints at each stage, from HMW DNA receipt through methylation analysis and data delivery.
1. Sample Receipt and HMW DNA QC — QC Checkpoint: DNA quantified by fluorometry (Qubit); purity verified by spectrophotometry (OD260/280 = 1.8–2.0, OD260/230 ≥ 2.0); fragment size distribution assessed by pulsed-field gel electrophoresis or FEMTO Pulse; samples failing size or purity thresholds are flagged and the client is contacted before proceeding.
2. Library Preparation — ONT: DNA end-repair, A-tailing, and motor protein adapter ligation; QC Checkpoint: library yield and fragment size. PacBio: DNA shearing to target size, SMRTbell adapter ligation; QC Checkpoint: library concentration, SMRTbell integrity, and binding efficiency. Platform selection is confirmed with the client at this stage.
3. Sequencing — ONT PromethION flow cells or PacBio Revio SMRT Cells 25M. Run metrics monitored in real time: ONT pore occupancy, PacBio ZMW loading and P1 yield. QC Checkpoint: real-time throughput tracking; run extended or repeated if coverage target is not reached.
4. Base Modification Calling — ONT: raw POD5 signal processed through Dorado basecaller with 5mC/5hmC model, outputting aligned BAM with MM/ML modification tags. PacBio: CCS analysis followed by pb-cpg-tools for per-CpG methylation frequency calling. QC Checkpoint: modification call confidence score distribution; alignment rate; coverage uniformity assessment.
5. Methylation Analysis and Data Delivery — Methylation frequency tracks generated (bedGraph/BigWig); DMRs and allele-specific methylation called; haplotype phasing performed; deliverables packaged and reviewed. QC Checkpoint: deliverable completeness checklist; internal peer review of DMR calls and QC metrics before client release.
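As an illustration of the methylation frequency track produced in step 5, the following Python sketch turns per-CpG read counts into bedGraph lines (chromosome, 0-based start, end, value). The input tuple layout, the function name `to_bedgraph`, and the minimum-coverage filter of 5 reads are assumptions for the example, not parameters of our production pipeline.

```python
from typing import Iterable, List, Tuple

# (chromosome, 0-based CpG position, methylated reads, total reads)
SiteCounts = Tuple[str, int, int, int]


def to_bedgraph(sites: Iterable[SiteCounts], min_coverage: int = 5) -> List[str]:
    """Format per-CpG methylation fractions as tab-separated bedGraph lines."""
    lines = []
    for chrom, pos, n_meth, n_total in sorted(sites):
        if n_total < min_coverage:
            continue  # too few reads for a stable frequency estimate
        fraction = n_meth / n_total
        lines.append(f"{chrom}\t{pos}\t{pos + 1}\t{fraction:.3f}")
    return lines


toy_sites = [("chr1", 100, 9, 10), ("chr1", 50, 1, 10), ("chr1", 200, 2, 3)]
for line in to_bedgraph(toy_sites):
    print(line)
```

Running the same function on reads split by haplotype would give the two-track (Hap1/Hap2) form of the output.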
Long-read methylation sequencing can be applied at three scales depending on your research question and sequencing budget.
Comprehensive 5mC profiling across the entire genome at 15–30× coverage. Simultaneously detects SNVs, SVs, and methylation from the same reads. Recommended when the full methylation landscape is unknown, when repetitive and intergenic regions matter, or when multi-omics integration is the goal. Available on both ONT and PacBio platforms.
Restriction enzyme-based reduced representation methylation sequencing on the ONT platform, targeting ~310 Mb of CpG islands, promoters, and gene bodies. Substantially lower cost per sample than WGS while retaining long-read phasing at regulatory regions. Suited for population-scale studies, screening applications, and projects where gene-regulatory methylation is the primary interest.
Enrichment for specific genomic loci — disease-associated genes, imprinting centers, repeat expansion loci (C9orf72, FMR1, HTT), or custom panels. ONT adaptive sampling (ReadUntil) enables software-defined enrichment without additional library preparation. PacBio targeted methylation uses probe-based capture (myBaits). Ideal for clinical research applications where depth at specific loci is prioritized over genome-wide coverage.
Long-read methylation sequencing requires high-molecular-weight (HMW) DNA. The read length you obtain — and therefore the phasing distance you can achieve — is directly limited by input DNA fragment length.
| Requirement | ONT PromethION | PacBio Revio |
|---|---|---|
| Input DNA amount | 1–3 μg HMW DNA (standard WGS); ~500 ng for low-input protocols | 3–5 μg HMW DNA (standard); ~1 μg for low-input |
| DNA fragment size | >30 kb recommended; >50 kb for optimal phasing; >100 kb for ultra-long | >20 kb recommended; >30 kb for optimal HiFi library yield |
| Purity | OD260/280 = 1.8–2.0; OD260/230 ≥ 2.0; no residual phenol, ethanol, or chelating agents | Same as ONT |
| Sample types accepted | Fresh or frozen tissue; cultured cells; blood (buffy coat or PBMCs); flash-frozen biopsies. FFPE DNA is generally not suitable for long-read sequencing due to fragmentation. | Same as ONT |
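If you want to pre-screen samples before shipping, the purity and fragment-size thresholds above can be encoded as a simple check. The sketch below is a hypothetical helper, not our intake software: the class and function names are invented, and it uses the "recommended" minimum fragment sizes from the table (30 kb for ONT, 20 kb for PacBio).

```python
from dataclasses import dataclass
from typing import List

# "Recommended" minimum fragment sizes from the requirements table, in kb
MIN_FRAGMENT_KB = {"ont": 30.0, "pacbio": 20.0}


@dataclass
class DnaQc:
    od260_280: float
    od260_230: float
    fragment_kb: float  # typical fragment size from PFGE or a comparable assay


def qc_flags(sample: DnaQc, platform: str) -> List[str]:
    """Return a list of threshold violations; an empty list means the sample passes."""
    flags = []
    if not 1.8 <= sample.od260_280 <= 2.0:
        flags.append("OD260/280 outside 1.8-2.0")
    if sample.od260_230 < 2.0:
        flags.append("OD260/230 below 2.0")
    if sample.fragment_kb < MIN_FRAGMENT_KB[platform]:
        flags.append(f"fragments shorter than {MIN_FRAGMENT_KB[platform]:.0f} kb")
    return flags


good = DnaQc(od260_280=1.85, od260_230=2.1, fragment_kb=55.0)
sheared = DnaQc(od260_280=1.65, od260_230=2.2, fragment_kb=25.0)
print(qc_flags(good, "ont"))
print(qc_flags(sheared, "ont"))
print(qc_flags(sheared, "pacbio"))
```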
Coverage Recommendations by Application:
| Application | Recommended Coverage | Notes |
|---|---|---|
| Population-scale methylation survey | 5× | Sufficient for global methylation pattern comparison; phasing possible at heterozygous loci |
| CpG island / DMR detection | 10–15× | Reliable per-CpG methylation frequency estimates |
| Haplotype-resolved methylation + SV calling | 15–30× | Required for confident allele-specific methylation calling at individual heterozygous sites |
| Clinical research (pathogenic hypermethylation detection) | 20–30× | Necessary to call rare hypermethylation outliers with high sensitivity, per Cheung et al. (2023) Nature Communications |
| De novo assembly + methylation | 30–60× (HiFi) | Required for telomere-to-telomere reference-grade epigenomes |
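Coverage targets translate into sequencing yield by simple arithmetic: required yield is roughly coverage multiplied by genome size. The sketch below assumes a haploid human genome of about 3.1 Gb and uses a hypothetical round figure of 100 Gb of usable yield per flow cell; substitute the yield quoted or observed for your platform and sample type.

```python
import math

HUMAN_GENOME_GB = 3.1  # assumption: approximate haploid human genome size in Gb


def required_yield_gb(coverage: float, genome_gb: float = HUMAN_GENOME_GB) -> float:
    """Sequencing yield (Gb) needed to reach a mean coverage."""
    return coverage * genome_gb


def flow_cells_needed(coverage: float, yield_per_cell_gb: float,
                      genome_gb: float = HUMAN_GENOME_GB) -> int:
    """Whole flow cells needed for one sample at the given coverage."""
    return math.ceil(required_yield_gb(coverage, genome_gb) / yield_per_cell_gb)


for target in (5, 15, 30, 60):
    needed = required_yield_gb(target)
    cells = flow_cells_needed(target, yield_per_cell_gb=100.0)  # hypothetical yield
    print(f"{target}x -> {needed:.1f} Gb, {cells} flow cell(s) at 100 Gb each")
```

The calculation ignores reads lost to filtering and alignment, so real projects usually budget some margin above the computed yield.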
Shipping: HMW DNA in TE buffer (pH 8.0) or EB, shipped on cold packs or dry ice. Avoid vortexing, repeated pipetting, and freeze-thaw cycles — HMW DNA is shear-sensitive. Tissue and cell samples shipped on dry ice; contact us for tissue-specific preservation recommendations.
Long-read methylation data analysis involves raw signal processing, base modification calling, alignment, methylation frequency quantification, DMR detection, and phasing. Our pipeline uses community-standard tools with fully documented parameters and versions.
Standard Deliverables:
| Deliverable | Description |
|---|---|
| Raw signal data | POD5 (ONT) or BAM with kinetics tags (PacBio); basecalled FASTQ files |
| Aligned reads with modification tags | BAM files with MM/ML tags encoding per-base methylation probabilities |
| Methylation frequency tracks | bedGraph or BigWig files — per-CpG methylation fraction genome-wide |
| Haplotype-phased methylation calls | Two-track methylation bedGraph (Hap1/Hap2) or MethPhaser/HiPhase phased output |
| Methylation QC report | Per-read and per-site methylation statistics; coverage distribution; modification call confidence score distribution |
| Differentially methylated region (DMR) list | Statistically significant DMRs between condition groups with genomic annotation |
| Allele-specific methylation (ASM) table | Heterozygous SNP sites with haplotype-specific methylation ratios and statistical confidence |
| Functional enrichment analysis | GO/KEGG enrichment of genes associated with DMRs or ASM sites |
| Genome browser compatible tracks | BigWig files for IGV/UCSC visualization; phased BAM files with haplotype tags |
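One common way to attach statistical confidence to an allele-specific methylation call is a two-sided Fisher's exact test on the 2×2 table of methylated and unmethylated read counts per haplotype. The sketch below is an assumption about how such a test could be computed, shown for illustration only; it is not a statement of the exact statistic used in our ASM table, and the function name and read counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(hap1_meth: int, hap1_unmeth: int,
                           hap2_meth: int, hap2_unmeth: int) -> float:
    """Two-sided Fisher's exact p-value for a 2x2 haplotype-by-methylation table."""
    row1 = hap1_meth + hap1_unmeth
    row2 = hap2_meth + hap2_unmeth
    col1 = hap1_meth + hap2_meth
    total = row1 + row2

    def table_probability(x: int) -> float:
        # Hypergeometric probability of x methylated reads on haplotype 1
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_probability(hap1_meth)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one
    return sum(p for p in map(table_probability, range(lowest, highest + 1))
               if p <= observed * (1 + 1e-7))


# Fully allele-specific site versus a site with identical haplotypes
print(fisher_exact_two_sided(10, 0, 0, 10))
print(fisher_exact_two_sided(5, 5, 5, 5))
```

Because many heterozygous sites are tested genome-wide, p-values from a test like this would normally be corrected for multiple testing before sites are reported.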
Optional Advanced Analysis:
The composite image below illustrates the data types delivered with each long-read methylation project.
Haplotype-Resolved Methylation and Single-Molecule Views:
Differential Analysis and Multi-Omics Integration:
All demo results are generated from representative datasets and reflect the standard analysis depth delivered with each project.
Long-read DNA methylation sequencing supports the study of promoter hypermethylation, allele-specific methylation, loss of heterozygosity, structural variation, and tumor epigenome heterogeneity. By linking methylation signals with SNVs, SVs, copy-number changes, and haplotypes on the same DNA molecule, it helps researchers explore cancer-associated regulatory mechanisms and candidate epigenetic biomarkers.
For disease-focused research cohorts, long-read methylation sequencing can be used to investigate hypermethylation outliers, allele-specific methylation events, imprinting-related changes, and non-coding regulatory mechanisms. This approach is especially useful when researchers need to connect methylation patterns with nearby sequence variants, structural variants, or haplotype backgrounds.
Long-read sequencing can span repeat-rich and GC-rich regions that are difficult to resolve with short-read methods, such as loci related to C9orf72, FMR1, HTT, and other repeat-containing genes. It enables simultaneous analysis of repeat structure, methylation status, and flanking haplotypes for neurological disease mechanism studies.
Long-read methylation sequencing enables phased analysis of imprinted regions and other allele-specific regulatory elements. It supports research into parent-of-origin methylation patterns, imprinting control regions, X-chromosome inactivation, uniparental disomy-related mechanisms, and haplotype-resolved epigenetic regulation.
Long reads expand methylation analysis into repetitive and structurally complex genomic regions, including segmental duplications, centromeric satellites, acrocentric chromosome regions, and other regions that are difficult to map with short reads. This makes the method suitable for population-scale methylation surveys, pangenome-related epigenomic research, and studies of methylation diversity across individuals, populations, or species.
In plants and agricultural species, long-read methylation sequencing supports the analysis of CpG, CHG, and CHH methylation across genes, transposable elements, and repetitive regions. It can be applied to crop epigenome mapping, stress-response studies, transposon regulation research, allele-specific methylation analysis, and epigenome-wide association studies.
Background
Cancer genomes harbor a complex interplay of somatic mutations, copy-number alterations, structural variants (SVs), and DNA methylation changes — but short-read sequencing cannot resolve these features simultaneously on the same DNA molecule. O'Neill et al. set out to determine whether Oxford Nanopore long-read whole-genome sequencing could jointly characterize SVs, haplotype phasing, and native DNA methylation in a large, diverse cancer cohort, and whether this integrated view would reveal clinically relevant epigenetic alterations missed by standard diagnostic sequencing.
Methods
ONT PromethION WGS was performed on 189 tumor and 41 matched normal samples representing 26 cancer types. HMW DNA was extracted from fresh-frozen tissue. Coverage ranged from 10–60× per sample. Methylation was called using Megalodon/Dorado from raw nanopore current signal. Somatic SVs were called and phased jointly with germline variants using WhatsHap and custom phasing pipelines. Methylation was phased to the two haplotypes where heterozygous SNPs were available to distinguish alleles. Allelically differentially methylated regions (aDMRs) were identified genome-wide and screened for recurrence across cancer types.
Results
The study identified ~4.46 million allele-specific differentially methylated regions (aDMRs) across the cohort — methylation differences between the two alleles at the same locus — including recurrent aDMRs at RET and CDKN2A. Somatic BRCA1 and RAD51C promoter hypermethylation were identified as likely drivers of homologous recombination deficiency in tumors lacking coding mutations in these genes, and germline MLH1 promoter hypermethylation was directly observed in a Lynch syndrome patient. These epigenetic driver events were invisible to standard panel-based and short-read WGS diagnostics. The phased methylation data also enabled LOH-aware methylation analysis — distinguishing true biallelic hypermethylation from apparent hypermethylation caused by loss of the unmethylated allele.
Conclusion
This study establishes ONT long-read WGS as a single-assay platform capable of simultaneously detecting SNVs, SVs, copy-number changes, and native DNA methylation with haplotype-level resolution — and demonstrates that this integrated view identifies clinically actionable epigenetic alterations missed by conventional cancer diagnostics. For cancer researchers and clinical genomics programs, long-read methylation fills the gap between sequence-level and epigenome-level driver discovery.