DNA methylation and genetic variation are two fundamental layers of genomic information, yet they have traditionally required separate assays — WGBS (or EM-seq) for methylation and WGS for variants. Running parallel workflows doubles the cost, consumes twice the precious sample, and makes it impossible to phase methylation status to heterozygous variants on the same DNA molecule.
Illumina 5-base sequencing changes this paradigm. Using enzymatic 5mC→T conversion chemistry, 5-base reads each methylated cytosine (5mC) as thymine and each unmethylated cytosine as cytosine, producing a true five-base readout (A, T, G, C, and 5mC) without bisulfite damage. From a single library preparation and one sequencing run, you obtain both genome-wide methylation profiles and high-confidence variant calls (SNV, Indel, CNV, SV) on the same DNA molecules.
The ability to detect both DNA methylation and genetic variation from a single DNA sample represents a major advance in genomic analysis. In cancer liquid biopsy, for instance, circulating tumor DNA (ctDNA) carries both methylation signatures (tissue-of-origin, early detection) and somatic mutations (clonal hematopoiesis, treatment response). Analyzing these two signal types from separate aliquots of the same cfDNA sample divides an already limited material and introduces technical variability between assays. The problem is amplified with FFPE samples, where DNA is fragmented and damaged before any processing begins.
5-base sequencing solves this through a fundamentally different approach to cytosine conversion. Sodium bisulfite converts unmethylated C to uracil while leaving 5mC intact; the harsh treatment destroys >80% of the DNA and depletes cytosine from the library, confounding methylation and variant signals. 5-base instead employs a targeted enzymatic reaction that selectively converts 5-methylcytosine (5mC) to thymine while leaving unmethylated cytosine unchanged. Because only the methylated cytosines change, the sequencing library retains near-complete nucleotide diversity (A, T, G, C), enabling simultaneous methylation quantification and accurate variant detection from the same reads.
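To make the contrast between the two chemistries concrete, the sketch below applies each conversion rule to a short made-up sequence. The function name `convert_read`, the example sequence, and the methylated positions are illustrative assumptions, and only the forward strand is modeled.

```python
def convert_read(seq: str, methylated: set, chemistry: str) -> str:
    """Return the sequence a read would show after conversion.

    seq        -- original forward-strand sequence (A/C/G/T)
    methylated -- 0-based positions of cytosines carrying 5mC
    chemistry  -- "bisulfite" or "5base"
    """
    if chemistry not in ("bisulfite", "5base"):
        raise ValueError(f"unknown chemistry: {chemistry}")
    out = []
    for i, base in enumerate(seq):
        if base != "C":
            out.append(base)  # A, G, T are untouched by either chemistry
        elif chemistry == "bisulfite":
            # Unmethylated C -> U (read as T); 5mC is protected and stays C.
            out.append("C" if i in methylated else "T")
        else:
            # 5-base: 5mC -> T; unmethylated C stays C.
            out.append("T" if i in methylated else "C")
    return "".join(out)


# Hypothetical example: cytosines at positions 1, 4, 5, 9; 5mC at 1 and 9.
original = "ACGTCCGATCGA"
methylated_positions = {1, 9}

bisulfite_read = convert_read(original, methylated_positions, "bisulfite")
five_base_read = convert_read(original, methylated_positions, "5base")

print(original)        # original sequence
print(bisulfite_read)  # most C positions now read as T
print(five_base_read)  # only the two methylated positions changed
```

In this toy case the bisulfite read keeps only the two methylated cytosines as C, while the 5-base read differs from the original at just those two positions, which is why the rest of the sequence remains usable for variant calling.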
Our service provides end-to-end support: sample QC and library construction using the validated 5-base enzymatic conversion protocol, Illumina sequencing at your required depth, and dual-omic bioinformatics analysis through the DRAGEN pipeline. Deliverables include CpG methylation calls (beta values, coverage tracks), whole-genome variant calls (SNV, Indel, CNV, SV), and optional fragmentomic analysis for cfDNA applications.
5-base sequencing achieves dual-omic detection in three stages: enzymatic conversion, standard Illumina sequencing, and computational separation of the two signals. Together these convert the methylation signal into a sequencing-readable form while preserving the underlying genetic sequence.
1) Enzymatic 5mC→T conversion — A proprietary enzymatic cocktail specifically oxidizes and converts 5-methylcytosine (5mC) to thymine through a series of controlled reactions. The key distinction from bisulfite conversion is specificity: unmethylated cytosines are not converted. This means the original C nucleotides in the genome remain as C in the sequencing read, preserving the full A/T/G/C nucleotide diversity needed for accurate variant calling. The reaction is gentle — it does not fragment DNA or cause the massive degradation (>80%) seen with bisulfite treatment.
2) 5-base sequencing — The converted library is sequenced on an Illumina platform, producing reads where each position reports one of five states: A, T, G, C, or methylated C (read as T). The sequencing chemistry and data output are identical to standard Illumina sequencing — no special protocols or base calling modifications are required. The “5-base” name reflects that the system effectively distinguishes five nucleotide states from the same DNA molecule.
3) DRAGEN dual-omic separation — After sequencing, the DRAGEN bioinformatics pipeline separates the methylation and variant signals from the same raw reads. Methylation is quantified at CpG sites by comparing converted (T) vs unconverted (C) reads at each cytosine position, producing per-site beta values. Variants are called by aligning reads to the reference genome and identifying positions where the base in any read differs from the reference, with the methylation status of each allele tracked independently.
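As a rough illustration of the methylation quantification step, a per-site beta value can be computed from the C and T read counts at a CpG. This is a minimal sketch and not the DRAGEN implementation: the function name `beta_5base`, the `min_depth` default, and the site coordinates are assumptions, and the sketch ignores strand handling and genuine C>T variants, which a real pipeline has to separate from methylation.

```python
from typing import Optional


def beta_5base(c_reads: int, t_reads: int, min_depth: int = 10) -> Optional[float]:
    """Methylation beta value at one CpG under 5-base chemistry.

    In 5-base data a methylated cytosine reads as T and an unmethylated one
    as C, so beta = T / (C + T). (In bisulfite data the roles are reversed.)
    Returns None when depth is below min_depth.
    """
    if c_reads < 0 or t_reads < 0:
        raise ValueError("read counts must be non-negative")
    depth = c_reads + t_reads
    if depth < min_depth:
        return None
    return t_reads / depth


# Hypothetical (C, T) read counts at three CpG sites.
site_counts = {
    "chr1:10468": (3, 9),   # mostly methylated
    "chr1:10470": (12, 0),  # unmethylated
    "chr1:10483": (2, 1),   # too shallow to call
}
betas = {site: beta_5base(c, t) for site, (c, t) in site_counts.items()}
for site, beta in betas.items():
    print(site, beta)
```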
5-base sequencing can be configured for different research goals, sample types, and budgets.
| Service Option | Coverage | Sample Input | Applications |
|---|---|---|---|
| 5-Base WGS 30× | 30× whole genome | 50–100 ng gDNA / 10–20 ng cfDNA | Comprehensive methylation + variant profiling; cancer genomics; rare disease; allele-specific methylation |
| 5-Base WGS 15× | 15× whole genome | 50–100 ng gDNA / 10–20 ng cfDNA | Methylation profiling with adequate variant detection; population studies; cost-sensitive projects |
| 5-Base Low-Pass WGS | 5–10× | 1–10 ng cfDNA | cfDNA methylation fragmentomics; liquid biopsy biomarker discovery; MRD assay development |
| 5-Base Targeted Enrichment | 500×+ (target) | 50–100 ng gDNA | Focused methylation + variant analysis at specific gene panels or genomic regions |
| 5-Base FFPE Protocol | Custom | 50–100 ng FFPE DNA | Clinical FFPE samples requiring both methylation and mutation data from degraded DNA |
For most applications, we recommend 5-Base WGS 30× as the starting point — it provides the depth needed for robust CpG methylation quantification (≥10× per CpG) while delivering standard WGS-grade variant detection across the genome.
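The link between mean depth and the ≥10× per-CpG target can be approximated with a simple Poisson model. This is an idealized sketch: real coverage is more variable than Poisson (GC bias, mappability, duplicates), so actual fractions will be lower, and the function name `fraction_at_least` is an assumption.

```python
import math


def fraction_at_least(mean_depth: float, min_depth: int) -> float:
    """P(X >= min_depth) for X ~ Poisson(mean_depth)."""
    below = sum(
        math.exp(-mean_depth) * mean_depth ** k / math.factorial(k)
        for k in range(min_depth)
    )
    return 1.0 - below


# Expected share of sites reaching 10x at the depths offered above.
for depth in (5, 15, 30):
    print(f"{depth}x mean depth: {fraction_at_least(depth, 10):.4f} of sites at >=10x")
```

Under this model nearly all sites reach 10× at a 30× mean, roughly 93% do at 15×, and only a few percent do at 5×, which is consistent with reserving low-pass designs for region-level or fragmentomic analysis rather than per-CpG calls.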
Our 5-base sequencing service follows a streamlined workflow with QC checkpoints at each stage.
1. Sample QC and DNA Input — QC Checkpoint: DNA quantity assessed by fluorometry (Qubit), quality assessed by spectrophotometry (OD260/280: 1.8–2.0) and gel electrophoresis or Fragment Analyzer. For cfDNA: fragment size distribution verified (predominant peak at ~166 bp). For FFPE: DNA integrity assessed. Minimum input verified against selected service option.
2. Enzymatic 5mC→T Conversion — DNA undergoes the proprietary enzymatic conversion reaction. QC Checkpoint: conversion efficiency verified using spike-in controls (fully methylated and unmethylated lambda DNA). Target: >97% 5mC conversion, <1% unmethylated C conversion.
3. Library Construction — End repair, A-tailing, and adapter ligation with unique molecular indexes. Low-cycle PCR amplification to maintain library complexity. QC Checkpoint: library yield measured by Qubit and fragment size distribution by Bioanalyzer or TapeStation.
4. Illumina Sequencing — Sequencing on Illumina platform (NovaSeq 6000, NovaSeq X, or NextSeq 2000) at the selected depth. QC Checkpoint: Q30 ≥ 85%, cluster density within optimal range, phasing/pre-phasing within specifications.
5. DRAGEN Dual-Omic Analysis — Integrated bioinformatics pipeline: methylation quantification (CpG beta values, coverage tracks, differentially methylated regions) + variant calling (SNV, Indel, CNV, SV) + optional fragmentomic analysis (fragment size profile, WPS, end-motif signatures). QC Checkpoint: alignment rate, duplication rate, transition/transversion ratio, CpG coverage distribution.
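The conversion checkpoint in step 2 can be sketched as a calculation on spike-in control reads. The thresholds (>97% 5mC conversion, <1% unmethylated C conversion) come from the workflow above; the `SpikeInCounts` type, the function names, and the read counts are hypothetical and do not describe the lab's actual QC software.

```python
from dataclasses import dataclass


@dataclass
class SpikeInCounts:
    """C and T read counts summed over cytosine positions of one control."""
    c_reads: int
    t_reads: int


def conversion_rate(counts: SpikeInCounts) -> float:
    """Fraction of cytosine positions read as T."""
    total = counts.c_reads + counts.t_reads
    if total == 0:
        raise ValueError("no reads on control")
    return counts.t_reads / total


def conversion_qc(methylated_ctrl: SpikeInCounts,
                  unmethylated_ctrl: SpikeInCounts,
                  min_5mc_rate: float = 0.97,
                  max_unmeth_rate: float = 0.01) -> dict:
    """Check both spike-in controls against the stated targets."""
    on_target = conversion_rate(methylated_ctrl)      # should be high
    off_target = conversion_rate(unmethylated_ctrl)   # should be near zero
    return {
        "5mC_conversion": on_target,
        "unmethylated_conversion": off_target,
        "pass": on_target > min_5mc_rate and off_target < max_unmeth_rate,
    }


# Hypothetical counts from fully methylated and unmethylated lambda DNA.
result = conversion_qc(SpikeInCounts(c_reads=20, t_reads=980),
                       SpikeInCounts(c_reads=995, t_reads=5))
print(result)
```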
| Requirement | Details |
|---|---|
| Genomic DNA | ≥ 50 ng (recommended 100 ng; concentration ≥ 10 ng/µL; OD260/280: 1.8–2.0) |
| cfDNA | ≥ 1 ng (recommended 10–20 ng; predominant fragment peak at ~166 bp) |
| FFPE DNA | ≥ 50 ng (recommended 100 ng; DV200 > 30%) |
| Sample volume | ≥ 10 µL (DNA in low-EDTA TE buffer or nuclease-free water) |
| Sample transport | 1.5 mL LoBind tube, sealed with parafilm; ice packs for short-term (2–8°C); dry ice for long-distance |
| Sample storage | DNA: −20°C (long-term) or 2–8°C (short-term). cfDNA: −20°C. Avoid repeated freeze-thaw cycles |
Note: Sample quality directly impacts 5-base sequencing performance. Degraded DNA (low DV200 for FFPE, significant smearing for gDNA) reduces library complexity and CpG coverage. Samples failing QC thresholds will be flagged and the client contacted before proceeding. For cfDNA, we recommend blood collection in Cell-Free DNA BCT tubes (Streck) or EDTA tubes processed within 2 hours.
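For teams pre-screening samples before shipment, the requirements table can be expressed as a small check. The thresholds are taken from the table above; the function name `intake_flags`, its parameters, and the flag wording are illustrative assumptions and not part of the service's own QC system.

```python
from typing import List, Optional

# Minimum input mass per sample type, from the requirements table (ng).
MIN_INPUT_NG = {"gDNA": 50.0, "cfDNA": 1.0, "FFPE": 50.0}
MIN_VOLUME_UL = 10.0


def intake_flags(sample_type: str,
                 mass_ng: float,
                 volume_ul: float,
                 conc_ng_per_ul: Optional[float] = None,
                 od260_280: Optional[float] = None,
                 dv200_pct: Optional[float] = None) -> List[str]:
    """Return a list of requirement violations (empty list means none found)."""
    if sample_type not in MIN_INPUT_NG:
        raise ValueError(f"unknown sample type: {sample_type}")
    flags = []
    if mass_ng < MIN_INPUT_NG[sample_type]:
        flags.append("input mass below minimum")
    if volume_ul < MIN_VOLUME_UL:
        flags.append("volume below 10 uL")
    if sample_type == "gDNA":
        if conc_ng_per_ul is not None and conc_ng_per_ul < 10.0:
            flags.append("concentration below 10 ng/uL")
        if od260_280 is not None and not (1.8 <= od260_280 <= 2.0):
            flags.append("OD260/280 outside 1.8-2.0")
    if sample_type == "FFPE" and dv200_pct is not None and dv200_pct <= 30.0:
        flags.append("DV200 not above 30%")
    return flags


print(intake_flags("gDNA", mass_ng=100, volume_ul=10,
                   conc_ng_per_ul=10, od260_280=1.85))
print(intake_flags("FFPE", mass_ng=60, volume_ul=12, dv200_pct=25))
```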
Dual-Omic Bioinformatics Pipeline
5-base sequencing data is processed through the DRAGEN dual-omic pipeline, which simultaneously extracts methylation and variant information from the same sequencing reads.
| Module | Description |
|---|---|
| Raw data QC | Q30 scores, read counts, GC content, adapter contamination, duplication rate |
| Alignment | Dual-purpose alignment to reference genome (GRCh38 / T2T-CHM13) with methylation-aware parameters |
| Methylation quantification | Per-CpG methylation beta values (0–100%), coverage depth, CpG coverage metrics; gene body and promoter methylation profiles |
| Differential methylation | Differentially methylated region (DMR) calling between conditions (optional) |
| Variant calling (SNV/Indel) | Whole-genome SNV and Indel detection with joint methylation-aware base quality recalibration |
| Copy number variation (CNV) | Genome-wide copy number profiling from coverage depth |
| Structural variation (SV) | SV detection (deletion, duplication, inversion, translocation) |
| Fragmentomic analysis (optional) | Fragment size distribution, windowed protection score (WPS), end-motif signatures (for cfDNA samples) |
| Integrated visualization | Genome browser tracks for methylation + variants across selected regions |
| Data delivery | All raw and processed data, QC reports, publication-ready figures |
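The windowed protection score listed under fragmentomic analysis can be illustrated with the definition commonly used in the cfDNA literature: the number of fragments that fully span a window around a position minus the number with an endpoint inside that window. Whether the pipeline uses exactly this definition and a 120 bp window is an assumption; the function name `wps` and the fragment coordinates are made up for the example.

```python
from typing import Iterable, Tuple


def wps(fragments: Iterable[Tuple[int, int]], center: int, window: int = 120) -> int:
    """Windowed protection score at one position.

    fragments -- (start, end) pairs, inclusive coordinates on one chromosome
    center    -- position the window is centered on
    window    -- window width in bp (120 is a commonly used value)
    """
    half = window // 2
    w_start, w_end = center - half, center + half
    spanning = 0
    ending_inside = 0
    for start, end in fragments:
        if start <= w_start and end >= w_end:
            spanning += 1          # fragment covers the whole window
        elif w_start <= start <= w_end or w_start <= end <= w_end:
            ending_inside += 1     # fragment starts or stops inside the window
        # fragments that do not touch the window are ignored
    return spanning - ending_inside


# Hypothetical cfDNA fragments around position 100 (window 40..160).
example_fragments = [(0, 166), (10, 176), (20, 186), (80, 246), (300, 466)]
score = wps(example_fragments, center=100)
print(score)
```

A high score suggests the window was protected from cleavage (for example by a nucleosome), while a low or negative score suggests frequent cuts within it.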
Standard Deliverables
Optional Advanced Analysis
Note: Deliverables and file formats may vary by project. Please contact us for details based on your specific samples and analysis needs.
The validation data below demonstrates 5-base sequencing performance against standard methods (WGBS, WGS, EM-seq) from the same reference samples.
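Panel A — Conversion Rate Comparison. On-target conversion rate compared across 5-base (>97%), EM-seq (~97%), and WGBS (>98%). For WGBS and EM-seq, conversion is measured as C→T at unmethylated (non-CpG) cytosines; for 5-base, it is measured as 5mC→T at methylated cytosines. All three methods achieve high conversion rates, but 5-base achieves this without bisulfite-induced DNA damage. For 5-base, the off-target conversion rate at unmethylated cytosines (<1%) confirms enzymatic specificity for methylated cytosines.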
Panel B — Library Complexity. Duplicate rate and unique molecule recovery from the same input DNA. 5-base consistently recovers >2× more unique molecules than WGBS and approximately 1.3× more than EM-seq, particularly in GC-rich and repetitive regions where bisulfite damage is most severe.
Panel C — CpG Coverage. Total CpG sites detected at ≥1×, ≥5×, and ≥10× coverage. 5-base detects more CpG sites at every coverage threshold compared to WGBS and EM-seq from the same input amount, because greater library complexity translates directly into more unique CpG observations per sequencing read.
Panel D — SNP Detection Accuracy. SNP recall (98.78%) and precision (99.56%) from 5-base data evaluated against standard WGS from the same sample. F1 score of 99.09% demonstrates that 5-base enzymatic conversion does not compromise variant detection sensitivity or specificity.
Panel E — Indel Detection Accuracy. Indel F1 score of 96.25% from 5-base data, comparable to standard WGS from the same sample. Indels in repetitive or GC-rich contexts show particular improvement over bisulfite-converted libraries.
Panel F — Low-Input Performance. Methylation correlation (R²) between standard 100 ng input and serial dilutions (1 ng, 5 ng, 10 ng, 20 ng). 5-base maintains R² > 0.95 down to 1 ng input, while WGBS correlation drops below 0.90 at ≤10 ng. This low-input robustness is critical for cfDNA and clinical applications.
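For readers reproducing the accuracy metrics in Panels D and E on their own data, recall, precision, and F1 follow directly from counts of true-positive, false-positive, and false-negative calls against a truth set. The sketch below uses made-up counts and a hypothetical function name; it does not reproduce the validation figures above.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Variant-calling accuracy metrics from comparison against a truth set.

    tp -- calls present in both the test set and the truth set
    fp -- calls present only in the test set
    fn -- truth-set variants missed by the test set
    """
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("need at least one call and one truth variant")
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Made-up counts for illustration only.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={precision:.4f} recall={recall:.4f} F1={f1:.4f}")
```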
The ability to capture methylation and genetic variation from a single library unlocks applications across multiple research areas where sample quantity or quality has previously limited integrated epigenetic-genetic analysis.
cfDNA from blood plasma carries both methylation signatures (tissue-of-origin, cancer-type-specific methylation patterns) and somatic mutations (driver mutations, clonal hematopoiesis). 5-base sequencing enables both signal types to be extracted from a single cfDNA aliquot — a critical advantage when sample is limited. Applications include multi-cancer early detection (MCED) assay development, minimal residual disease (MRD) monitoring, and treatment response assessment, where combined methylation + mutation analysis improves sensitivity over either modality alone.
Many rare diseases involve both genetic variants (coding mutations, CNVs) and epigenetic dysregulation (imprinting disorders, repeat expansion methylation). 5-base sequencing provides both data types from a single test, simplifying the diagnostic workflow and enabling simultaneous assessment of sequence and methylation abnormalities. Allele-specific methylation analysis is particularly valuable for imprinting disorders and X-linked conditions.
Because 5-base reads methylation and sequence from the same DNA molecules, it can directly phase methylation status to heterozygous variants — determining whether methylation occurs on the maternal or paternal allele. This capability is essential for studying genomic imprinting, X-chromosome inactivation, and allele-specific epigenetic regulation without the need for separate phased sequencing runs.
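The phasing idea can be sketched as grouping reads by the allele they carry at a heterozygous SNV and computing methylation separately for each group. This is a simplified illustration under stated assumptions: each read is reduced to a hypothetical `(snv_allele, cpg_call)` pair, a CpG call of `T` means methylated and `C` means unmethylated (5-base convention), and the function name is invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def allele_specific_methylation(reads: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Methylation fraction at one CpG, split by allele at a linked het SNV.

    reads -- (snv_allele, cpg_call) pairs from reads covering both sites;
             cpg_call is "T" (methylated) or "C" (unmethylated).
    """
    counts = defaultdict(lambda: [0, 0])  # allele -> [methylated, informative]
    for allele, call in reads:
        if call not in ("C", "T"):
            continue  # skip reads with any other base at the CpG
        counts[allele][1] += 1
        if call == "T":
            counts[allele][0] += 1
    return {allele: meth / total for allele, (meth, total) in counts.items()}


# Hypothetical reads: allele A is mostly methylated, allele G mostly not.
example_reads = (
    [("A", "T")] * 9 + [("A", "C")] * 1 +
    [("G", "T")] * 2 + [("G", "C")] * 8
)
per_allele = allele_specific_methylation(example_reads)
print(per_allele)
```

A large difference between the two alleles, as in this toy input, is the pattern expected at imprinted loci; assigning alleles to the maternal or paternal chromosome additionally requires parental genotypes or other phasing information.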
In crop and livestock genetics, understanding the interplay between genetic variation and epigenetic regulation informs breeding programs. 5-base enables simultaneous QTL and mQTL mapping from the same individuals, linking genetic markers to methylation quantitative trait loci. Applications include stress response epigenetics, heterosis (hybrid vigor) studies, and epigenetic marker-assisted selection.
Clinical archives contain vast numbers of FFPE samples with associated outcome data, but the crosslinked and degraded DNA in FFPE presents challenges for bisulfite-based methods that further fragment the DNA. 5-base’s gentle enzymatic conversion preserves the limited usable DNA in FFPE extracts, enabling methylation and mutation analysis from samples that fail WGBS library construction.
Background: Many rare genetic diseases involve both DNA sequence variants (SNVs, indels, CNVs) and epigenetic dysregulation (methylation episignatures, imprinting disorders). Clinical genetic testing traditionally requires separate workflows — WGS or exome sequencing for variants, and a methylation array or specific methylation assay for epigenetics — doubling cost, sample requirements, and turnaround time. A single-platform solution capable of both variant detection and genome-wide methylation characterization could simplify the diagnostic odyssey for rare disease patients.
Methods: Researchers from MyOme Inc. and the University of Washington evaluated Illumina 5-base sequencing against Oxford Nanopore Technologies (ONT) long-read WGS for simultaneous variant detection and methylation profiling. The study used reference samples NA12878 and NA24385 (with NIST v4.2.1 benchmark variant calls) and clinical samples from patients with Sotos syndrome (NSD1 haploinsufficiency with a known methylation episignature) and Prader-Willi syndrome (imprinting disorder with SNURF-SNRPN hypermethylation). 5-base libraries were prepared using the Illumina 5-Base DNA Prep kit, sequenced to 41–60× coverage on NovaSeq, and analyzed using the DRAGEN dual-omic pipeline with methylation-aware alignment and variant calling.
Results: Illumina 5-base achieved SNV/Indel recall of 99.0% on NA12878 and 99.4% on NA24385, compared to 97.7% and 97.3% respectively for ONT long-read WGS. Genome-wide CpG methylation levels showed a mean correlation of 0.93 between 5-base and ONT platforms. Both technologies clearly distinguished Sotos syndrome and Prader-Willi syndrome samples from normal controls by principal component analysis of methylation profiles, and both correctly detected the characteristic SNURF-SNRPN hypermethylation in Prader-Willi samples. The 5-base DRAGEN pipeline completed integrated methylation + variant analysis in approximately one hour per genome.
Conclusion: Illumina 5-base sequencing provides accurate, simultaneous variant detection and genome-wide methylation characterization from a single library, with performance matching or exceeding long-read alternatives. This single-platform approach has direct clinical utility for rare disease diagnostics where both genetic and epigenetic etiologies are under consideration.
Source: P207. Development of a single sequencing platform for variant detection and methylation characterization. Genetics in Medicine Open. 2026;4(Suppl 1). URL: https://www.gimopen.org/article/S2949-7744(26)00211-6/fulltext.
| Feature | WGBS (Bisulfite) | EM-seq (Enzymatic) | 5-Base Sequencing |
|---|---|---|---|
| Conversion target | Unmethylated C → U (global) | Unmethylated C → U (global) | Methylated C (5mC) → T (targeted) |
| Library prep workflow | Library prep first, then conversion | Conversion first, then library prep | Library prep first, then conversion |
| C→T conversion rate | ~99% | ~99% | ~98% |
| Base diversity after conversion | Low (C depleted) | Low (C depleted) | Relatively balanced |
| Detection output | 5mC only | 5mC only | 5mC + SNP + Indel + CNV + SV |
| DNA input amount | gDNA ≥ 2 µg | gDNA ≥ 300 ng, cfDNA ≥ 10 ng | gDNA ≥ 200 ng, cfDNA ≥ 20 ng |
| Compatible sample types | Tissue, cells, whole blood | Tissue, cells, whole blood, FFPE, cfDNA | Tissue, cells, whole blood, FFPE, cfDNA |
| Recommended applications | Routine methylation profiling | Low-input methylation projects | Methylation + variant dual-omics |
Which Method to Choose: 5-base sequencing is the appropriate choice when you need both methylation and variant data from the same sample. For methylation-only studies with ample DNA input, WGBS or EM-seq remain viable options. However, for any study involving limited samples (cfDNA, FFPE), requiring simultaneous variant detection, or demanding dual-omic analysis from a single workflow, 5-base sequencing provides information that conventional methods cannot deliver.
For research use only. Not for use in diagnostic procedures.
Copyright © CD Genomics. All rights reserved.