Sample & DNA Requirements for T2T Sequencing: How to Avoid Project Failure

Introduction – The Critical Prerequisite

In the current era of genomics, telomere-to-telomere (T2T) sequencing has transitioned from an ambitious goal to an achievable reality for many research groups and service providers. By leveraging ultra-long reads from Oxford Nanopore Technologies (ONT) and highly accurate long reads from Pacific Biosciences (PacBio HiFi), it is now possible to generate complete, gap-free chromosome assemblies — including the most structurally complex and repetitive regions of eukaryotic genomes such as centromeres, telomeres, segmental duplications, and ribosomal DNA arrays.

However, behind every successful T2T assembly lies a critical and often underestimated prerequisite: exceptionally high-quality, high-molecular-weight (HMW) DNA. Industry experience, shared both in publications and in private project debriefs, consistently shows the same sobering statistic:

More than 50% of long-read sequencing projects — particularly those aiming for T2T-level contiguity — either fail outright or deliver significantly underperforming assemblies due to poor input DNA quality.

This high failure rate is not primarily caused by sequencer issues, library preparation chemistry, or bioinformatics challenges (although those can certainly compound problems). The dominant root cause, by a wide margin, is inadequate DNA integrity and size distribution at the very beginning of the project.

High-molecular-weight DNA is conventionally defined as having a substantial proportion of fragments >50 kb. The sweet spot for most contemporary T2T projects lies between 100 kb and 300 kb or more, and the most ambitious ultra-long ONT efforts target molecules of 500 kb to 1 Mb and beyond. When DNA meets or exceeds these size thresholds, sequencers can produce the ultra-long reads necessary to span the longest repetitive regions, dramatically improving the chance of generating chromosome-scale contigs with N50 values in the tens to hundreds of megabases.

The landmark T2T-CHM13 human genome assembly (published in 2022) serves as the canonical proof of concept. That project combined ultra-long Oxford Nanopore reads (read N50 ~120 kb, with many reads >1 Mb) with PacBio HiFi data. In the published assembly, the HiFi reads formed the backbone of the assembly graph, and the ultra-long ONT reads resolved the repeats that HiFi reads alone could not span. Achieving this read-length profile required obsessive attention to DNA extraction and handling protocols. Those protocols are now widely disseminated but remain challenging to reproduce at scale, especially with limited starting material or difficult sample types.

Why is degraded or sheared DNA so catastrophic for T2T goals?

  • Fragmented input DNA (<50–80 kb modal size) cannot produce the long continuous reads needed to resolve long tandem repeats, segmental duplications, and other structurally complex loci.
  • Shorter molecules lead to lower effective library complexity, more chimeric reads, and increased susceptibility to coverage bias.
  • Assembly contiguity collapses: N50 drops from chromosome-scale to megabase-scale or worse, and many gaps remain in the most biologically interesting (and hardest) regions.
  • Polishing becomes more difficult because HiFi data cannot compensate for missing long-range information.
  • Overall project cost-per-base and timeline increase dramatically due to the need for additional sequencing to compensate for poor contiguity.

For lab managers, principal investigators, and CRO project managers, these consequences translate directly into wasted budget, delayed publications/grants/milestones, dissatisfied collaborators or clients, and — in the worst cases — complete project abandonment.

The most accessible and informative way to visualize the difference between project-saving and project-killing DNA is pulsed-field gel electrophoresis (PFGE), the gold-standard method for assessing true HMW DNA quality.

Figure 1: Pulsed-field gel electrophoresis showing high-quality HMW DNA (>500 kb modal size, minimal shearing) versus degraded/sheared DNA (prominent smear below 100 kb).

The left lane in such gels typically shows a bright, high-molecular-weight band with little downward smearing — the signature of DNA that is likely to produce excellent long-read data. The right lane, by contrast, displays heavy smearing toward lower molecular weights — the hallmark of DNA that will almost certainly lead to fragmented assemblies, even with the best current sequencing chemistries and assembly algorithms.

Mastering HMW DNA preparation is therefore not merely a technical detail; it is the single most important determinant of whether a T2T project will succeed or become another cautionary tale.

This practical guide is designed to help you avoid the most common (and costly) pitfalls by walking through the exact DNA quantity, purity, size, extraction, quality control, and sample-specific best practices required for reliable T2T-level results.

Key DNA Requirements for Success

To consistently achieve telomere-to-telomere (T2T) assemblies — meaning chromosome-scale contigs with minimal or zero gaps in even the most repetitive regions — four DNA quality parameters must be tightly controlled: quantity, purity, fragment length distribution, and extraction/handling gentleness. These are not independent variables; problems in one area almost always cascade into the others.

1. Quantity — How Much Is Really Enough?

While many sequencing service providers list minimum input amounts (e.g., "1 μg for PacBio HiFi", "400 ng for ONT ligation"), these are survival minima, not T2T optima.

For reliable T2T-grade projects in 2025–2026:

  • ONT ultra-long native libraries (SQK-ULK114 or equivalent): 8–15 μg of HMW DNA per PromethION flow cell is strongly recommended when targeting N50 >100 kb and maximum molecule length >500 kb. Many successful ultra-long projects start with 20–30 μg total yield to allow multiple attempts and size selection.
  • ONT standard ligation (R10.4.1 + Kit14 chemistry): 3–8 μg per flow cell, but lower inputs frequently result in lower effective read N50.
  • PacBio Revio / HiFi (SMRTbell prep kit 3.0): 1–5 μg per library, but the best polishing-grade HiFi datasets for T2T usually come from ≥3 μg inputs with high library complexity.
  • Combined hybrid strategy (most common successful T2T approach): Plan to extract at least 25–40 μg total HMW DNA so you can run both ultra-long ONT and HiFi polishing in parallel or sequentially.

Why so much? Library conversion efficiency is never 100%. Size selection (especially for ultra-long molecules), bead cleanups, adapter ligation losses, and failed flow cells all consume material. Low starting quantity also reduces library diversity, which increases the risk of coverage bias and chimeric artifacts — both fatal for complex repeat resolution.
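As a rough planning aid, the per-library inputs listed above can be summed into a total extraction target. The sketch below is illustrative only: the function name `extraction_target_ug`, the `reserve_factor` parameter, and its default of 1.5 are assumptions made for this example, not vendor guidance. The per-library ranges are the ones quoted in this section.

```python
# Per-library input ranges (µg) as quoted in this guide; not vendor specifications.
INPUT_RANGES_UG = {
    "ont_ultra_long": (8.0, 15.0),  # ONT ultra-long, per PromethION flow cell
    "ont_ligation": (3.0, 8.0),     # ONT standard ligation, per flow cell
    "pacbio_hifi": (1.0, 5.0),      # PacBio HiFi, per library
}


def extraction_target_ug(plan, reserve_factor=1.5):
    """Return (low, high) total µg of HMW DNA to extract for a sequencing plan.

    plan maps a library type to the number of libraries or flow cells.
    reserve_factor is a hypothetical multiplier that leaves room for failed
    preps, size selection, and cleanup losses.
    """
    low = sum(INPUT_RANGES_UG[kind][0] * n for kind, n in plan.items())
    high = sum(INPUT_RANGES_UG[kind][1] * n for kind, n in plan.items())
    return low * reserve_factor, high * reserve_factor


if __name__ == "__main__":
    # Hybrid plan: one ultra-long ONT flow cell plus two HiFi libraries.
    low, high = extraction_target_ug({"ont_ultra_long": 1, "pacbio_hifi": 2})
    print(f"Plan to extract roughly {low:.1f} to {high:.1f} µg of HMW DNA")
```

Adjust the reserve factor to match your own library conversion history; groups with frequent failed preps may want a larger margin.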

Practical tip: Always quantify post-extraction using fluorometric methods (Qubit dsDNA BR or HS assay). Spectrophotometry (NanoDrop) routinely overestimates concentration by 20–100% in the presence of even trace RNA, proteins, or phenols — a very common cause of "mysterious" low library yields.
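One simple way to act on this tip is to record both readings and flag samples where they diverge. The following is a minimal sketch; the function names and the 20% warning cutoff (taken from the lower end of the overestimation range mentioned above) are assumptions for illustration, not an established standard.

```python
def nanodrop_excess_pct(nanodrop_ng_ul, qubit_ng_ul):
    """Percent by which the NanoDrop reading exceeds the Qubit reading."""
    if qubit_ng_ul <= 0:
        raise ValueError("Qubit concentration must be positive")
    return (nanodrop_ng_ul - qubit_ng_ul) / qubit_ng_ul * 100.0


def flag_quantification(nanodrop_ng_ul, qubit_ng_ul, warn_pct=20.0):
    """Return a short note; warn_pct is an illustrative cutoff."""
    excess = nanodrop_excess_pct(nanodrop_ng_ul, qubit_ng_ul)
    if excess >= warn_pct:
        return f"NanoDrop vs Qubit {excess:+.0f}%: suspect RNA or other contaminants"
    return f"NanoDrop vs Qubit {excess:+.0f}%: readings broadly agree"


if __name__ == "__main__":
    print(flag_quantification(nanodrop_ng_ul=150.0, qubit_ng_ul=100.0))
```

In either case, the fluorometric value is the one to use when calculating library input.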

2. Purity — The Silent Project Killer

Even pristine HMW DNA can be rendered useless by trace contaminants. The most important purity metrics are:

  • A260/A280: 1.80–2.05 (ideal 1.85–1.95) → indicates low protein contamination
  • A260/A230: ≥2.0 (ideally ≥2.2) → critical for absence of organic carryover (phenol, guanidine, polysaccharides)
  • A260/A270 (less commonly checked): close to 1.2 indicates minimal residual phenol
  • RNA contamination: should be <5–10% of total signal on fluorometric assay after RNase treatment
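The ranges above can be turned into a quick screening function for spectrophotometer output. This is a sketch under stated assumptions: the name `purity_issues` is hypothetical, the cutoffs are the ones listed in this guide, and the reading of a high A260/A280 as possible RNA carryover is a common rule of thumb rather than something this guide states.

```python
def purity_issues(a260_a280, a260_a230, rna_fraction=None):
    """Return a list of purity concerns; an empty list means no flags.

    rna_fraction is the RNA share of the fluorometric signal (0.0 to 1.0),
    or None if it was not measured.
    """
    issues = []
    if a260_a280 < 1.80:
        issues.append("A260/A280 below 1.80: possible protein contamination")
    elif a260_a280 > 2.05:
        # Assumption: a high ratio is commonly attributed to RNA carryover.
        issues.append("A260/A280 above 2.05: possible RNA carryover")
    if a260_a230 < 2.0:
        issues.append("A260/A230 below 2.0: possible organic carryover")
    if rna_fraction is not None and rna_fraction > 0.10:
        issues.append("RNA above 10% of signal: repeat RNase treatment and cleanup")
    return issues


if __name__ == "__main__":
    for note in purity_issues(a260_a280=1.65, a260_a230=1.4, rna_fraction=0.15):
        print(note)
```

A clean result here does not rule out inhibitors that absorb weakly in the UV range, so it complements rather than replaces a pilot library prep.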

Common contaminants and their effects:

  • Residual phenol/chloroform → blocks ONT pores within minutes
  • Polysaccharides (very common in plants) → viscous solutions, poor bead binding, pore clogging
  • Salts (high NaCl, EDTA >1 mM) → inhibit ligation and polymerase processivity
  • RNA → competes for adapter ligation, inflates apparent DNA concentration
  • Proteins/proteases → degrade DNA during storage or library prep

Mitigation strategies (in order of importance):

  1. Use wide-bore tips and gentle inversion throughout — never vortex after lysis.
  2. Perform at least two rounds of magnetic bead cleanup (AMPure XP or equivalent) with optimized binding buffer ratios.
  3. Include an on-column or post-elution RNase A treatment (followed by another bead cleanup).
  4. Elute in low-EDTA TE (0.1 mM EDTA) or EB buffer.
  5. For polysaccharide-rich samples (plants, fungi), consider additional CTAB precipitation or commercial polysaccharide removal columns.

Key Purity Metrics and Common Contaminants for Long-Read / T2T Sequencing

Metric / Contaminant | Ideal Range / Threshold | Indicates / Effect | Mitigation Strategies (Top Priorities)
A260/A280 | 1.80–2.05 (ideal 1.85–1.95) | Low protein contamination | Gentle handling; multiple bead cleanups
A260/A230 | ≥2.0 (ideally ≥2.2) | Absence of organics (phenol, guanidine, polysaccharides) | Extra washes; CTAB for plants
A260/A270 (optional) | ~1.2 | Minimal residual phenol | Additional cleanups
RNA contamination | <5–10% (fluorometric) | Inflates quantity; competes in ligation | Early RNase A treatment + bead cleanup
Residual phenol/chloroform | Absent | Blocks ONT pores quickly | Thorough washes; avoid if possible
Polysaccharides (plants/fungi) | Absent | Viscosity; poor binding; pore clogging | CTAB precipitation or removal columns
High salts/EDTA (>1 mM) | Absent/low | Inhibits ligation/polymerase | Optimized buffer ratios; low-EDTA elution

3. Fragment Length — The Single Most Important Predictor of T2T Success

Current consensus (2025–2026) among T2T practitioners:

  • Minimum acceptable: modal size >80–100 kb with substantial proportion >200 kb
  • Good / reliable for T2T: modal size 150–300 kb, with clear tail extending >500 kb
  • Excellent / ultra-long capable: significant proportion >500 kb–1 Mb, modal size often >250 kb
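For routine QC logs, these tiers can be encoded as a small classifier. The sketch below is one possible reading of the tiers: the function name `size_tier` is hypothetical, and the 10% cutoff used to stand in for a "significant proportion" above 500 kb is an assumption, since the guide gives no number for it.

```python
def size_tier(modal_kb, frac_over_500kb=0.0, ultra_long_frac=0.10):
    """Classify a sample by modal fragment size (kb) and ultra-long fraction.

    frac_over_500kb is the mass fraction of DNA above 500 kb (0.0 to 1.0).
    ultra_long_frac is an illustrative cutoff, not a published threshold.
    """
    if modal_kb >= 250 and frac_over_500kb >= ultra_long_frac:
        return "excellent / ultra-long capable"
    if modal_kb >= 150:
        return "good / reliable for T2T"
    if modal_kb >= 80:
        return "minimum acceptable"
    return "below minimum"


if __name__ == "__main__":
    print(size_tier(modal_kb=280, frac_over_500kb=0.2))
    print(size_tier(modal_kb=60))
```

The inputs would typically come from PFGE densitometry or a Femto Pulse report, so the classification is only as good as the sizing method behind it.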

Assessment methods (ranked by reliability):

  1. Pulsed-field gel electrophoresis (PFGE) — still the gold standard for true size distribution
  2. Agilent Femto Pulse — excellent automated alternative, gives precise sizing up to ~165 kb
  3. Agilent TapeStation HS — useful up to ~60–80 kb, but underestimates longer tails
  4. Bioanalyzer — only suitable for initial QC; cannot resolve true HMW range

Figure 2: Step-by-step schematic of a gentle HMW DNA extraction workflow optimized for T2T sequencing.

The figure shows main stages with icons: sample lysis with proteinase K (gentle inversion mixing), magnetic disk/bead binding, multiple wash steps, careful elution with wide-bore pipette, final QC via PFGE showing a high-MW band, and side annotations highlighting "no vortexing", "wide-bore tips", "low-EDTA buffer".

4. Extraction & Handling — Where Most Projects Are Won or Lost

The single best predictor of success is whether the extraction protocol was deliberately designed to minimize shear forces at every step.

Recommended modern kits/protocols (2025–2026):

  • PacBio Nanobind (now the de-facto standard for many service labs)
  • QIAGEN MagAttract HMW DNA Mini/Midi
  • NEB Monarch HMW DNA Extraction Kit
  • Nanobind Big DNA kits (originally from Circulomics, now part of the PacBio Nanobind line; still excellent for ultra-long)
  • ONT recommended gentle lysis + bead protocols

Critical handling rules (non-negotiable for T2T):

  • Never vortex after cell lysis
  • Use wide-bore pipette tips for all transfers >50 μL
  • Centrifuge at low g-forces when possible
  • Avoid repeated freeze-thaw cycles (ideally store at 4°C for short term, or -80°C in single aliquots)
  • Work quickly once lysis begins to minimize nuclease activity

Common failure patterns (and fixes):

  • "My PFGE looks smeared but quantity is high" → too much mechanical shear during lysis/homogenization
  • "Library prep failed despite good PFGE" → hidden contaminants (phenol, high salt)
  • "Read N50 is only 30 kb despite good input" → DNA was sheared during pipetting or size selection

By strictly controlling these four pillars — quantity, purity, fragment length, and gentleness — experienced groups now routinely achieve the read-length profiles necessary for true T2T assemblies.
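Read N50, the metric used in the failure patterns above, is straightforward to compute from a list of read lengths: it is the length at which reads of that size or longer contain at least half of all sequenced bases. The function below is a minimal sketch with a hypothetical name; in practice the lengths would come from your sequencing summary or FASTQ files.

```python
def read_n50(lengths):
    """Return the N50 of a collection of read lengths (in bases).

    Reads are taken from longest to shortest until their cumulative
    length reaches half of the total; the last read added is the N50.
    """
    if not lengths:
        raise ValueError("no read lengths given")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    demo_lengths = [10_000, 20_000, 30_000, 40_000]
    print(read_n50(demo_lengths))
```

Comparing the read N50 of a run against the modal fragment size measured before library prep is a quick way to tell whether shearing occurred during pipetting or size selection.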

Sample Types and Best Practices

While the core DNA requirements (quantity, purity, fragment length, gentleness) remain universal, the optimal workflow varies significantly depending on the sample type and organism. Poor adaptation of protocols to the specific biology of the input material is one of the top three reasons T2T projects underperform — right behind general shearing and hidden contaminants.

Here we outline current best practices (as of early 2026) for the most common sample categories encountered in T2T projects: mammalian cells, whole blood, animal tissues, plant tissues, and a few notes on simpler organisms (bacteria, fungi, insects). These recommendations synthesize guidelines from the Telomere-to-Telomere Consortium follow-on efforts, PacBio/ONT vendor documentation, and inter-laboratory benchmarking studies.

1. Mammalian Cultured Cells / Cell Lines (e.g., human CHM13, GM12878, animal cell lines)

Recommended input: 5–15 million cells (fresh or freshly thawed, log-phase growth preferred).

Best extraction kits/methods (ranked by ultra-long read performance in recent benchmarks):

  • PacBio Nanobind PanDNA or CBB Big DNA kit → most consistent for 50–300+ kb modal size, excellent purity
  • NEB Monarch HMW DNA Extraction Kit for Cells & Blood (T3050) → very reliable, slightly lower yield but high integrity
  • QIAGEN MagAttract HMW DNA Kit → good for automation, 100–200 kb range

Key best practices:

  • Harvest at 70–90% confluency; avoid over-confluence (increased debris/nuclease activity).
  • Pellet gently (300–500 × g, 5 min, 4°C); resuspend in PBS + 2% FBS if storing briefly.
  • Immediate lysis after pelleting — do not allow pellets to sit.
  • Include RNase A treatment early; many cell lines have high RNA content.
  • Avoid freeze-thaw of cell pellets if possible; if unavoidable, snap-freeze in liquid N₂ and store at –80°C.

Expected outcome: Reliably >150 kb modal fragment size; many projects achieve >500 kb tails with Nanobind.

2. Whole Blood (Human, Mammalian, Avian, etc.)

Recommended input: 1–10 mL fresh whole blood in K₂-EDTA tubes (never heparin, a strong polymerase inhibitor). Process within 24 h or store at 4°C ≤7–10 days.

Best kits:

  • Nanobind CBB HT (high-throughput) or PanDNA
  • NEB Monarch for Cells & Blood
  • Thermo Fisher MagMAX HMW DNA Kit (optimized for blood)

Best practices:

  • Isolate peripheral blood mononuclear cells (PBMCs) via Ficoll-Paque gradient for highest purity/yield (target 6–10 million PBMCs).
  • For species with nucleated red blood cells (birds, reptiles), smaller volumes (0.5–2 mL) suffice because nearly every blood cell contributes genomic DNA.
  • For ultra-long ONT (SQK-ULK114), vendors recommend starting with 3–6 mL blood → aim for 6–10 million WBCs.
  • Avoid buffy coat isolation if possible — it concentrates platelets and can increase inhibitors.
  • Flash-freeze aliquots if delay inevitable, but fresh is always superior.

Pitfall to avoid: Delayed processing (>48 h at room temp) causes rapid degradation → read N50 drops below 50 kb even with good kits.

3. Animal Tissues (Liver, Muscle, Brain, etc.)

Recommended input: 100–500 mg fresh or flash-frozen tissue.

Best kits:

  • Nanobind Tissue kit (if available) or PanDNA with homogenization
  • NEB Monarch HMW for Tissue
  • QIAGEN MagAttract + additional lysis optimization

Best practices:

  • Snap-freeze in liquid N₂ immediately after dissection; store at –80°C.
  • Homogenize in liquid N₂ using mortar/pestle or gentle bead-beating — never use rotor-stator homogenizers (severe shearing).
  • Fibrous tissues (muscle, tendon) benefit from extended proteinase K digestion (2–4 h).
  • For brain/eye (high lipid), add extra washes or chloroform cleanup step.
  • Yield target: ≥15–30 μg HMW DNA.

Common issue: Polysaccharides/lipids in some tissues → use additional polysaccharide removal columns or CTAB precipitation if needed.

4. Plant Tissues (Leaves, Roots, Seeds)

Recommended input: 1–5 g fresh young leaves (minimize secondary metabolites).

Best kits/methods:

  • Modified CTAB + Nanobind cleanup (gold standard for plants)
  • NEB Monarch + plant-specific lysis buffer
  • QIAGEN DNeasy Plant Maxi + HMW cleanup

Best practices:

  • Use young, actively growing tissue — older leaves have more lignins/polysaccharides.
  • Grind in liquid N₂; add PVP to the lysis buffer to bind polyphenols, and β-mercaptoethanol to keep them from oxidizing.
  • Multiple rounds of chloroform:isoamyl alcohol extraction to remove polysaccharides.
  • For woody/seed tissues: extended lysis and nuclear enrichment first.

Note: Plants often require UHMW (>500 kb) for centromere/telomere resolution due to larger genomes/repeats — prioritize ultra-long protocols.

5. Other Organisms (Brief Notes)

  • Bacteria: 10⁹–10¹⁰ cells → add lysozyme/achromopeptidase; Nanobind or Monarch works well (shorter fragments acceptable).
  • Fungi/Insects: Similar to plants — nuclear enrichment + CTAB helpful.
  • Complex samples (e.g., environmental, metagenomic): Consider size selection post-extraction for T2T goals.

Regardless of sample type, always perform pre-project pilot extractions (2–3 replicates) and QC rigorously.

Figure 3: Comparative pulsed-field gel electrophoresis (PFGE) showing DNA fragment size and integrity from four different high-molecular-weight (HMW) extraction methods (Nanobind, Fire Monkey, Puregene, and Genomic-tip) on a reference cell line.

The gel shows a more prominent high-molecular-weight band (>80 kb, minimal smearing) in the Nanobind extracts, the profile best suited to T2T and other long-read work. The other methods show moderate smearing, with fragments of 30–80 kb dominant and a less distinct HMW tail. Because all four extractions used the same reference cell line, the differences reflect kit choice rather than sample type. Tissue samples carry an additional risk of degradation and smearing beyond what is shown here.

Summary of Best Practices for HMW DNA Extraction by Sample Type

Sample Type | Recommended Input | Top-Ranked Kits (for Ultra-Long Performance) | Key Best Practices / Pitfalls to Avoid | Expected Modal Fragment Size
Mammalian Cultured Cells | 5–15 million cells (log-phase) | 1. Nanobind PanDNA/CBB, 2. NEB Monarch, 3. QIAGEN MagAttract | Harvest 70–90% confluency; immediate lysis; RNase early; avoid freeze-thaw | >150 kb (up to >500 kb tails)
Whole Blood | 1–10 mL fresh (K₂-EDTA) | 1. Nanobind CBB HT/PanDNA, 2. NEB Monarch, 3. Thermo MagMAX | PBMC isolation via Ficoll; process ≤24 h; avoid heparin/buffy coat | High integrity, >150 kb
Animal Tissues | 100–500 mg fresh/frozen | 1. Nanobind Tissue/PanDNA, 2. NEB Monarch, 3. QIAGEN MagAttract | Snap-freeze; liquid N₂ homogenization; extended digestion for fibrous | ≥15–30 μg yield; variable but high with care
Plant Tissues | 1–5 g young leaves | 1. Modified CTAB + Nanobind, 2. NEB Monarch + plant buffer, 3. QIAGEN DNeasy Plant | Young tissue; PVP/β-mercaptoethanol; multiple chloroform extractions | Often needs >500 kb for UHMW
Other (Bacteria/Fungi/Insects) | Varies (e.g., 10⁹–10¹⁰ cells) | Nanobind or Monarch | Lysozyme/CTAB; nuclear enrichment | Shorter acceptable

Cross-reference: The direct impact of these sample-specific choices on final assembly metrics is covered in detail in our companion guide: Assembling the Hard Parts: Telomeres, Centromeres, and Segmental Duplications in the T2T Era.

Conclusion

In the era of telomere-to-telomere (T2T) sequencing, the sequencing platforms (ONT, PacBio) and assembly algorithms (hifiasm, verkko, TULIP, etc.) have reached a very mature stage. Yet, whether a project can ultimately deliver a truly complete, gap-free, chromosome-scale assembly still depends overwhelmingly on the very first step: preparation of high-quality, high-molecular-weight (HMW) DNA.

Repeated real-world data and post-project debriefs have shown the same pattern:

  • When input DNA consistently achieves a modal fragment length ≥150 kb, excellent purity metrics, and sufficient total yield, more than 90% of T2T projects deliver highly satisfactory results (contig N50 >50 Mb, majority at chromosome level, BUSCO completeness >99%).
  • Conversely, when DNA shows clear degradation (<80 kb modal size), heavy contamination, or insufficient quantity, even massive additional sequencing depth and computational effort rarely compensate for the loss of long-range continuity — resulting in assemblies that are "almost complete but still gapped" or severely fragmented.

For lab managers, principal investigators, and CRO project managers, controlling sample and DNA quality is therefore not an optional optimization — it is the single biggest determinant of project success or failure.

Here is the most practical, battle-tested final DNA preparation & QC checklist for T2T projects (recommended to print and check off before every project kick-off):

Final T2T DNA Preparation & QC Checklist

  • Sample freshness: Prioritize fresh or snap-frozen material; whole blood ≤7–10 days at 4°C, tissues/cells avoid repeated freeze-thaw
  • Extraction method: Use modern HMW-optimized kits (Nanobind, Monarch, MagAttract, etc.); no vortexing after lysis
  • Handling rules: Wide-bore tips only, slow pipetting; immediate low-temperature work post-lysis, include nuclease inhibitors
  • Yield target: ≥20–30 μg HMW DNA for human/complex genomes; scale down appropriately for smaller genomes but never below 10 μg
  • Purity metrics: A260/A280 = 1.80–2.05, A260/A230 ≥ 2.0; RNA contamination <10%
  • Fragment size QC: PFGE or Femto Pulse shows modal ≥150 kb with clear tail >300–500 kb
  • Reproducibility check: Run 2–3 independent pilot extractions to confirm method stability
  • Contaminant mitigation: Add extra bead cleanups or polysaccharide/phenolic removal steps when needed (especially plants)
  • Storage: Short-term at 4°C (≤1 week), long-term in single-use aliquots at –80°C, no repeated freeze-thaw
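For teams that track QC in a spreadsheet or LIMS, the numeric items on this checklist can be applied as a single pass/fail gate before kick-off. The sketch below uses a hypothetical `SampleQC` record and function names invented for this example. The thresholds are the ones in the checklist, with the yield floor exposed as a parameter because the guide allows scaling down for smaller genomes. Handling, storage, and freshness items still need a manual check.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    yield_ug: float         # total HMW DNA recovered, µg
    a260_a280: float
    a260_a230: float
    rna_fraction: float     # RNA share of fluorometric signal, 0.0 to 1.0
    modal_kb: float         # modal fragment size from PFGE or Femto Pulse
    pilot_extractions: int  # independent pilot extractions completed


def run_checklist(qc, min_yield_ug=20.0):
    """Map each numeric checklist item to True (pass) or False (fail)."""
    return {
        "yield": qc.yield_ug >= min_yield_ug,
        "a260_a280": 1.80 <= qc.a260_a280 <= 2.05,
        "a260_a230": qc.a260_a230 >= 2.0,
        "rna": qc.rna_fraction < 0.10,
        "modal_size": qc.modal_kb >= 150,
        "pilots": qc.pilot_extractions >= 2,
    }


def failed_items(qc, min_yield_ug=20.0):
    """Names of checklist items that did not pass, in checklist order."""
    results = run_checklist(qc, min_yield_ug)
    return [name for name, passed in results.items() if not passed]


if __name__ == "__main__":
    sample = SampleQC(yield_ug=8, a260_a280=1.9, a260_a230=1.6,
                      rna_fraction=0.05, modal_kb=70, pilot_extractions=1)
    print("Failed:", failed_items(sample) or "none")
```

Returning the names of the failed items, rather than a single yes/no, makes it easier to decide whether a sample needs another cleanup, a fresh extraction, or simply more material.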

When you can reliably deliver DNA that meets these criteria, the probability of T2T success increases dramatically, and downstream assembly challenges drop significantly.

Ready to submit your samples? Feel free to review our detailed sample submission guidelines or contact our technical support team for pre-project consultation. We can help evaluate your current extraction workflow, suggest targeted optimizations, or even assist with pilot extractions if needed.

Thank you for reading this practical guide. We hope these recommendations help your next T2T project clear the most critical hurdle — DNA quality — and successfully reach a complete, seamless, telomere-to-telomere genome assembly.

Related articles:

  • Back to basics: Telomere-to-Telomere (T2T) Sequencing Explained: When You Need a Complete Genome
  • Understand downstream impact: T2T Assembly QC Metrics: Completeness, Accuracy, and How to Evaluate Results
  • Deliverables & data formats: Choosing the Right T2T Deliverables: Assembly Outputs, Polishing, Phasing, and Data Formats (RUO)

References:

  1. Nurk S, et al. 2022. The complete sequence of a human genome. Science 376:44–53. https://doi.org/10.1126/science.abj6987
  2. Angthong P, Uengwetwanit T, Pootakham W, Sittikankaew K, Sonthirod C, Sangsrakru D, Yoocha T, Nookaew I, Wongsurawat T, Jenjaroenpun P, Rungrassamee W, Karoonuthaisiri N. 2020. Optimization of high molecular weight DNA extraction methods in shrimp for a long-read sequencing platform. PeerJ 8:e10340. https://doi.org/10.7717/peerj.10340
  3. Devonshire AS, Morata J, Jubin C, et al. 2025. Interlaboratory evaluation of high molecular weight DNA extraction methods for long-read sequencing and structural variant analysis. BMC Genomics 26:698. https://doi.org/10.1186/s12864-025-11792-7
For research purposes only, not intended for clinical diagnosis, treatment, or individual health assessments.