Modern breeding programs now thrive or falter based on the quality of their reference genomes.
T2T genome assembly and haplotype-resolved genome assembly in crops and livestock give you a high-quality reference genome for breeding, revealing variants and haplotypes that fragmented drafts cannot show.
TL;DR – What this guide will help you decide
- When a draft reference is "good enough" vs when you really need a T2T genome assembly, telomere-to-telomere plant genome, or haplotype-resolved assembly.
- What breeders can actually do with T2T genome assembly and haplotype-based breeding tools in real crop and livestock projects.
- How to design data types, assembly strategy, and budget so a premium reference supports clearer QTLs, structural variants, and more reliable markers.
- How to work with CD Genomics' T2T Genome Sequencing, Haplotype-Resolved Genome Sequencing & Assembly, and crop/livestock de novo genome assembly services to upgrade your current breeding pipeline (for research use only; not for individual or personal testing).
Figure 1. From draft genomes to T2T and haplotype-resolved assemblies.
Why are draft reference genomes holding back breeding decisions?
Draft reference genomes limit breeding decisions because gaps, collapsed repeats, and misassembled regions blur QTL signals, hide structural variants, and make markers less stable across germplasm.
In practice, most "first-generation" crop and livestock references:
- Break in pericentromeric and repeat-rich regions.
- Collapse similar sequences from duplicated genes or homeologous chromosomes.
- Leave centromeres, telomeres, and some segmental duplications unresolved.
These technical issues create very practical breeding problems:
- Blurry QTL and GWAS peaks
  - Signals spread across large, messy intervals because recombination patterns and marker positions are not fully captured.
  - Fine mapping stalls when candidate regions still contain dozens of genes.
- Hidden or distorted structural variants (SVs)
  - Inversions, translocations, copy number variants, and presence–absence variants may be misrepresented or missed.
  - Important disease-resistance loci or quality traits linked to SVs do not show up clearly in SNP-only scans.
- Unstable or hard-to-transfer markers
  - Markers designed on a fragmented reference can fail in new germplasm, especially across heterotic groups, subspecies, or breeds.
  - Lifting markers between genome versions is painful, and internal teams lose confidence in "legacy" markers.
In short, if your team feels that QTL never quite resolve, or that GWAS hits move with every new analysis, it may not be your statistics. It may simply be that your current reference genome has reached its useful limit.
What is a T2T genome and a haplotype-resolved assembly in crops and livestock?
A T2T genome assembly is a telomere-to-telomere, chromosome-level genome sequence with no unresolved gaps, while a haplotype-resolved assembly separates each chromosome copy into distinct, phased haplotypes for crops and livestock.
Concretely:
- A telomere-to-telomere plant genome aims to reconstruct each chromosome as a single, gapless sequence, including centromeres and repeat-rich regions.
- A haplotype-resolved genome assembly in crops and livestock produces separate sequences for each parental chromosome, instead of collapsing them into a single consensus.
These concepts translate directly into breeding value:
- T2T assemblies reduce uncertainty in tricky genome regions, improving read mapping and variant calling.
- Haplotype-resolved assemblies show exactly which allele combinations travel together as "haplotype blocks," which is how breeders actually move variation through populations.
Figure 2. Telomere-to-telomere, haplotype-resolved melon genome. Chromosome-level Hi-C contact map and genome landscape of the semi-wild melon accession 821 illustrate how a T2T, phased reference resolves chromosomes, gene density, and repeat content for downstream trait mapping (Li G. et al. (2023) Horticulture Research).
Quick definitions you can reuse in slides and reports
- T2T genome: A gapless chromosome assembly from one telomere to the other, including centromeres and repeats.
- Haplotype-resolved assembly: A genome assembly where each chromosome copy is reconstructed separately and phased.
- High-quality reference genome for breeding: A curated assembly with high contiguity, low error rates, and strong support from genetic, physical, and mapping data.
What can breeders actually do with T2T and haplotype-resolved genomes?
Breeders can use T2T and haplotype-resolved reference genomes to sharpen QTL mapping, detect structural variants, and design haplotype-based selection schemes that are difficult to implement with fragmented drafts.
Sharpen QTL mapping and GWAS in complex genomes
High-quality references improve QTL and GWAS not by changing your statistics, but by improving the underlying coordinates and recombination map.
Figure 3. Chromosome-scale T2T and haplotype-resolved tree genome. A telomere-to-telomere, phased assembly of Chinese cork oak shows chromosome contact maps, gene density, and repeat patterns, demonstrating the type of high-quality reference that stabilizes QTL and GWAS coordinates (Wang L. et al. (2023) Frontiers in Plant Science).
With a T2T genome assembly or curated high-quality reference:
- Markers map more accurately in pericentromeric and low-recombination regions.
- Recombination breakpoints are placed more precisely, narrowing QTL intervals.
- Historical and new datasets align onto the same, stable coordinate system.
Breeding outcomes you can expect:
- QTL intervals can shrink from tens of megabases to regions containing only a few candidate genes.
- "Wandering" GWAS peaks become more consistent across trials and environments.
- Marker development projects face fewer surprises when moved into diverse germplasm.
If your team is planning a major genomic selection initiative in plant or animal breeding, investing in a more reliable reference first helps avoid carrying reference biases into training sets and prediction models.
Detect structural variants that drive major traits
Structural variants are major drivers of agronomic traits in many crops and livestock. A high-quality reference genome for breeding makes these variants much easier to detect and interpret.
With T2T or haplotype-resolved assemblies, you gain:
- Accurate breakpoint positions for inversions, translocations, and large indels.
- Better estimates of copy number for gene families involved in defense, flowering, quality, or fertility traits.
- Clear views of presence–absence variation across a pan-genome, especially when you combine multiple assemblies.
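As a minimal sketch of the presence–absence idea, the snippet below labels genes as core or variable across a small set of assemblies. The assembly names, gene names, and the `gene_presence` structure are hypothetical; in a real project this table would come from gene annotation and alignment across the pan-genome.

```python
from typing import Dict, Set


def classify_pav(gene_presence: Dict[str, Set[str]],
                 assemblies: Set[str]) -> Dict[str, str]:
    """Label each gene 'core' if it is present in every assembly, else 'variable'."""
    labels = {}
    for gene, present_in in gene_presence.items():
        # A gene is core only when the full assembly set is covered.
        labels[gene] = "core" if assemblies <= present_in else "variable"
    return labels


# Hypothetical assemblies and genes, for illustration only.
assemblies = {"elite_A", "landrace_B", "wild_C"}
gene_presence = {
    "geneR1": {"elite_A", "landrace_B", "wild_C"},
    "geneR2": {"wild_C"},
    "geneQ1": {"elite_A", "landrace_B"},
}

pav_labels = classify_pav(gene_presence, assemblies)
for gene, label in sorted(pav_labels.items()):
    print(gene, label)
```

Genes labelled variable are the natural candidates for SV-based markers, since their presence or absence differs between breeding material.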
Figure 4. Structural variants highlighted on a T2T melon reference. Structural variant comparisons between cultivated and semi-wild melon haplotypes show copy-number and presence–absence changes at disease-resistance and quality loci, illustrating how T2T, haplotype-resolved assemblies expose trait-linked SVs (Li G. et al. (2023) Horticulture Research).
For breeders, this means you can:
- Design SV-based markers that track key structural haplotypes across breeding material.
- Classify donor lines and elite parents by their structural haplotype content.
- Combine genomic selection with targeted introgression of favorable SV haplotypes.
Design haplotype-based breeding and selection schemes
Haplotype-based breeding uses blocks of linked alleles rather than single SNPs as the unit of selection. A haplotype-resolved assembly provides the template for defining these blocks.
Figure 5. Haplotype diversity at a key starch gene. An example haplotype network and group-wise haplotype distribution for the rice GBSSI gene illustrates how functional haplotypes differ between wild and cultivated groups—exactly the type of pattern breeders can target when using haplotype-resolved assemblies (Maung T.Z. et al. (2021) Frontiers in Plant Science).
With a good phased assembly and supporting data, you can:
- Define favorable haplotypes around major genes and QTL, then design markers that tag the entire haplotype instead of single SNPs.
- Track recombination events that break favorable haplotypes, so you know when a line has lost its desired combination.
- Incorporate haplotype descriptors into genomic prediction models, particularly for traits with major-effect loci and strong epistasis.
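The difference between tagging one SNP and tagging a full haplotype can be illustrated with a small sketch. It assumes phased alleles are already available per chromosome copy; the positions, alleles, and line names are invented for illustration, and the helper functions are hypothetical rather than part of any specific tool.

```python
from typing import Dict, List, Sequence

Haplotype = Dict[int, str]  # position -> phased allele on one chromosome copy


def haplotype_string(hap: Haplotype, tag_positions: Sequence[int]) -> str:
    """Concatenate alleles at the tag positions into one haplotype label."""
    return "".join(hap[pos] for pos in tag_positions)


def carries_haplotype(line_haps: List[Haplotype], favorable: str,
                      tag_positions: Sequence[int]) -> bool:
    """True if at least one chromosome copy carries the full favorable haplotype."""
    return any(haplotype_string(h, tag_positions) == favorable for h in line_haps)


def carries_single_snp(line_haps: List[Haplotype], position: int, allele: str) -> bool:
    """True if at least one chromosome copy carries the allele at a single SNP."""
    return any(h[position] == allele for h in line_haps)


# Hypothetical tag SNP positions and favorable haplotype.
tag_positions = [1050, 2310, 4875]
favorable = "GTA"

donor = [{1050: "G", 2310: "T", 4875: "A"}, {1050: "G", 2310: "T", 4875: "A"}]
# A recombination event has replaced the last allele in this line.
recombinant = [{1050: "G", 2310: "T", 4875: "C"}, {1050: "A", 2310: "C", 4875: "C"}]

print(carries_single_snp(recombinant, 1050, "G"))               # single-SNP tag still positive
print(carries_haplotype(recombinant, favorable, tag_positions))  # full haplotype is lost
```

In this toy case a single-SNP assay would still score the recombinant line as a carrier, while the haplotype-level check shows that the desired combination has been broken.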
In practice, T2T and haplotype-resolved references let you say things like:
- "Tag the full resistance haplotype rather than one SNP in the gene."
- "Select parents that complement each other at the haplotype level, not just at single markers."
- "Prioritize crosses that generate new combinations of favorable haplotypes instead of reshuffling the same ones."
Draft vs high-quality vs T2T / haplotype-resolved references: which does your breeding program need?
You can treat draft references as a starting point, high-quality curated assemblies as the workhorse for routine genomic selection, and T2T or haplotype-resolved genomes as strategic infrastructure for long-term haplotype-based breeding.
A simple way to compare them is in three tiers:
| Reference type | Contiguity & gaps | Complex regions (centromeres, repeats) | Structural variants | Phasing | Typical uses |
| --- | --- | --- | --- | --- | --- |
| Draft reference | Fragmented, many gaps | Poorly resolved | Many missed or mis-located | None | Basic SNP discovery, initial GWAS |
| High-quality curated | Long contigs, few gaps | Partially resolved | Good for larger SVs in euchromatin | Limited | Routine GS, MAS, better QTL mapping |
| T2T / haplotype-resolved | Chromosome-level, gapless or near-gapless | Resolved centromeres and repeats | Comprehensive SV discovery and mapping | Diploid/polyploid phasing | Haplotype-based breeding, pan-genome anchors, strategic trait discovery |
When is T2T genome assembly worth the investment?
A full telomere-to-telomere genome assembly or a haplotype-resolved genome assembly for crops and livestock is usually justified when:
- Priority trait loci sit in complex or low-recombination regions, and fine mapping of the underlying QTL has stalled for several years.
- You are dealing with a large, repetitive, or polyploid genome with strong subgenome effects.
- Your organization is planning long-term pan-genome and structural variant resources and needs a durable anchor reference.
- You are building a crop or livestock de novo genome assembly that will serve as a community or corporate standard.
From a cost–benefit perspective, a T2T genome sequencing project is a focused, one-time investment. The payoff comes over many cycles through:
- Fewer failed or re-designed marker assays.
- Shorter time from QTL discovery to stable markers in elite germplasm.
- Improved genomic prediction stability as you expand germplasm and traits.
If budget is limited, one practical strategy is to start with a high-quality reference upgrade and plan T2T or haplotype-resolved work later for one key founder or elite line.
How do you design a T2T or haplotype-resolved genome project?
To design a T2T or haplotype-resolved genome project, you need to plan DNA quality, choose appropriate sequencing platforms, and define a clear assembly and validation workflow that links directly to breeding goals.
Choose data types and platforms
T2T and haplotype-resolved assemblies usually combine several data types:
- Long-read sequencing (high-fidelity or ultra-long) for contig building and resolving repeats.
- Short-read sequencing for polishing and small-variant calling.
- Hi-C or other long-range scaffolding for ordering and orienting contigs into chromosomes.
- Optional optical maps or genetic maps for validation and troubleshooting.
Practical tips from real projects:
- Prioritize DNA quality over raw depth. Ultra-high molecular weight DNA from a few carefully handled samples often works better than many moderate-quality samples.
- For highly repetitive plant genomes, plan for higher long-read coverage than for smaller animal genomes.
- Decide up front whether your primary goal is T2T only or T2T plus haplotype resolution. Haplotype-resolved genome assembly in crops and livestock may need trio-based designs, deep coverage, or extra phasing data.
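When budgeting coverage, the basic arithmetic is that mean depth equals total sequenced bases divided by genome size. The sketch below applies that relationship; the genome size and target depth are placeholder values, not recommendations, and the even split of reads across haplotypes is a simplifying assumption.

```python
def required_yield_gb(genome_size_gb: float, target_depth: float) -> float:
    """Sequence yield (Gb) needed to reach a target mean depth on the haploid genome."""
    return genome_size_gb * target_depth


def per_haplotype_depth(yield_gb: float, genome_size_gb: float, ploidy: int = 2) -> float:
    """Approximate depth per haplotype, assuming reads split evenly across copies."""
    return yield_gb / genome_size_gb / ploidy


# Placeholder values for illustration only.
genome_size_gb = 2.5   # haploid genome size in Gb
target_depth = 40      # desired mean long-read depth

yield_needed = required_yield_gb(genome_size_gb, target_depth)
hap_depth = per_haplotype_depth(yield_needed, genome_size_gb, ploidy=2)

print(f"Yield needed: {yield_needed} Gb")
print(f"Depth per haplotype (diploid): {hap_depth}x")
```

The per-haplotype figure is the one to watch when haplotype resolution is a goal, because each chromosome copy only receives a fraction of the total depth.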
CD Genomics can help you design this mix through its T2T Genome Sequencing and Haplotype-Resolved Genome Sequencing & Assembly services for crop and livestock breeding projects (research use only; no personal or clinical applications).
Follow a clear assembly, polishing, and validation workflow
A typical high-end assembly workflow includes:
- Sample selection and DNA QC
  - Choose a line that matters to your breeding program, not just a convenient inbred.
  - Set clear QC thresholds for DNA integrity, purity, and yield before library prep.
- Long-read sequencing and initial assembly
  - Generate enough coverage to assemble through repeats and segmental duplications.
  - Build contigs and evaluate N50, total size, and known benchmark regions.
- Scaffolding and polishing
  - Apply Hi-C or other long-range data to scaffold contigs into chromosome-scale sequences.
  - Polish using short reads and, if needed, additional long-read passes to reduce base-level errors.
- Phasing for haplotype-resolved assemblies
  - Use parental data, trio binning, or statistical phasing to assign contigs to specific haplotypes.
  - Verify that known heterozygous regions appear in both haplotypes as expected.
- Biological validation
  - Compare the assembly against genetic maps, known QTL, and well-characterized genes.
  - Check that important loci for your traits of interest are complete and consistent with existing marker data.
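Contig N50, mentioned in the workflow above, is the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly size. A minimal sketch, with made-up contig lengths:

```python
from typing import Iterable


def contig_n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first contig where the running sum covers half the assembly.
        if running * 2 >= total:
            return length
    return ordered[-1]  # not reached for positive lengths


# Hypothetical contig lengths (arbitrary units).
example_lengths = [80, 70, 50, 40, 30, 20]
print(contig_n50(example_lengths))
```

N50 summarizes contiguity only; it says nothing about base accuracy or misjoins, which is why the workflow pairs it with total size and benchmark regions.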
When you collaborate with a partner like CD Genomics, you can treat these steps as a structured project plan, with clear deliverables for breeding and bioinformatics teams. All services are provided strictly for research use and are not intended for individual testing or diagnostic use.
Integrate the new reference into your breeding pipeline
The work is only half done when the assembly is finished. The other half is integration into routine workflows.
Key steps:
- Lift over existing markers and QTL
  - Convert coordinates from your old reference to the new assembly.
  - Retire or re-design markers that no longer uniquely map or fall in low-quality regions.
- Update genotyping and imputation resources
  - Re-annotate SNP arrays and GBS markers against the new reference.
  - Re-train imputation panels to reflect new haplotypes and SVs.
- Refresh genomic prediction models
  - Refit models using updated genotype data, especially for traits linked to improved regions.
  - Benchmark prediction accuracy before and after the upgrade to quantify impact.
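The marker lift-over step can be sketched as a simple triage. The example assumes that marker flanking sequences have already been aligned to the new assembly with a tool of your choice and that the hits were collected into a dictionary; the `hits_by_marker` structure, marker names, and status labels are hypothetical.

```python
from typing import Dict, List, Tuple

Hit = Tuple[str, int]  # (chromosome, position) on the new assembly


def triage_markers(hits_by_marker: Dict[str, List[Hit]]) -> Dict[str, str]:
    """Classify markers by how many places they map to on the new assembly."""
    status = {}
    for marker, hits in hits_by_marker.items():
        if len(hits) == 1:
            status[marker] = "keep"                 # unique placement
        elif not hits:
            status[marker] = "retire_unmapped"      # no placement found
        else:
            status[marker] = "retire_multi_mapped"  # ambiguous placement
    return status


def new_coordinates(hits_by_marker: Dict[str, List[Hit]]) -> Dict[str, Hit]:
    """Return updated coordinates for uniquely mapped markers only."""
    return {m: hits[0] for m, hits in hits_by_marker.items() if len(hits) == 1}


# Hypothetical alignment results for three legacy markers.
hits_by_marker = {
    "mk001": [("chr03", 1_250_400)],
    "mk002": [("chr03", 8_900_120), ("chr07", 2_310_555)],
    "mk003": [],
}

marker_status = triage_markers(hits_by_marker)
lifted = new_coordinates(hits_by_marker)
print(marker_status)
print(lifted)
```

A production pipeline would also filter on alignment identity and flag markers that land in low-quality regions, which this sketch leaves out.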
A practical approach is to run a pilot project: one high-value trait, one crop or breed, a subset of germplasm, and a new high-quality reference. Lessons learned from the pilot then scale to the rest of the pipeline.
Case snapshots: how high-quality references already change breeding decisions
Below are three published examples showing how telomere-to-telomere (T2T) and haplotype-aware assemblies translate into practical breeding value. These are independent studies from the literature, not projects run by CD Genomics, but they illustrate the kinds of outcomes breeders can target when investing in high-quality reference genomes.
Case 1 – T2T melon reference sharpens QTL for fruit quality and stress traits
A recent Horticulture Research study assembled a T2T genome for an elite Cucumis melo var. inodorus melon line, delivering 12 essentially gap-free chromosomes and a highly contiguous reference.
The team then:
- Collected 1,294 published QTL from 67 previous studies covering traits such as fruit sugar content, flesh firmness, disease resistance and abiotic stress tolerance.
- Projected all these QTL onto the new T2T melon reference.
- Identified 20 "meta-QTL" where multiple QTL overlapped, and substantially narrowed the confidence intervals, making it much easier to nominate candidate genes behind key traits.
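To give a feel for how projecting many QTL onto one coordinate system exposes shared regions, the sketch below finds segments covered by at least a minimum number of QTL intervals. This is a plain interval-overlap illustration with invented coordinates, not the statistical meta-QTL method used in the study.

```python
from collections import defaultdict
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end)


def consensus_regions(qtls: List[Interval], min_support: int = 2) -> List[Interval]:
    """Return segments covered by at least `min_support` QTL intervals."""
    events = defaultdict(list)
    for chrom, start, end in qtls:
        events[chrom].append((start, 1))   # interval opens
        events[chrom].append((end, -1))    # interval closes
    regions = []
    for chrom in sorted(events):
        depth = 0
        region_start = None
        # Tuples sort ends (-1) before starts (+1) at the same coordinate,
        # so intervals that merely touch are not counted as overlapping.
        for pos, delta in sorted(events[chrom]):
            depth += delta
            if depth >= min_support and region_start is None:
                region_start = pos
            elif depth < min_support and region_start is not None:
                regions.append((chrom, region_start, pos))
                region_start = None
    return regions


# Hypothetical QTL intervals from different studies on a shared reference.
qtls = [("chr1", 100, 500), ("chr1", 300, 800), ("chr1", 400, 450), ("chr2", 10, 50)]

print(consensus_regions(qtls, min_support=2))
print(consensus_regions(qtls, min_support=3))
```

Raising `min_support` trades the number of reported regions for stronger agreement between studies, which mirrors the narrowing effect described above.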
Breeding takeaway
- With a T2T genome assembly, years of dispersed QTL mapping data can be re-anchored onto a single, accurate reference.
- Instead of dozens of broad intervals per trait, breeders gain a small number of stable, refined meta-QTL plus a shortlist of structural variants and genes to follow in marker development or genomic selection.
- This is exactly the kind of "second life" many breeding programs can give to existing QTL and GWAS results once a premium reference genome is available.
For breeders, a similar T2T genome assembly service from CD Genomics (for research use only; not provided to individual consumers) can play the same role: consolidating legacy mapping data and focusing downstream marker work on a tractable set of candidate regions.
Case 2 – Melon pan-genome and SV-based GWAS resolve sugar, color and rind traits
In a separate Plant Physiology study, another group constructed a high-quality melon pan-genome by integrating a chromosome-level genome of a local landrace with multiple previously published melon assemblies.
Key points from that work:
- The pan-genome captured around 3.4 million genetic variants, including SNPs and presence/absence variants (PAVs). PAVs were especially enriched in genes involved in sucrose metabolism and domestication-related traits.
- Using a structural variation-based GWAS, the authors linked specific structural variants to fruit sweetness, flesh color, rind striping and suture traits, showing that copy-number and presence/absence events in key genes help explain these agronomic differences.
- Through bulked-segregant analysis and map-based cloning, they pinpointed a single gene (CmPIRL6) whose allelic state determines whether the outer rind is edible or not, again tied back to structural variation in the pan-genome context.
Breeding takeaway
- A pan-genome built from high-quality assemblies makes it possible to move beyond single-reference SNP scans and directly test structural variants that often underlie big, visible trait differences.
- For breeding teams already running GS or GWAS on a single draft genome, upgrading to a pan-genome or haplotype-resolved reference can reveal new, structurally defined markers that explain variation missed by SNPs alone.
- CD Genomics' de novo assembly and structural-variation analysis services (research-use only) can support similar SV-based GWAS or trait dissection projects in other crops where sweetness, color or quality traits are major breeding targets.
Case 3 – Near T2T Mongolian cattle genome informs marbling and population selection
On the livestock side, a GigaScience data note reported a near telomere-to-telomere genome assembly of Mongolian cattle, a locally adapted Bos taurus breed valued for meat quality and stress tolerance.
The study:
- Generated a ~3.1 Gb assembly with only 56 contigs and a contig N50 above 110 Mb, representing a dramatic improvement in contiguity over the standard cattle reference.
- Used this new reference to anchor resequencing data from 332 individuals across 56 breeds, building a dense variant database with over 100 million SNPs plus indels.
- Identified candidate genomic regions and genes, particularly on chromosome 12, that are associated with beef marbling patterns, providing concrete targets for selection while also informing conservation of local genetic resources.
Breeding takeaway
- For cattle and other livestock, a near-T2T or T2T reference does more than tidy up the assembly – it anchors population-scale variation and reveals haplotypes that can be tracked in genomic selection indices.
- Traits like marbling, fertility or adaptation often involve complex structural variation and repetitive regions that draft genomes do not represent well. Those regions become accessible once a high-quality reference is available.
- CD Genomics can provide similar high-quality reference genome assembly and population resequencing support for research-oriented cattle and livestock breeding programs, again strictly for institutional research use and not for individual consumer testing.
How do you turn a T2T plant genome into routine breeding tools?
You turn a T2T or haplotype-resolved assembly into routine breeding tools by embedding it into training populations, marker strategies, and functional genomics projects rather than treating it as a one-off experiment.
Key elements:
- Training populations for genomic selection
  - Use the new reference to better characterize diversity, LD structure, and haplotype blocks in training sets.
  - Ensure that important structural variants and presence–absence variation are represented and tagged.
- Long-term marker strategies
  - Design markers that are stable over expected future reference updates.
  - Favor haplotype or SV-based markers around major loci, which tend to be more robust than single-SNP tags.
- Better integration with functional genomics and epigenomics
  - Align transcriptomics, chromatin accessibility, and epigenomic sequencing data to the more complete genome.
  - Use this information to prioritize which haplotypes and structural variants to promote or purge in breeding schemes.
In practice, the reference stops being a "project" and becomes the shared coordinate system across geneticists, breeders, bioinformaticians, and external partners.
FAQs on T2T and haplotype-resolved genomes for breeding
Do I really need a T2T genome, or is a high-quality draft enough?
For many day-to-day genomic selection and marker-assisted selection workflows, a high-quality curated reference is sufficient. Consider full T2T genome assembly when key trait loci sit in poorly resolved regions, when your genome is highly repetitive or polyploid, or when you are building long-term pan-genome and structural variant resources.
What species benefit most from haplotype-resolved genome assembly in crops and livestock?
Haplotype-resolved genome assembly in crops and livestock is especially valuable when genomes are highly heterozygous, polyploid, or structurally complex. Examples include many fruit trees, clonally propagated crops, polyploid cereals, and outbred animal breeds where consensus references can hide important allelic combinations.
Can I reuse my existing SNP array or GBS data with a new reference?
Yes. Most teams lift over existing markers to the new reference and re-annotate them. Some markers will need to be retired or replaced, but many can be reused. It is good practice to re-evaluate marker performance and trait linkage after the lift-over.
How long does a T2T or haplotype-resolved project take?
Timelines depend on genome size, DNA quality, data types, and QC criteria. The process typically includes DNA extraction, sequencing, assembly, polishing, phasing, and validation. When you work with a provider such as CD Genomics, ask for a clear breakdown of milestones and deliverables rather than a single headline duration. All services are for research use only and are not intended for individual or clinical decision-making.
How does a high-quality reference genome for breeding help downstream epigenomic and transcriptomic studies?
A high-quality or telomere-to-telomere plant genome improves mapping for epigenomic sequencing, methylation profiling, and transcriptomics, especially in repeat-rich regions and near centromeres. This leads to more reliable identification of regulatory elements and gene expression patterns, which helps prioritize targets for breeding and functional validation.
Ready to move from draft to T2T?
Draft references helped bring genomics into breeding, but many programs have now reached the ceiling of what fragmented assemblies can offer. T2T genome assembly and haplotype-resolved assemblies are practical tools to:
- Sharpen QTL and GWAS resolution.
- Reveal structural variants that matter for real traits.
- Enable haplotype-based breeding strategies and more stable marker systems.
If you recognize symptoms such as stalled fine mapping, inconsistent GWAS peaks, or difficulty transferring markers across germplasm, it may be time to evaluate an upgrade from a working draft to a high-quality reference genome for breeding, and eventually to T2T or haplotype-resolved assemblies for key founders.
Your next steps can be simple:
- Review your current reference and marker resources for gaps and pain points.
- Identify one high-value trait or population where a better reference would clearly change decisions.
- Discuss a scoped crop or livestock de novo genome assembly, T2T Genome Sequencing, or Haplotype-Resolved Genome Sequencing & Assembly project with CD Genomics, focused on that target.
CD Genomics provides these genome assembly and sequencing solutions strictly for research use only (RUO) and does not offer testing or interpretation services for individuals or personal healthcare decisions. A well-designed reference upgrade is a one-time infrastructure investment that supports every downstream dataset—genotyping, epigenomics, and functional studies—for many breeding cycles to come.
Figure 6. CD Genomics genome assembly services for breeding.
References
- Nurk, S., Koren, S., Rhie, A. et al. The complete sequence of a human genome. Science 376, 44–53 (2022).
- Rhie, A., McCarthy, S.A., Fedrigo, O. et al. Towards complete and error-free genome assemblies of all vertebrate species. Nature 592, 737–746 (2021).
- Bayer, P.E., Golicz, A.A., Scheben, A., Batley, J., Edwards, D. Plant pan-genomes are the new reference. Nature Plants 6, 914–920 (2020).
- Della Coletta, R., Qiu, Y., Ou, S., Hufford, M.B., Hirsch, C.N. How the pan-genome is changing crop genomics and improvement. Genome Biology 22, 3 (2021).
- Bevan, M.W., Uauy, C., Wulff, B.B.H. et al. Genomic innovation for crop improvement. Nature 543, 346–354 (2017).
- Wei, M., Huang, Y., Mo, C. et al. Telomere-to-telomere genome assembly of melon (Cucumis melo L. var. inodorus) provides a high-quality reference for meta-QTL analysis of important traits. Horticulture Research 10, uhad189 (2023).
- Lyu, X., Xia, Y., Wang, C. et al. Pan-genome analysis sheds light on structural variation-based dissection of agronomic traits in melon crops. Plant Physiology 193, 1330–1348 (2023).
- Su, R., Zhou, H., Yang, W. et al. Near telomere-to-telomere genome assembly of Mongolian cattle: implications for population genetic variation and beef quality. GigaScience 13, giae099 (2024).
- Li, G., Tang, L., He, Y. et al. The haplotype-resolved T2T reference genome highlights structural variation underlying agronomic traits of melon. Horticulture Research 10, uhad182 (2023).
- Wang, L., Ma, Y., Zhang, R. et al. Telomere-to-telomere and haplotype-resolved genome assembly of the Chinese cork oak (Quercus variabilis). Frontiers in Plant Science 14, 1290913 (2023).
- Maung, T.Z., Yoo, J.-M., Chu, S.-H., Kim, K.-W., Chung, I.-M., Park, Y.-J. Haplotype variations and evolutionary analysis of the granule-bound starch synthase I gene in the Korean World Rice Collection. Frontiers in Plant Science 12, 707237 (2021).