Genomic Selection in Plant and Animal Breeding: Principles, Models, and Real-World Applications


TL;DR – What this article covers and why it matters

Genomic selection is a breeding strategy that uses genome-wide DNA markers and statistical genomic prediction models to estimate breeding values before full field testing. By combining dense genotyping with high-quality phenotypes in a training population, breeders can predict genomic estimated breeding values (GEBV) for thousands of candidates and select earlier, faster, and with more confidence. This guide explains how genomic selection in plant breeding and animal breeding works in practice, how a genomic selection pipeline is built, how to choose between LC-WGS, GBS, and SNP genotyping for genomic selection projects, and which design choices most strongly affect prediction accuracy and genetic gain.

Key takeaways

  • Genomic selection uses genome-wide markers and genomic prediction models to rank candidates via GEBV and can be powered by LC-WGS, GBS, or SNP genotyping.
  • A well-designed training population and clear trait definitions drive most of the prediction accuracy.
  • LC-WGS, GBS, and SNP arrays each fill different niches as genotyping platforms for genomic selection.
  • Genomic selection complements, rather than replaces, GWAS and marker-assisted selection in modern breeding.

Figure 1. Genomic selection pipeline summarized: training population phenotyping and genotyping, genomic prediction modeling, GEBV-based selection, and enhanced genetic gain in crops and livestock.

Why Genomic Selection Is Changing Plant and Animal Breeding

Genomic selection is changing breeding because it links genome-wide marker data directly to complex traits such as yield, fertility, health, and resilience. It provides a way to increase genetic gain per year without multiplying trial size or cost.

Traditional phenotypic selection relies on multi-year, multi-location trials to evaluate breeding candidates. For many crops and livestock species, a single selection cycle can take five to ten years, especially when you include seed or stock multiplication and regulatory steps. In the meantime, disease pressures, climate conditions, and market demands continue to evolve.

Genomic selection shortens this feedback loop. Young plants or juvenile animals are genotyped early, and genomic prediction models convert their marker profiles into GEBV. Instead of waiting for complete field performance, you can make informed decisions much earlier in the breeding cycle.

Several dairy cattle breeding programs, for example, have reported higher rates of genetic gain per year and shorter generation intervals after adopting genomic selection compared with previous schemes based on pedigree and phenotype alone. Similar trends are now seen in maize, wheat, rice, and other crops, where genomic prediction improves the efficiency of variety development and line recycling.

From the conversations we have with breeding and R&D teams, the main pain points are:

  • Long breeding cycles and slow response to new diseases or climate stresses.
  • Traits controlled by many small-effect loci, for which marker-assisted selection (MAS) alone is not sufficient.
  • Limited budgets for extensive phenotyping across many environments and years.

A genomic selection project, when designed carefully, addresses these issues by:

  • Using genome-wide markers to capture the combined effect of many loci on complex traits.
  • Reusing each phenotyped plot or animal: once a record is in the training population, it informs predictions for many unphenotyped candidates instead of evaluating only itself.
  • Allowing earlier culling and selection decisions based on GEBV, not only on raw field performance.

What Is Genomic Selection in Plant and Animal Breeding? Key Concepts, GEBV, and Training Populations

Genomic selection is a breeding method in which selection candidates are chosen based on breeding values predicted from genome-wide marker data and a training population that has both genotypes and phenotypes. In practice, genomic selection uses genomic prediction models to turn SNP profiles into GEBV that guide which individuals contribute to the next generation.

To work efficiently with genomic selection in plant breeding and animal breeding, it helps to understand a few core concepts.

Genomic estimated breeding values (GEBV)

A GEBV is the predicted breeding value of an individual derived from its genome-wide marker profile and a genomic prediction model trained on related individuals with known phenotypes. It plays the same role as a traditional breeding value but incorporates much more detailed genetic information than pedigree alone.

Training and validation populations

The training population is the set of individuals that have both high-quality phenotypes for target traits and genome-wide genotypes. The genomic prediction model "learns" marker–trait relationships from this group. A larger and more representative training population generally leads to higher prediction accuracy.

Figure 2. Overall summary of the most commonly used models in genomic selection. (Budhlakoti N. et al. (2022) Frontiers in Genetics)

A validation population is used to test how well the genomic prediction model performs on new material. It may be a subset of the training population held out in cross-validation or a separate set of lines or animals that represent future selection candidates.

Genomic prediction models

A genomic prediction model is a statistical or machine learning model that links marker data to trait values. Common approaches include GBLUP, which models covariance among individuals through a genomic relationship matrix, and Bayesian methods, which place prior distributions on individual marker effects. More advanced models can also incorporate genotype-by-environment interactions when multi-environment data are available.
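
To make the idea of turning marker profiles into GEBV concrete, here is a minimal sketch of ridge regression on markers (often called RR-BLUP, a close relative of GBLUP). It is an illustration under stated assumptions, not a production model: the function names, the toy genotypes coded 0/1/2, and the fixed `shrinkage` value are all hypothetical, and real programs usually estimate the shrinkage from variance components with dedicated software.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_marker_effects(genotypes, phenotypes, shrinkage):
    """Estimate shrunken marker effects from a training population.

    genotypes: list of rows, one per individual, allele counts coded 0/1/2.
    shrinkage: assumed ridge penalty (> 0); larger values shrink effects more.
    """
    n, m = len(genotypes), len(genotypes[0])
    marker_means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    z = [[row[j] - marker_means[j] for j in range(m)] for row in genotypes]
    y_mean = sum(phenotypes) / n
    y_centered = [y - y_mean for y in phenotypes]
    # Solve the n x n system (Z Z' + shrinkage * I) alpha = y - mean(y),
    # then map back to marker effects with Z' alpha.
    k = [[sum(z[i][t] * z[j][t] for t in range(m)) + (shrinkage if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve_linear_system(k, y_centered)
    effects = [sum(z[i][j] * alpha[i] for i in range(n)) for j in range(m)]
    return marker_means, effects


def predict_gebv(model, genotypes):
    """GEBV expressed as a deviation from the training population mean."""
    marker_means, effects = model
    return [sum((g - mu) * e for g, mu, e in zip(row, marker_means, effects))
            for row in genotypes]


# Toy training population: 5 individuals, 3 markers (made-up values).
training_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2], [2, 1, 0]]
training_phenotypes = [4.1, 5.0, 6.2, 3.9, 6.0]
model = fit_marker_effects(training_genotypes, training_phenotypes, shrinkage=1.0)

# Unphenotyped candidates are ranked from their marker profiles alone.
candidates = {"cand_A": [2, 0, 0], "cand_B": [0, 2, 2]}
gebv = dict(zip(candidates, predict_gebv(model, list(candidates.values()))))
ranking = sorted(gebv, key=gebv.get, reverse=True)
```

Solving the system in the individual dimension rather than the marker dimension is a design choice that suits the usual situation of far more markers than phenotyped individuals.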

Genomic selection vs marker-assisted selection

Marker-assisted selection usually targets a small number of known QTL or major genes. Genomic selection, in contrast, uses thousands to millions of markers distributed across the genome, without needing to identify individual QTL beforehand.

A concise way to express "genomic selection vs marker-assisted selection" is to compare focus and applications:

Aspect | Marker-Assisted Selection (MAS) | Genomic Selection (GS)
Genetic target | Few large-effect QTL | Genome-wide small + large effects
Best suited for | Major disease resistance, single genes | Complex traits like yield, fertility, resilience
Prior knowledge | Requires mapped markers or QTL | Can start from dense markers alone
Main output | Marker-based decisions | GEBV for ranking and selection

In practice, many modern programs use both: MAS for specific major genes and genomic selection to optimize the polygenic background.

Inside a Genomic Selection Pipeline: From Genotyping to Genomic Prediction

A genomic selection pipeline can be broken down into a series of practical steps. This is also how we typically structure genomic selection project discussions with breeders, CRO partners, and R&D teams.

Figure 3. Basic schema of the genomic selection process. (Budhlakoti N. et al. (2022) Frontiers in Genetics)

Step 1 – Define traits and breeding goals

Before any genotyping is ordered, clarify:

  • Target traits (for example, grain yield, protein content, mastitis resistance, feed efficiency).
  • Target environments or regions and their key stresses.
  • Relative weights if you plan to use a selection index across traits.

Clear definitions reduce noise in phenotypic data and ensure that the genomic selection project supports your strategic breeding goals.

Step 2 – Build and phenotype the training population

The training population is the foundation of genomic prediction. Good practice includes:

  • Covering the genetic diversity that matters for your program, not only current elite material.
  • Including a sufficient number of individuals; hundreds are often considered a minimum, and thousands are preferred when budgets allow.
  • Using standardized, well-documented phenotyping protocols across locations and years.

Breeders who invest early in a well-designed training population usually see better GEBV accuracy and smoother expansion of genomic selection in later cycles.

Step 3 – Choose a genotyping strategy for genomic selection

Once the training population is defined, you select a genotyping strategy for both training individuals and future selection candidates. Common options include:

  • LC-WGS for Genomic Selection (low-coverage whole genome sequencing with imputation).
  • GBS / RAD-based Genotyping using reduced-representation libraries.
  • SNP Genotyping & Molecular Breeding using fixed SNP arrays or custom SNP panels.

A dedicated section below compares these approaches specifically for genomic selection projects.

Step 4 – Fit genomic prediction models and evaluate accuracy

After genotyping, genomic prediction models are fitted for each trait of interest. Practical points to consider:

  • Use cross-validation to estimate prediction accuracy inside the training population.
  • Track trait-specific metrics, such as the correlation between observed values and GEBV or the proportion of genetic variance explained.
  • Consider models that handle genotype-by-environment interactions when you have multi-environment trials.

Clear reporting of GEBV accuracy by trait helps breeders and decision-makers see where genomic selection will have the strongest impact.
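
The cross-validation step above can be sketched as follows. This is a minimal illustration, assuming a generic `fit`/`predict` pair; the stand-in model here (`fit_marker_average`, an average of single-marker regressions) is a hypothetical placeholder chosen only to keep the example short, and in practice you would plug in GBLUP, a Bayesian model, or whatever your pipeline uses.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 reproducibly and deal the indices into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(genotypes, phenotypes, fit, predict, k=5, seed=0):
    """Predict every individual while it is held out, then correlate
    the pooled predictions with the observed phenotypes."""
    n = len(phenotypes)
    predicted = [None] * n
    for fold in k_fold_indices(n, k, seed):
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        model = fit([genotypes[i] for i in train], [phenotypes[i] for i in train])
        for i, p in zip(fold, predict(model, [genotypes[i] for i in fold])):
            predicted[i] = p
    return pearson(phenotypes, predicted)


# Hypothetical stand-in model: average of single-marker regressions.
def fit_marker_average(genotypes, phenotypes):
    y_bar = mean(phenotypes)
    centers, slopes = [], []
    for j in range(len(genotypes[0])):
        col = [row[j] for row in genotypes]
        c = mean(col)
        sxx = sum((g - c) ** 2 for g in col)
        sxy = sum((g - c) * (y - y_bar) for g, y in zip(col, phenotypes))
        centers.append(c)
        slopes.append(sxy / sxx if sxx > 0 else 0.0)
    return y_bar, centers, slopes


def predict_marker_average(model, genotypes):
    y_bar, centers, slopes = model
    m = len(slopes)
    return [y_bar + sum(s * (g - c) for s, g, c in zip(slopes, row, centers)) / m
            for row in genotypes]


# Toy data (made up): one marker with a purely additive effect.
toy_genotypes = [[v] for v in [0, 1, 2, 0, 1, 2, 0, 1, 2, 1]]
toy_phenotypes = [1.0 + 2.0 * row[0] for row in toy_genotypes]
accuracy = cross_validate(toy_genotypes, toy_phenotypes,
                          fit_marker_average, predict_marker_average, k=5, seed=1)
```

Pooling held-out predictions before computing one correlation is one convention; averaging per-fold correlations is another, and the two can differ when folds are small. If close relatives end up split across folds, the estimate can be optimistic relative to prediction of genuinely new material.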

Step 5 – Select candidates based on GEBV and integrate into the breeding cycle

Once you have reliable GEBV, you can:

  • Rank selection candidates and choose parents based on GEBV or a multi-trait index.
  • Combine genomic selection decisions with field performance, disease scores, and quality assessments.
  • Shorten generation intervals by selecting at earlier stages rather than waiting for full phenotyping.

Over time, genomic selection becomes part of a recurrent selection cycle. Each generation provides new phenotype and genotype data that update the training population and improve genomic prediction models.
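
The link between earlier selection and genetic gain per year is usually summarized with the breeder's equation. The form below is the standard textbook expression rather than anything specific to this guide, and the symbols are introduced here for illustration:

```latex
\[
\Delta G_{\text{year}} = \frac{i \, r \, \sigma_A}{L}
\]
% i        : selection intensity (standardized selection differential)
% r        : selection accuracy, here the correlation between GEBV and true breeding value
% \sigma_A : additive genetic standard deviation of the trait
% L        : generation interval (breeding cycle length) in years
```

Read this way, genomic selection mainly acts on two terms: it shortens L by allowing selection before full phenotyping, and it can raise i because many more candidates can be genotyped than field-tested. Whether r goes up or down relative to phenotypic selection depends on the trait and the training population.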

Choosing Genotyping Strategies for Genomic Selection: LC-WGS, GBS, and SNP Panels

Genotyping choice is one of the most common questions when planning a new genomic selection project. The genotyping platform influences marker density, genome coverage, data flexibility, and cost per sample.

Overview of main genotyping options

  • Low-coverage whole genome sequencing (LC-WGS for Genomic Selection)

    LC-WGS generates shallow reads across the entire genome and then uses genotype imputation to recover dense SNP data. It provides near-whole-genome coverage and is well suited to genomic selection and GWAS in species that lack mature SNP chips or require high marker density.

  • GBS / RAD-based Genotyping

    Genotyping-by-sequencing (GBS) and related RAD-based methods sample a fraction of the genome using restriction enzymes. They produce many markers at moderate coverage and are widely used in crops where cost per sample and throughput are key constraints.

  • SNP Genotyping & Molecular Breeding (SNP arrays/panels)

    SNP arrays use fixed sets of markers selected for important germplasm. They are robust and scalable, making them ideal for routine genotyping in species with established commercial chips or custom-designed SNP panels.

LC-WGS vs GBS vs SNP Arrays for Genomic Selection Projects

A side-by-side comparison helps breeders and technical managers choose the right strategy:

Feature | LC-WGS (with imputation) | GBS / RAD-based | SNP arrays/panels
Genome coverage | Genome-wide | Subset of genome | Fixed marker set
Marker density | Very high after imputation | Moderate to high | Depends on chip design
Upfront setup | Moderate | Low to moderate | Higher if custom chip
Per-sample cost | Competitive at scale | Low | Low to moderate
Best fit | Genomic selection, GWAS, new or minor species | Early GS, diversity studies, population structure | Established species, large routine GS pipelines

LC-WGS and GBS data can support both genomic selection and genome-wide association studies, which is attractive when you want to combine marker discovery with genomic prediction. SNP panels are efficient when you already know which markers work well in your germplasm and mainly need fast, routine genotyping.

Scenario-based recommendations

From actual project discussions, a few typical scenarios emerge:

  • Large plant breeding program in a major crop

    If a robust commercial SNP chip exists, a SNP panel can be a practical starting point for genomic selection in plant breeding. When higher marker density or more flexibility is needed, LC-WGS for Genomic Selection becomes appealing.

  • Livestock or aquaculture program without a strong SNP array

    LC-WGS with imputation is often attractive because it does not rely on pre-defined chip content and can expand as reference panels grow.

  • Program in a minor or orphan species

    GBS / RAD-based Genotyping offers a practical route to genomic selection in species with limited genomic resources and tight budgets.

Our LC-WGS for Genomic Selection, GBS / RAD-based Genotyping, and SNP Genotyping & Molecular Breeding solutions are designed to support these scenarios and help align genotyping with both current and future project needs.

Real-World Applications of Genomic Selection in Crops and Livestock

Genomic selection is now used across numerous crops, livestock species, and aquaculture programs rather than being a purely theoretical concept.

Genomic selection in major crops

Breeding programs in cereals, oilseeds, and legumes use genomic selection to:

  • Improve grain yield and stability across environments.
  • Increase resistance to key fungal, bacterial, or viral diseases.
  • Maintain or upgrade quality traits while adjusting plant architecture, maturity, or stress tolerance.

Published multi-environment studies in maize and wheat show that genomic prediction models can capture a large share of genetic variance for yield and disease resistance, especially when training populations are well structured and represent target environments. Genomic selection in plant breeding allows breeders to discard weak lines earlier and reserve costly field trials for the most promising candidates.

Figure 4. Structure of simulated wheat breeding program running over 25 years. (Tessema B.B. et al. (2020) Frontiers in Genetics)

Genomic selection in livestock and aquaculture

In dairy cattle, genomic selection was initially used for milk yield, fat and protein content, fertility, and disease resistance. It has reduced reliance on long and expensive progeny testing schemes and has increased the rate of genetic gain. Similar strategies are now deployed in beef cattle, pigs, poultry, and aquaculture, targeting growth, feed efficiency, survival, and product quality.

Both high-density SNP panels and LC-WGS are used in genomic selection in animal breeding. The choice depends on available reference resources, commercial chip options, and long-term data strategy.

Multi-trait and index-based genomic selection

Most commercial breeding programs need to improve several traits at once. Genomic selection adapts naturally to this situation:

  • Genomic prediction models are built for each trait separately.
  • A selection index combines GEBV across traits using economic weights or strategic priorities.

Many programs start with a small set of high-priority traits for genomic selection and then expand the index as they gain confidence and collect more phenotypes.
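
A selection index of this kind can be sketched in a few lines. The example below is a minimal illustration, assuming per-trait GEBV are already available; the trait names, weights, and candidate IDs are hypothetical, and standardizing each trait before weighting is an assumption made here so that weights express relative priority rather than trait units.

```python
from statistics import mean, pstdev


def standardize(values):
    """Scale GEBV for one trait to mean 0 and standard deviation 1."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]


def selection_index(gebv_by_trait, weights):
    """Weighted sum of standardized GEBV, one index value per candidate."""
    scaled = {trait: standardize(gebv_by_trait[trait]) for trait in weights}
    n = len(next(iter(scaled.values())))
    return [sum(weights[t] * scaled[t][i] for t in weights) for i in range(n)]


def rank_candidates(ids, index_values, top_n):
    """Return the IDs of the top_n candidates by index value."""
    ordered = sorted(zip(ids, index_values), key=lambda pair: pair[1], reverse=True)
    return [cid for cid, _ in ordered[:top_n]]


# Hypothetical GEBV for three candidates and two traits.
candidate_ids = ["A", "B", "C"]
gebv_by_trait = {
    "yield": [1.0, 2.0, 3.0],
    "disease_resistance": [3.0, 2.0, 1.0],
}
# Hypothetical weights: yield counts twice as much as disease resistance.
weights = {"yield": 2.0, "disease_resistance": 1.0}

index_values = selection_index(gebv_by_trait, weights)
selected = rank_candidates(candidate_ids, index_values, top_n=2)
```

If your weights are economic values per trait unit, you would typically skip the standardization step and apply the weights to GEBV on their original scale.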

How genomic selection works with GWAS and MAS

Genomic selection does not replace GWAS or MAS; it complements them:

  • GWAS in plant breeding and livestock genetics identifies major QTL and candidate genes.
  • Marker-assisted selection tracks these major genes efficiently in early generations.
  • Genomic selection captures the remaining polygenic variance and optimizes the overall genetic background.

Linking genomic selection projects with existing GWAS and marker-assisted selection efforts can improve both efficiency and biological insight.

Design Tips, Limitations, and Common Pitfalls in Genomic Selection Projects

Well-designed genomic selection projects share several features. Below are practical design tips and realistic caveats based on published studies and project experience.

Practical design tips

  • Start with a small set of traits where prediction is likely to work

    Traits with moderate to high heritability and stable scoring protocols are good candidates for early genomic selection.

  • Invest in training population diversity and quality

    Ensure the training population covers the germplasm that matters for your future candidates. Include elite lines, key donors, and materials relevant to target markets or production systems.

  • Use proper replication and field design

    Genomic models can handle statistical noise only to a certain degree. Good design and replication remain essential for reliable trait values.

  • Plan data management early

    Align on data formats, sample naming, and metadata standards before genotyping. This reduces errors and speeds up downstream analysis and bioinformatics.
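
As one example of what early data management can look like, the sketch below reconciles sample IDs between a genotype file and a phenotype table before any model is fitted. The normalization rule (trim, upper-case, spaces to underscores) and the function names are hypothetical; the point is only that mismatches and duplicates are cheap to detect up front.

```python
from collections import Counter


def normalize_id(raw):
    """Apply one agreed naming rule to a raw sample ID (assumed rule)."""
    return raw.strip().upper().replace(" ", "_")


def reconcile_samples(genotype_ids, phenotype_ids):
    """Compare normalized IDs and report matches, orphans, and duplicates."""
    geno = [normalize_id(s) for s in genotype_ids]
    pheno = [normalize_id(s) for s in phenotype_ids]
    geno_set, pheno_set = set(geno), set(pheno)
    return {
        "matched": sorted(geno_set & pheno_set),
        "genotype_only": sorted(geno_set - pheno_set),
        "phenotype_only": sorted(pheno_set - geno_set),
        "duplicated_genotype_ids": sorted(
            s for s, count in Counter(geno).items() if count > 1
        ),
    }


# Made-up IDs showing typical inconsistencies between two data sources.
report = reconcile_samples(
    genotype_ids=[" line_001", "LINE_002", "Line 003", "LINE_002"],
    phenotype_ids=["LINE_001", "line_003", "LINE_004"],
)
```

Only the `matched` samples can enter the training population, so a long `genotype_only` or `phenotype_only` list is usually a sign that naming conventions drifted between the field and the lab.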

Limitations and what genomic selection cannot do

Setting realistic expectations is important:

  • Genomic selection does not remove the need for field trials. You still need well-designed trials to build and update the training population and confirm the performance of top candidates.
  • Prediction accuracy varies by trait and program. It depends on trait architecture, heritability, environment, and relatedness between training and target materials.
  • Genomic selection may be less accurate for new donors that are genetically distant from the training population. Additional data or specific donor-focused training sets may be necessary.
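
One simple way to flag candidates that may be too distant from the training population is to look at their genomic relationship to their closest training relatives. The sketch below uses a common marker-based relationship estimator (centered allele counts scaled by 2Σp(1−p)); the function names, the `top_k` choice, and the toy genotypes are hypothetical, and any threshold for "too distant" would need to be calibrated in your own germplasm.

```python
def allele_frequencies(genotypes):
    """Frequency of the counted allele at each marker (genotypes coded 0/1/2)."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]


def genomic_relationship(row_a, row_b, freqs):
    """Marker-based relationship between two individuals.

    Assumes monomorphic markers were filtered out, so the scale is > 0.
    """
    scale = 2 * sum(p * (1 - p) for p in freqs)
    cross = sum((a - 2 * p) * (b - 2 * p) for a, b, p in zip(row_a, row_b, freqs))
    return cross / scale


def closest_relatives_mean(candidate, training, freqs, top_k=3):
    """Mean relationship between a candidate and its top_k closest training relatives."""
    relationships = sorted(
        (genomic_relationship(candidate, t, freqs) for t in training), reverse=True
    )
    top = relationships[:top_k]
    return sum(top) / len(top)


# Toy training set (made up): 3 individuals, 2 markers.
training = [[0, 2], [2, 0], [1, 1]]
freqs = allele_frequencies(training)

# A candidate identical to one training line versus an intermediate one.
score_close = closest_relatives_mean([0, 2], training, freqs, top_k=1)
score_other = closest_relatives_mean([1, 1], training, freqs, top_k=1)
```

The sketch deliberately uses the closest relatives rather than the average over the whole training set: when allele frequencies are computed from that same training set, the average relationship of any candidate to all training individuals is zero by construction and carries no information.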

Common pitfalls and how to avoid them

Frequently observed issues include:

  • Training population that is too small

    Trying to run genomic selection with only a few dozen individuals usually leads to unstable models. Whenever possible, aim for at least a few hundred individuals in the training population.

  • Inconsistent phenotyping

    Changes in scoring scale, observer bias, or uncontrolled field heterogeneity will reduce GEBV accuracy. Establish clear phenotyping SOPs and training for technicians.

  • Ignoring genotype-by-environment interactions

    If target environments differ strongly, consider building separate models for each region or using models that include environmental covariates.

  • Choosing a genotyping platform without considering future needs

    Selecting the absolute lowest-cost option may be a false economy if it cannot support future GWAS, new traits, or new germplasm. Compare LC-WGS for Genomic Selection, GBS / RAD-based Genotyping, and SNP Genotyping & Molecular Breeding with your long-term goals in mind.

When to combine genomic selection with other tools

Genomic selection becomes even more powerful when combined with:

  • High-quality reference genomes, including T2T and haplotype-resolved assemblies, which improve marker placement and detection of structural variants.
  • Targeted GWAS and MAS for specific genes or QTL, where simple assays can deliver major gains.

If you plan to upgrade your reference resources, it is worth considering how T2T and haplotype-resolved genomes can support future genomic selection and GWAS projects.

How to Get Started with a Genomic Selection Project with CD Genomics

If genomic selection is new to your program, the first step is usually a structured discussion rather than an immediate sequencing order. A short checklist can help you prepare.

A simple "getting started" checklist

Before launching a genomic selection project, try to clarify:

  • Species and population: crop or animal, population structure, typical crossing schemes.
  • Target traits and environments: which traits matter most and in which production systems.
  • Existing data: historical phenotypes, genotypes, pedigrees, or previous GWAS results.
  • Sample numbers: approximate numbers for the training population and initial candidate sets.
  • Budget range and timelines: approximate constraints that will shape platform choice and design.

The clearer these points are, the easier it is to design a realistic genotyping and genomic prediction plan.

How our services support each step of the genomic selection pipeline

CD Genomics provides sequencing and analysis services that map directly to the genomic selection pipeline:

  • Genotyping for genomic selection
    • LC-WGS for Genomic Selection for dense, flexible, genome-wide markers.
    • GBS / RAD-based Genotyping when low cost per sample and high throughput are critical.
    • SNP Genotyping & Molecular Breeding when robust fixed panels are the best fit for your species.
  • Analysis and genomic prediction
    • GWAS in plant breeding or livestock to identify major loci when needed.
    • Genomic prediction model building, cross-validation, and reporting of GEBV accuracy.
    • Custom bioinformatics and data management workflows aligned with your breeding pipeline.

By aligning genotyping and bioinformatics with your breeding goals, you can move from conceptual interest in genomic selection to a concrete project with clear timelines and deliverables.

Figure 5. Genomic selection startup workflow with CD Genomics, from breeding goals to a tailored genotyping and prediction plan.

Call to action

If you are considering genomic selection in plant or animal breeding, the next step can be a brief consultation with our technical team. Share your current population structure, target traits, and available data, and we can help you decide whether LC-WGS for Genomic Selection, GBS / RAD-based Genotyping, or SNP Genotyping & Molecular Breeding is the best starting point, and what sample sizes and timelines are realistic for your first genomic selection cycle.

Frequently Asked Questions About Genomic Selection

1. How many individuals do I need in a training population for genomic selection?

There is no single magic number. Prediction accuracy generally improves as the training population grows, and gains tend to level off once it includes several hundred individuals that represent your target germplasm. For complex traits, many programs aim for 500–2,000 individuals in the training population when budget and logistics allow.

2. Do I need a reference genome to start genomic selection in my species?

A good reference genome is very helpful, especially for LC-WGS and GWAS, but it is not mandatory for every genomic selection project. GBS / RAD-based Genotyping can be used in species with limited genomic resources, and SNP arrays can work from known marker sets. Over time, however, investing in a better reference often pays off through improved marker placement and more robust genomic prediction.

3. Is genomic selection only useful for high-heritability traits?

Genomic selection tends to perform best for traits with moderate to high heritability, but it can still be useful for lower-heritability traits if you have large, well-phenotyped training populations. For very noisy traits, improving phenotyping and field design usually brings larger gains in prediction accuracy than changing the statistical model.
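
The trade-off between heritability and training population size can be explored with a widely used deterministic approximation from the genomic prediction literature (commonly attributed to Daetwyler and colleagues), in which expected accuracy is sqrt(N·h² / (N·h² + Me)). The sketch below is a rough planning aid only: the value of `n_effective_segments` (Me, the number of independent chromosome segments) is a hypothetical placeholder that depends heavily on species and population, and the formula ignores many real-world factors.

```python
from math import sqrt


def expected_accuracy(n_train, heritability, n_effective_segments):
    """Approximate expected prediction accuracy.

    n_train: number of phenotyped and genotyped training individuals (N).
    heritability: trait heritability on the scale of the records used (h^2).
    n_effective_segments: assumed number of independent chromosome segments (Me).
    """
    signal = n_train * heritability
    return sqrt(signal / (signal + n_effective_segments))


# Hypothetical scenario: Me = 500, comparing a low- and a high-heritability trait.
ASSUMED_ME = 500
scenarios = {
    (n, h2): expected_accuracy(n, h2, ASSUMED_ME)
    for n in (200, 500, 1000, 2000)
    for h2 in (0.2, 0.5)
}

# A low-heritability trait needs a larger training population to reach
# the same expected accuracy as a high-heritability trait.
low_h2_large_n = scenarios[(2000, 0.2)]
high_h2_small_n = scenarios[(500, 0.5)]
```

Because N and h² enter the formula only through their product, halving heritability has the same expected effect as halving the training population, which is one way to see why better phenotyping and larger training sets are partly interchangeable.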

4. How does genomic selection work in cross-pollinated vs self-pollinated crops?

The core principles are similar, but implementation details differ. In cross-pollinated crops, genomic selection often focuses on recurrent population improvement and hybrid prediction, where relatedness patterns and heterozygosity matter. In self-pollinated crops, genomic selection is frequently applied to lines at inbred or near-inbred stages, and training populations can be designed to mirror the line development pipeline. In both cases, the training population should represent the germplasm you intend to improve.

5. Can I start with a small pilot genomic selection project and scale up later?

Yes. Many programs start with a pilot phase, using a moderate training population and one or two key traits to test genomic selection in their own context. The pilot helps calibrate expectations for prediction accuracy and cost. When planning a pilot, it is still worth choosing genotyping platforms—such as LC-WGS for Genomic Selection or GBS / RAD-based Genotyping—that can scale to thousands of samples as your program grows.

References

  1. Fugeray-Scarbel, A., Bastien, C., Dupont-Nivet, M. et al. Why and how to switch to genomic selection: Lessons from plant and animal breeding experience. Front Genet 12, 629737 (2021).
  2. Budhlakoti, N., Kushwaha, A.K., Rai, A. et al. Genomic selection: A tool for accelerating the efficiency of molecular breeding for development of climate-resilient crops. Front Genet 13, 832153 (2022).
  3. Tessema, B.B., Liu, H., Sørensen, A.C. et al. Strategies using genomic selection to increase genetic gain in breeding programs for wheat. Front Genet 11, 578123 (2020).
  4. Meuwissen, T.H.E., Hayes, B.J., Goddard, M.E. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157, 1819–1829 (2001).
  5. Hayes, B.J., Bowman, P.J., Chamberlain, A.J. et al. Invited review: Genomic selection in dairy cattle: Progress and challenges. J Dairy Sci 92, 433–443 (2009).
  6. Voss-Fels, K.P., Cooper, M., Hayes, B.J. Accelerating crop genetic gains with genomic selection. Theor Appl Genet 132, 669–686 (2019).
  7. Biswas, P.S., Ahmed, M.M.E., Afrin, W. et al. Enhancing genetic gain through the application of genomic selection in developing irrigated rice for the favorable ecosystem in Bangladesh. Front Genet 14, 1083221 (2023).
For research purposes only, not intended for clinical diagnosis, treatment, or individual health assessments.