Choosing Between LC-WGS, WGS, GBS, and SNP Arrays for Breeding and Genomic Selection Projects


TL;DR – Quick Platform Picker for Genomic Selection

  • Choose LC-WGS when you run large breeding populations for genomic selection, have a reasonably good reference genome, and can support imputation.
  • Choose WGS when you need variant discovery, reference panels, or fine mapping rather than routine candidate genotyping.
  • Choose GBS when budget is tight, per-sample cost must be very low, and some missing data are acceptable.
  • Choose SNP arrays when you need robust, standardized marker sets across many years, locations, and partners.

This guide offers a practical genotyping technology comparison and explains how to choose between LC-WGS, WGS, GBS, and SNP arrays – including LC-WGS vs GBS and LC-WGS vs SNP arrays – for real breeding and genomic selection projects.

Figure 1. Overview of how WGS, LC-WGS, GBS, and SNP arrays support genomic selection and genetic gain.

Why Genotyping Platform Choice Matters for Breeding and Genomic Selection

Choosing the right genotyping platform is one of the most impactful decisions in a modern breeding program. The platform you pick determines how many candidates you can genotype, how dense your markers are, and how reliable your genomic predictions will be.

Genomic selection is a breeding approach that uses genome-wide markers and statistical models to predict the performance of candidates and select parents earlier and more efficiently. Marker density, error rate, imputation quality, and cost per sample all shape prediction accuracy and, ultimately, genetic gain per year and per dollar.

If marker density is too low, you may miss key QTLs and underestimate relationships among lines. If per-sample cost is too high, you may be forced to reduce population size and lose selection intensity. The optimal genotyping strategy balances marker quality with population size for each phase of your breeding pipeline.

How Genomic Selection Uses Markers to Rank Candidates

Figure 2. Conceptual workflow of genomic selection: a training population with both phenotypes and genome-wide markers is used to train prediction models, which then estimate genomic breeding values for new candidates based only on their genotypes. (Budhlakoti N. et al. (2022) Frontiers in Genetics)

In most plant and animal programs, genomic selection links three components:

  • Genotypes for a training population, obtained from LC-WGS, WGS, GBS, or SNP arrays
  • High-quality phenotypes for traits such as yield, disease resistance, or quality
  • A prediction model that relates genetic markers to those traits

Once the model is trained, you can apply a cost-effective genotyping platform—often LC-WGS, GBS, or SNP arrays—to thousands of candidates, compute genomic estimated breeding values (GEBVs), and select parents without waiting for full multi-year field data.
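The train-then-predict loop described above can be sketched in a few lines. This is a minimal illustration, assuming a simple ridge-regression marker model (in the spirit of rrBLUP) and fully simulated genotypes and phenotypes; the function names, the penalty value, and the population sizes are illustrative assumptions, not a recommended production model.

```python
import numpy as np

def fit_ridge_marker_effects(genotypes, phenotypes, ridge_penalty=1.0):
    """Estimate marker effects by ridge regression on centered genotypes."""
    X = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotypes, dtype=float)
    marker_means = X.mean(axis=0)
    Xc = X - marker_means
    y_mean = y.mean()
    # Solve (Xc'Xc + lambda * I) beta = Xc'(y - mean(y))
    lhs = Xc.T @ Xc + ridge_penalty * np.eye(X.shape[1])
    rhs = Xc.T @ (y - y_mean)
    effects = np.linalg.solve(lhs, rhs)
    return marker_means, y_mean, effects

def predict_gebv(genotypes, marker_means, y_mean, effects):
    """Predict breeding values for candidates from genotypes alone."""
    X = np.asarray(genotypes, dtype=float)
    return y_mean + (X - marker_means) @ effects

# Simulated data only: genotypes coded 0/1/2, effects and noise are made up.
rng = np.random.default_rng(0)
n_train, n_candidates, n_markers = 200, 50, 500
true_effects = rng.normal(0.0, 0.1, n_markers)
train_geno = rng.integers(0, 3, size=(n_train, n_markers))
cand_geno = rng.integers(0, 3, size=(n_candidates, n_markers))
train_pheno = train_geno @ true_effects + rng.normal(0.0, 1.0, n_train)

# Train on the phenotyped population, then rank unphenotyped candidates.
means, mu, effects = fit_ridge_marker_effects(
    train_geno, train_pheno, ridge_penalty=50.0  # hypothetical penalty
)
gebv = predict_gebv(cand_geno, means, mu, effects)
top_candidates = np.argsort(-gebv)[:10]
print("Top 10 candidate indices by GEBV:", top_candidates.tolist())
```

In a real program the penalty would typically be tuned or derived from variance components, and the model would be validated by cross-validation before it is used for selection.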

What Is the Best Genotyping Platform for Genomic Selection?

There is no single "best" genotyping platform for genomic selection. The right choice depends on marker density requirements, population size, reference genome quality, trait architecture, and budget. In practice, many successful programs combine WGS for discovery and reference panels with LC-WGS, GBS, or SNP arrays for routine genotyping.

What LC-WGS, WGS, GBS, and SNP Arrays Actually Do

This section explains what each platform does in simple terms so that PIs, breeding project leads, and technical decision-makers can compare them clearly.

Low-Coverage Whole-Genome Sequencing (LC-WGS)

LC-WGS is whole-genome sequencing at low depth (for example 0.5–4×) combined with genotype imputation to obtain dense SNP genotypes across the genome.

Instead of reading every base many times, LC-WGS "skims" the genome. The raw data are too thin to call genotypes at all sites confidently, so imputation methods fill in the gaps using a reference panel—often constructed from WGS of founders and key lines. When the reference panel matches the breeding germplasm, imputed LC-WGS can provide millions of SNPs per sample at moderate cost.
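A quick calculation shows why raw low-coverage data are "too thin". The sketch below assumes read depth per site follows a Poisson distribution, which is an idealization (real libraries are usually more uneven), and that reads at a heterozygous diploid site come from either allele with equal probability. The function names are illustrative.

```python
import math

def prob_site_uncovered(mean_depth):
    """Probability that a site receives zero reads under Poisson depth."""
    return math.exp(-mean_depth)

def prob_het_both_alleles_seen(mean_depth):
    """Probability that both alleles of a heterozygous diploid site are
    each sampled at least once. With Poisson depth and equal allele
    sampling, each allele's read count is Poisson(mean_depth / 2)."""
    p_allele_seen = 1.0 - math.exp(-mean_depth / 2.0)
    return p_allele_seen ** 2

# Depths taken from the 0.5-4x range mentioned above.
for depth in (0.5, 1.0, 2.0, 4.0):
    print(
        f"{depth:>4}x  uncovered: {prob_site_uncovered(depth):.3f}  "
        f"het fully observed: {prob_het_both_alleles_seen(depth):.3f}"
    )
```

Even at the upper end of the range, a meaningful share of heterozygous sites will show only one allele in the raw reads, which is the gap that imputation against a reference panel is meant to close.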

Typical use cases for LC-WGS in genomic selection include:

  • Large genomic selection cohorts in crops or livestock
  • Multi-trait selection in well-characterized germplasm pools
  • Programs that already invested in a high-quality reference genome and WGS-based reference panel

Whole-Genome Resequencing (WGS)

WGS is deep sequencing of the entire genome to capture most genetic variants present in an individual.

Because it aims to detect nearly all SNPs and structural variants, WGS has the highest information content and usually the highest cost per sample among the four platforms. In breeding and genomic selection, WGS is typically applied to a smaller set of lines rather than all candidates.

Common applications for a whole-genome resequencing service:

  • Variant discovery in founders, donors, and important cultivars
  • Building reference panels for imputation of LC-WGS or GBS data
  • Supporting high-quality reference genome assembly and annotation
  • Fine mapping and detailed haplotype analyses for complex traits

In short, WGS lays the foundation that makes LC-WGS and GBS more powerful.

Genotyping-by-Sequencing (GBS)

GBS is a reduced-representation sequencing approach that uses restriction enzymes and barcoded adapters to sample SNPs across the genome at low cost.

Figure 3. Example of a classic genotyping-by-sequencing (GBS) library design, showing how restriction enzymes, barcoded adapters, and PCR primers work together to generate multiplexed, reduced-representation libraries for SNP discovery and genotyping. (Elshire R.J. et al. (2011) PLOS ONE)

Instead of sequencing entire genomes, GBS digests DNA, selects fragments around enzyme cut sites, and sequences those fragments for many samples in parallel. The result is a sparse but genome-wide marker set, often with substantial missing data that can be partially imputed.
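The digest-and-select step can be illustrated with a toy in silico digest. The sketch assumes the enzyme ApeKI (recognition site GCWGC, where W is A or T, cutting after the first G), which is the enzyme used in the Elshire et al. protocol; it models a single strand only and ignores methylation sensitivity, adapters, and PCR bias. The function names and the size window are illustrative assumptions.

```python
import re

# Lookahead so that overlapping recognition sites are all found.
APEKI_SITE = re.compile(r"(?=GC[AT]GC)")

def cut_positions(sequence, site_pattern=APEKI_SITE, cut_offset=1):
    """Return 0-based cut coordinates (cut after the first base of each site)."""
    seq = sequence.upper()
    return [m.start() + cut_offset for m in site_pattern.finditer(seq)]

def digest(sequence):
    """Split a sequence into fragments at every cut position."""
    bounds = [0] + cut_positions(sequence) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]

def fragments_in_size_window(fragments, min_len, max_len):
    """Keep only fragments whose length falls inside the selected window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]

toy_sequence = "AAAGCAGCTTTTGCTGCAAA"
fragments = digest(toy_sequence)
print(fragments)
print(fragments_in_size_window(fragments, min_len=5, max_len=10))
```

Run against a reference genome, the same idea gives a rough estimate of how many loci a given enzyme and size window would sample, which helps explain why GBS marker sets are genome-wide but sparse.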

Typical uses for GBS in plant breeding populations include:

  • Very large early-generation populations where cost per sample must be minimal
  • Species where high-quality SNP arrays are not yet available
  • Diversity panels, pre-breeding, and broad population structure studies
  • Entry-level genomic selection pipelines in crops with less genomic infrastructure

SNP Arrays

SNP arrays are fixed microarrays that genotype a predefined set of SNPs at known genomic positions.

A chosen array design (for example, 40K, 100K, or 600K SNPs) is applied consistently across many samples. Each sample is hybridized to the array, and the platform reports genotypes at those specific markers. Arrays are known for:

  • High per-marker accuracy and reproducible genotypes
  • Standardized, well-understood lab workflows
  • Simple and mature quality control procedures

SNP arrays are widely used for:

  • Routine genomic selection in established crops
  • Seed purity testing and quality control
  • Long-term multi-environment trials where data comparability across years and collaborators is crucial

Many programs access this technology through an SNP array genotyping service for routine genomic selection and QC, or through a crop-specific SNP panel.

Cost, Marker Density, and Data Quality: Genotyping Technology Comparison

This section compares LC-WGS, WGS, GBS, and SNP arrays by cost per sample, marker density, and data quality in the context of breeding and genomic selection.

In practice, there is always a compromise between how many markers you want and how many samples you can afford. WGS provides the most information but limits sample size. GBS and arrays are cheaper per sample but provide fewer markers or more missing data. LC-WGS aims for a middle ground: dense markers at moderate cost, powered by good imputation.
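The markers-versus-samples compromise is easy to make concrete with a budget calculation. In the sketch below, every cost figure is a made-up placeholder in arbitrary units, chosen only to respect the relative ordering discussed in this guide; it should be replaced with real quotes. The one-off setup cost attached to LC-WGS stands in for building a reference panel and is likewise hypothetical.

```python
def affordable_samples(total_budget, cost_per_sample, fixed_setup_cost=0.0):
    """Number of samples that fit in the budget after any one-off setup cost."""
    if cost_per_sample <= 0:
        raise ValueError("cost_per_sample must be positive")
    remaining = total_budget - fixed_setup_cost
    return max(0, int(remaining // cost_per_sample))

# PLACEHOLDER relative costs (arbitrary units), not real prices.
hypothetical_platforms = {
    "WGS": {"cost_per_sample": 100.0, "fixed_setup_cost": 0.0},
    "LC-WGS": {"cost_per_sample": 20.0, "fixed_setup_cost": 5000.0},
    "SNP array": {"cost_per_sample": 15.0, "fixed_setup_cost": 0.0},
    "GBS": {"cost_per_sample": 10.0, "fixed_setup_cost": 0.0},
}

budget = 50000.0  # arbitrary units
for name, costs in hypothetical_platforms.items():
    n = affordable_samples(budget, **costs)
    print(f"{name:<10} {n} samples")
```

Pairing such counts with the expected marker density per platform gives a first-pass view of which option supports the population size your selection intensity requires.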

Side-by-Side Overview

The table below summarizes typical characteristics. Exact numbers vary by crop, genome size, and provider, but the relative positions are consistent with published benchmarking and field experience.

| Platform | Typical Marker Density | Relative Per-Sample Cost | Coverage / Missing Data | Imputation Dependence | Reference Genome Requirement | Best-Fit Use Cases |
| --- | --- | --- | --- | --- | --- | --- |
| Whole-Genome Resequencing | Millions of SNPs, genome-wide | Highest | High depth, low missingness | Optional (for phasing, rare) | Strong reference strongly preferred | Discovery, reference panels, fine mapping |
| LC-WGS | Millions after imputation | Medium | Low depth, high raw missingness | High (central to workflow) | Good reference + reference panel | Large GS cohorts, multi-trait GS, GWAS + GS |
| GBS | Tens–hundreds of thousands | Low | Variable depth, many missing calls | Moderate to high (optional) | Helpful but not mandatory | Early diversity, low-budget GS, GWAS in new species |
| SNP array | Tens–hundreds of thousands | Low to medium | High call rate, low missingness | Low to moderate (dense arrays) | Reference useful for annotation | Routine GS, QC, long-running multi-environment trials |

Data Quality, Missingness, and Imputation

LC-WGS and GBS rely more heavily on imputation than WGS or arrays. Imputation performs best when:

  • A representative reference panel exists (often built from WGS)
  • The reference genome is of sufficient quality
  • Sample preparation, library construction, and sequencing depth are consistent

Practical suggestions from real projects:

  • Run a pilot plate (for example 96–384 samples) at planned depth and library conditions before scaling to thousands.
  • Re-use and gradually expand your reference panel instead of building a new one every season.
  • Monitor basic QC statistics such as call rate, depth distribution, heterozygosity, and concordance with control lines genotyped by SNP array or WGS.
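The basic QC statistics listed above are straightforward to compute from a genotype matrix. The following sketch assumes genotypes are coded 0/1/2 (alternate allele count) with -1 for missing calls, and rows as samples; that encoding and the function names are assumptions for illustration.

```python
import numpy as np

MISSING = -1  # assumed code for a missing genotype call

def call_rate_per_sample(geno):
    """Fraction of markers with a non-missing call, per sample."""
    return (geno != MISSING).mean(axis=1)

def heterozygosity_per_sample(geno):
    """Fraction of called genotypes that are heterozygous, per sample."""
    called = geno != MISSING
    n_called = called.sum(axis=1)
    n_het = ((geno == 1) & called).sum(axis=1)
    out = np.full(geno.shape[0], np.nan)
    return np.divide(n_het, n_called, out=out, where=n_called > 0)

def concordance(calls_a, calls_b):
    """Share of identical calls where both platforms produced a genotype."""
    both_called = (calls_a != MISSING) & (calls_b != MISSING)
    if not both_called.any():
        return float("nan")
    return float((calls_a[both_called] == calls_b[both_called]).mean())

# Toy example: two samples, four markers.
geno = np.array([[0, 1, 2, -1],
                 [1, 1, -1, -1]])
control_array_calls = np.array([0, 1, 1, 2])  # same line as sample 0

print(call_rate_per_sample(geno))
print(heterozygosity_per_sample(geno))
print(concordance(geno[0], control_array_calls))
```

Tracking these values per plate during a pilot makes it easier to spot depth or library problems before scaling up.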

LC-WGS vs GBS: How to Choose for Large Breeding Populations

This section explains how to choose between LC-WGS and GBS for large plant breeding populations and genomic selection pipelines.

When LC-WGS Is the Better Choice

LC-WGS is usually preferable when:

  • You have a reasonably good reference genome and a WGS-based reference panel that represent your breeding germplasm.
  • You want relatively uniform genome-wide coverage, rather than being limited to restriction enzyme cut sites.
  • You are planning long-term genomic selection for large populations, where investing in an imputation pipeline pays off over many cycles.
  • You want to integrate LC-WGS genotypes with whole-genome resequencing data for fine mapping or rare allele tracking.

In these situations, LC-WGS often provides a better balance of marker density, cost, and flexibility than GBS.

When GBS Still Makes Sense

GBS remains a strong option in several scenarios:

  • Very large early-generation populations where the primary goal is rapid culling of weak lines and per-sample budget is extremely tight.
  • Species with large or complex genomes where WGS is difficult and GBS protocols are already well optimized.
  • Pre-breeding and diversity panels where broad structure and relatedness matter more than capturing every rare variant.
  • Programs without the resources or infrastructure to maintain a robust imputation reference panel for LC-WGS.

Here, GBS provides an accessible entry point into genomic selection and diversity analysis.

Practical Lessons from Implementation

Teams that have run both technologies often report that:

  • LC-WGS simplifies comparisons across multiple panels or related species because the approach is not tied to a specific restriction enzyme.
  • GBS data require careful handling of missingness, enzyme bias, and batch effects, especially when combining data from different years.
  • For new or orphan crops, a staged approach—GBS plus targeted WGS on key lines, then later LC-WGS—can be more realistic than launching directly into LC-WGS only.

The key is to match platform complexity to your current genomic infrastructure and bioinformatics capacity.

LC-WGS vs SNP Arrays: When Sequencing Outgrows Fixed SNP Panels

This section shows when it makes sense to move from SNP arrays to LC-WGS in genomic selection and breeding programs.

Strengths of SNP Arrays in Long-Running Programs

SNP arrays remain a very competitive choice when:

  • You already have a validated crop-specific SNP array with markers tied to key traits and QTLs.
  • You need highly standardized genotypes across many years, locations, companies, or research partners.
  • You value robust, well-established wet-lab workflows and straightforward QC metrics.
  • Your current genomic selection models already perform well using existing array data.

In these contexts, SNP array genotyping often carries the lowest risk and is the easiest option to implement for routine genomic selection and QC.

What LC-WGS Adds Beyond Fixed Arrays

LC-WGS brings additional capabilities that arrays cannot easily match:

  • Higher marker density across the genome without redesigning arrays as new loci are discovered.
  • Better coverage of rare alleles, local haplotypes, and structural variants not captured on fixed SNP panels.
  • Greater flexibility when you work with multiple germplasm pools or closely related species, where one array design might not be ideal for all.

For programs that already run arrays successfully, LC-WGS can be introduced first in experimental cohorts or specific trait projects to quantify the added value.

Transition Strategies from Arrays to LC-WGS

A gradual transition reduces technical and operational risk:

  1. Genotype a core panel (founders, key lines, checks) with both WGS and your current SNP array.
  2. Use these data to build a reference panel that connects array markers to dense WGS variation.
  3. Genotype new cohorts with LC-WGS, then impute both dense genome-wide markers and the legacy array marker set.
  4. Compare predictive performance and operational costs before deciding how fast to move more pipelines from arrays to LC-WGS.
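One way to check steps 3 and 4 is to measure how well the imputed legacy markers agree with real array calls. The sketch below assumes a subset of samples has both array genotypes (coded 0/1/2) and LC-WGS imputed dosages (continuous values from 0 to 2) at the same markers, and reports the squared correlation per marker; the function name and toy values are illustrative.

```python
import numpy as np

def per_marker_r2(array_genotypes, imputed_dosages):
    """Squared Pearson correlation between array calls and imputed dosages,
    computed per marker (columns). Markers with no variation return NaN."""
    t = np.asarray(array_genotypes, dtype=float)
    d = np.asarray(imputed_dosages, dtype=float)
    tc = t - t.mean(axis=0)
    dc = d - d.mean(axis=0)
    cov = (tc * dc).sum(axis=0)
    denom = np.sqrt((tc ** 2).sum(axis=0) * (dc ** 2).sum(axis=0))
    r = np.divide(cov, denom, out=np.full(t.shape[1], np.nan), where=denom > 0)
    return r ** 2

# Toy example: three samples, two markers (second marker is monomorphic).
array_calls = np.array([[0, 1],
                        [1, 1],
                        [2, 1]])
dosages = np.array([[0.1, 1.0],
                    [1.0, 0.9],
                    [1.9, 1.1]])

print(per_marker_r2(array_calls, dosages))
```

Markers with low agreement can then be excluded or flagged before the imputed legacy set is fed into existing prediction models.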

Working with a provider that offers WGS, LC-WGS, GBS, and SNP array services in a coordinated way makes this transition much smoother.


Matching Genotyping Platforms to Breeding and Genomic Selection Scenarios

This section shows how to match each genotyping platform to concrete breeding scenarios instead of treating them as isolated technologies.

Discovery, Pre-Breeding, and Reference Building

In the discovery and pre-breeding phase, the goal is to understand diversity, build solid genomic resources, and capture useful haplotypes:

  • Use a whole-genome resequencing service on founders, donors, and key lines to build a high-quality reference genome and a reference panel for imputation.
  • Apply GBS or LC-WGS to broader diversity panels to assess structure, identify useful alleles, and inform crossing strategies.
  • Where needed, connect this work to a GWAS and RAD/GBS analysis workflow for trait dissection.

Investments at this stage make all later LC-WGS, GBS, and SNP array work more informative.

Routine Genomic Selection in Large Pipelines

For routine genomic selection, the question is mainly how to maximize genetic gain per unit cost and time:

  • LC-WGS + imputation is often the best option when there is a strong reference genome, a solid reference panel, and many candidates each season.
  • GBS-centric strategies are attractive when arrays are unavailable, budgets are tight, or the species is less studied.
  • Array-centric strategies work well when high-quality SNP arrays and historical genotyping datasets already exist and need to be maintained.

A good plant breeding genotyping strategy treats platform choice, model design, and field trial structure as an integrated system rather than separate decisions.

GWAS, Diversity Panels, and Multi-Environment Trials

Requirements shift slightly for GWAS and diversity analysis:

  • WGS or high-density LC-WGS are preferred for fine mapping and for understanding local haplotypes around key loci.
  • GBS is well suited for broad diversity screens, especially in species without arrays.
  • SNP arrays are ideal for long-term multi-environment trials where data comparability across sites and years is the main constraint.

Pairing the right platform with a GWAS and genomic analysis service ensures that marker density, population design, and replication are adequate for your trait architecture.

A 4-Step Checklist for Choosing Your Genotyping Strategy

You can reuse this simple checklist whenever you start a new project:

  1. Define your population and species
    • Candidate numbers, ploidy, genome size, and known genomic resources.
  2. Clarify your primary goal
    • Discovery, routine genomic selection, GWAS, diversity, QC, or a combination.
  3. Assess your reference assets
    • Reference genome quality, availability of WGS data, and imputation capacity.
  4. Match to a platform or combination
    • Discovery and reference building: WGS + LC-WGS / GBS
    • Routine GS: LC-WGS, or GBS/SNP arrays where appropriate
    • Trait dissection and GWAS: WGS or high-density LC-WGS, sometimes plus arrays

This process keeps technology choices aligned with breeding goals rather than vendor catalogs.
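For teams that like to encode such checklists, the four steps can be written as a small rule-based helper. This is only a sketch: the field names, the goal labels, and especially the order in which the routine-GS rules are tested are assumptions layered on top of the guidance in this article, and real decisions will also weigh budget, ploidy, and genome size.

```python
from dataclasses import dataclass

@dataclass
class ProjectProfile:
    primary_goal: str  # "discovery", "routine_gs", "gwas", "diversity", or "qc"
    good_reference_genome: bool
    reference_panel_available: bool
    validated_array_available: bool

def suggest_platforms(profile):
    """Map a project profile to a first-pass platform suggestion."""
    goal = profile.primary_goal
    if goal == "discovery":
        return ["WGS", "LC-WGS or GBS"]
    if goal == "gwas":
        return ["WGS or high-density LC-WGS"]
    if goal == "diversity":
        return ["GBS or LC-WGS"]
    if goal == "qc":
        return ["SNP array"]
    if goal == "routine_gs":
        # Assumed priority: LC-WGS if reference assets exist, then arrays, then GBS.
        if profile.good_reference_genome and profile.reference_panel_available:
            return ["LC-WGS + imputation"]
        if profile.validated_array_available:
            return ["SNP array"]
        return ["GBS"]
    raise ValueError(f"Unknown primary goal: {goal!r}")

example = ProjectProfile(
    primary_goal="routine_gs",
    good_reference_genome=True,
    reference_panel_available=False,
    validated_array_available=True,
)
print(suggest_platforms(example))
```

The value of writing the rules down is less the output itself than the discussion it forces about which reference assets a program actually has.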

From Genotyping Strategy to Project Plan: How We Support Your Pipeline

This section describes what a typical end-to-end genotyping project looks like when you move from concept to execution.

What to Expect When Starting a Project

A typical project with CD Genomics as an integrated sequencing and bioinformatics provider includes:

  • A short technical consultation to review species, population size, traits, and timelines
  • Selection of the best mix of LC-WGS, whole-genome resequencing, GBS, and SNP array genotyping
  • Standardized QC reporting (read metrics, call rate, depth distribution, sample identity checks)
  • Delivery of variant calls and genotype matrices prepared for genomic prediction, GWAS, or diversity analysis

Throughout the process, unexpected issues—such as sample mix-ups or unusual relatedness patterns—can be detected and discussed before they impact field seasons.

Practical Workflow Tips from Real Projects

Figure 4. CD Genomics as an integrated provider of WGS, LC-WGS, GBS, and SNP array services for breeding and genomic selection projects.

Some recurring best practices:

  • Pilot before scaling. Test the planned platform and depth on a manageable subset before genotyping thousands of candidates.
  • Treat reference panels as strategic assets. Re-use and expand WGS-based reference panels instead of starting from scratch each time.
  • Align lab and field calendars. Build sufficient buffer into your project plan so that genotyping data arrive in time for crossing and selection decisions.

These practical steps often protect more value than a small difference in cost per sample.

Call to Action: Get a Platform Recommendation for Your Next Season

If you are still weighing LC-WGS vs GBS vs SNP arrays for your next season, a project-specific review is usually the fastest way to a clear answer.

By sharing your crop, target traits, population size, existing data (arrays, GBS, WGS), and budget range with CD Genomics, you can receive:

  • A tailored genotyping technology comparison for your program
  • One or two recommended platform combinations aligned with your objectives
  • A clear quote and project outline that fits your breeding calendar

This turns technology confusion into a concrete, actionable genotyping plan.

Key Takeaways: LC-WGS, WGS, GBS, and SNP Arrays

  • LC-WGS often provides the best balance of marker density and cost for large genomic selection cohorts when a good reference genome and reference panel are available.
  • WGS is mainly used for variant discovery, reference panel construction, and fine mapping—not for routine genotyping of all candidates.
  • GBS is a cost-effective choice for very large populations, pre-breeding, and crops where SNP arrays are not yet established.
  • SNP arrays remain robust and easy to manage for long-running breeding programs, QC workflows, and multi-environment trials.
  • Most successful breeding programs combine at least two platforms over time (for example, WGS + LC-WGS or WGS + GBS + arrays) rather than relying on a single technology.

FAQs: Common Questions About LC-WGS, WGS, GBS, and SNP Arrays

Q1. Do I need a reference genome to use LC-WGS or GBS for genomic selection?

A high-quality reference genome is strongly recommended for LC-WGS because imputation accuracy depends on how well reference haplotypes represent your breeding germplasm. GBS can be used without a reference, but having one improves alignment, variant calling, and GWAS interpretation.

Q2. Is LC-WGS accurate enough for genomic selection compared with SNP arrays?

When designed properly—with adequate coverage, a representative reference panel, and a well-tuned imputation pipeline—LC-WGS can provide genomic prediction accuracy comparable to or higher than dense SNP arrays in many crops. Accuracy depends on the combination of marker density, training population size, and trait architecture rather than the platform alone.

Q3. Which is cheaper for large breeding populations, LC-WGS or GBS?

On a per-sample basis, GBS is often slightly cheaper than LC-WGS, especially at high multiplex levels. However, LC-WGS can deliver denser markers and may reduce other costs (such as repeated panel redesign) over time. The most economical option depends on your required marker density, available reference resources, and the number of lines genotyped each season.

Q4. How many SNP markers do I need for genomic selection in crops?

There is no universal threshold. Many crop programs obtain useful prediction accuracy with tens of thousands of well-distributed markers, while some complex traits and diverse germplasm panels benefit from hundreds of thousands of SNPs. Instead of chasing an absolute number, aim for sufficient coverage of the genome, a well-designed training population, and consistent genotyping quality.

Q5. Can I mix platforms across cycles or panels in the same breeding program?

Yes. Many programs successfully combine WGS for reference panels, LC-WGS or GBS for routine cohorts, and SNP arrays for specific QC or legacy datasets. The important step is to harmonize data across platforms—usually by imputing onto a common marker set—so that genomic prediction models see consistent genotypes over time.
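The final alignment step of such harmonization can be sketched as follows. The example covers only the restriction of two genotype matrices to their shared markers in a consistent column order; it does not perform imputation, and it assumes marker IDs, allele coding, and strand already match between platforms. The function name and marker IDs are illustrative.

```python
import numpy as np

def align_to_common_markers(markers_a, geno_a, markers_b, geno_b):
    """Subset two genotype matrices (samples x markers) to shared markers,
    returned in the column order of the first dataset."""
    index_b = {marker: j for j, marker in enumerate(markers_b)}
    cols_a = [i for i, marker in enumerate(markers_a) if marker in index_b]
    shared = [markers_a[i] for i in cols_a]
    cols_b = [index_b[marker] for marker in shared]
    return shared, geno_a[:, cols_a], geno_b[:, cols_b]

# Toy example: an array dataset and an imputed LC-WGS dataset.
array_markers = ["snp1", "snp2", "snp3"]
array_geno = np.array([[0, 1, 2],
                       [2, 1, 0]])
lcwgs_markers = ["snp3", "snp9", "snp1"]
lcwgs_geno = np.array([[2, 0, 0],
                       [1, 1, 2]])

shared, array_sub, lcwgs_sub = align_to_common_markers(
    array_markers, array_geno, lcwgs_markers, lcwgs_geno
)
combined = np.vstack([array_sub, lcwgs_sub])
print(shared)
print(combined)
```

Once both datasets share the same columns, they can be stacked into one matrix for a single prediction model, subject to the usual checks for batch effects between platforms.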

References

  1. Meuwissen, T.H.E., Hayes, B.J., Goddard, M.E. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157(4), 1819–1829 (2001).
  2. Crossa, J., Pérez-Rodríguez, P., Cuevas, J. et al. Genomic selection in plant breeding: methods, models, and perspectives. Trends in Plant Science 22(11), 961–975 (2017).
  3. Budhlakoti, N., Kushwaha, A.K., Rai, A. et al. Genomic selection: a tool for accelerating the efficiency of molecular breeding for development of climate-resilient crops. Frontiers in Genetics 13, 832153 (2022).
  4. Rasheed, A., Hao, Y., Xia, X. et al. Crop breeding chips and genotyping platforms: progress, challenges, and perspectives. Molecular Plant 10(8), 1047–1064 (2017).
  5. Elshire, R.J., Glaubitz, J.C., Sun, Q. et al. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLOS ONE 6(5), e19379 (2011).
  6. Sthapit, S.R., Crain, J., Larson, S. et al. A low-coverage skim-sequencing and imputation pipeline for genomic selection. The Plant Genome 18(4), e70139 (2025).
For research purposes only, not intended for clinical diagnosis, treatment, or individual health assessments.