Population Evolution and GWAS in Carrot: Linking Domestication to Carotenoid Genes
Population evolution analysis and genome-wide association studies (GWAS) sit at the heart of modern crop population genomics. Combining the two, as in the carrot study discussed in this article, lets you track domestication, explain trait variation, and identify key carotenoid genes. The same concepts also guide the choice of population genetics sequencing services for breeding and crop improvement projects.
Rather than starting from methods, this article first highlights what such studies deliver for breeders and researchers. Then it walks through a practical study design for population evolution plus GWAS, followed by a detailed carrot case study that connects genomic signatures of domestication with high-carotenoid orange roots.
Why combine population evolution analysis with GWAS in crops?
Researchers rarely ask only "Where is the QTL?" anymore. They also want to know "How did this locus evolve, and what did selection do to it?"
Using population evolution analysis alongside GWAS allows you to:
- reconstruct domestication and breeding histories with genome-wide markers
- quantify genetic diversity and selection signatures across populations
- link those historical signals to loci that control yield, quality, and resilience
For R&D teams, this joined-up view makes it easier to prioritise causal genes, decide which alleles to track in breeding pipelines, and justify investment in follow-up experiments.
How should you design a population evolution + GWAS study?
Successful crop projects usually stand on three pillars: diverse genetic materials, reliable phenotyping, and a realistic sequencing strategy.
1. Building a genetically diverse panel
Your panel needs enough contrast to capture both evolutionary signals and trait associations. Diversity should reflect:
- Geography and environment – accessions from distinct regions, climates, or production systems
- Breeding history – wild relatives, multiple landrace groups, early cultivars, and modern elite lines
- Morphology and quality – variation in colour, plant architecture, yield components, and stress responses
As a practical rule of thumb, a panel of at least 200 accessions per species usually gives useful statistical power for both population analyses and GWAS. Larger panels help resolve fine-scale structure and detect smaller-effect loci.
2. Phenotyping agronomic traits with care
Genotypes are only half of the story. In parallel with DNA extraction, traits of interest should be measured with a clear protocol. Common examples include:
- plant height, biomass, and canopy architecture
- yield and sub-traits such as seed size or root weight
- resistance or tolerance to diseases and abiotic stresses
- quality metrics such as carotenoid levels, sugar content, or oil percentage
Whenever possible, score traits across multiple environments or years. This design helps separate genetic effects from environmental noise and makes GWAS results more robust.
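One simple way to combine scores from several environments is to centre each environment on its own mean before averaging per accession. The sketch below illustrates that idea only; the accession names, environment labels, and values are hypothetical, and a real project would more likely fit a mixed model to obtain adjusted means.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (accession, environment, trait score).
records = [
    ("ACC001", "site1_2022", 12.0),
    ("ACC001", "site2_2022", 14.0),
    ("ACC002", "site1_2022", 30.0),
    ("ACC002", "site2_2022", 34.0),
]

def environment_adjusted_means(rows):
    """Remove each environment's mean offset, then average per accession."""
    by_env = defaultdict(list)
    for _, env, value in rows:
        by_env[env].append(value)
    env_mean = {env: mean(vals) for env, vals in by_env.items()}
    grand_mean = mean(value for _, _, value in rows)

    adjusted = defaultdict(list)
    for acc, env, value in rows:
        # Shift the score so every environment shares the grand mean.
        adjusted[acc].append(value - env_mean[env] + grand_mean)
    return {acc: mean(vals) for acc, vals in adjusted.items()}

accession_means = environment_adjusted_means(records)
for acc, value in sorted(accession_means.items()):
    print(acc, round(value, 2))
```

The per-accession values produced this way can then serve as the phenotype column in an association analysis.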
3. Choosing sequencing type and depth
Most population evolution + GWAS projects now use whole-genome resequencing, especially for small or medium genomes. This approach offers dense SNP coverage and flexible downstream analysis.
Key practical points:
- Around 10× average coverage per accession usually balances cost and variant calling quality.
- For very large panels or very large genomes, depth can be reduced slightly if you have a good reference genome and plan to use genotype imputation.
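To see how depth and panel size drive sequencing volume, a rough yield estimate is simply genome size times depth times number of accessions. The sketch below assumes a hypothetical 450 Mb genome and a 200-accession panel; both numbers are illustrative and should be replaced with values for your own species and design.

```python
def required_gigabases(genome_size_mb, depth, n_accessions):
    """Raw yield needed, in gigabases: genome size x depth x panel size."""
    return genome_size_mb * 1e6 * depth * n_accessions / 1e9

# Illustrative assumptions only: 450 Mb genome, 200 accessions.
per_sample_gb = required_gigabases(450, 10, 1)       # one accession at 10x
panel_gb = required_gigabases(450, 10, 200)          # whole panel at 10x
low_depth_panel_gb = required_gigabases(450, 5, 200) # same panel at 5x

print(per_sample_gb, panel_gb, low_depth_panel_gb)
```

The estimate ignores duplicate reads, unmapped reads, and uneven coverage, so real projects usually budget some margin above it.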
The resulting SNP matrix underpins all later steps, from population structure analysis to association mapping.
What does a population evolution + GWAS workflow look like?
Overview of a population evolution and GWAS workflow in carrot. (Coe et al., 2023, Nat Plants)
Although details vary by project, a typical workflow follows these stages:
Step 1 – Whole-genome resequencing
- Generate short-read sequencing data for each accession.
- Align reads to a high-quality carrot reference genome or another relevant assembly.
Step 2 – Variant detection and cleaning
- Call SNPs and, if needed, small insertions, deletions, or structural variants.
- Apply strict filters on depth, quality, and missing data to obtain a final high-confidence variant set.
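As a minimal illustration of the cleaning step, the sketch below filters SNPs on missing-call rate and minor allele frequency. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with `None` for missing calls; the cut-offs of 20% missingness and 5% minor allele frequency are placeholder values, not thresholds taken from the carrot study.

```python
def passes_filters(genotypes, max_missing=0.2, min_maf=0.05):
    """Return True if a SNP passes missingness and minor-allele-frequency cut-offs.

    genotypes: list of alternate-allele counts (0, 1, 2) or None for missing.
    """
    called = [g for g in genotypes if g is not None]
    if not genotypes or not called:
        return False
    missing_rate = 1 - len(called) / len(genotypes)
    alt_freq = sum(called) / (2 * len(called))  # diploid: two alleles per call
    maf = min(alt_freq, 1 - alt_freq)
    return missing_rate <= max_missing and maf >= min_maf

# Hypothetical SNPs scored in ten accessions.
snps = {
    "snp_a": [0, 1, 2, 1, 0, 2, 1, 0, 1, 2],           # common, fully called
    "snp_b": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],           # monomorphic
    "snp_c": [0, None, None, None, 1, 2, 1, 0, 1, 2],  # 30% missing
}
kept = [name for name, calls in snps.items() if passes_filters(calls)]
print(kept)
```

In practice these filters are applied with dedicated variant-processing tools alongside depth and quality filters; the function only shows the logic.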
Step 3 – Population evolution analyses
With the filtered SNP matrix, you can investigate how the population has changed over time:
- Population structure and admixture to define major genetic groups
- Genetic diversity metrics such as nucleotide diversity and heterozygosity within and between groups
- Linkage disequilibrium (LD) decay to infer recombination patterns and past selection
- Gene flow and demographic history, including bottlenecks and expansions, using coalescent or site-frequency-spectrum approaches
- Integrated population evolution models that connect these signals into a narrative of domestication and breeding
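Differentiation between two of the groups defined in this step is often summarised with F_ST. The sketch below uses the Hudson estimator in its ratio-of-averages form, which is one common choice and not necessarily the estimator used in any particular study. The allele frequencies and sample sizes are hypothetical.

```python
def hudson_fst(freqs_pop1, freqs_pop2, n1, n2):
    """Hudson F_ST across SNPs, computed as a ratio of summed terms.

    freqs_pop1, freqs_pop2: per-SNP allele frequencies in each population.
    n1, n2: number of sampled chromosomes (2 x diploid individuals).
    """
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs_pop1, freqs_pop2):
        numerator += (
            (p1 - p2) ** 2
            - p1 * (1 - p1) / (n1 - 1)
            - p2 * (1 - p2) / (n2 - 1)
        )
        denominator += p1 * (1 - p2) + p2 * (1 - p1)
    return numerator / denominator if denominator > 0 else float("nan")

# Hypothetical frequencies at three SNPs, 20 diploid individuals per group.
wild = [0.50, 0.40, 0.60]
cultivated = [0.90, 0.10, 0.95]
fst = hudson_fst(wild, cultivated, n1=40, n2=40)
print(round(fst, 3))
```

Values near 0 indicate little differentiation and values near 1 indicate nearly fixed differences. Computing the statistic in sliding windows along the genome is one way to flag candidate regions under selection.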
Step 4 – GWAS for traits of interest
Because GWAS uses the same SNP calls, you can map traits without extra sequencing:
- Combine genotypes with phenotypes such as root colour or carotenoid content.
- Use mixed models that correct for population structure and relatedness.
- Detect genomic regions and specific loci associated with the traits.
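The core of an association test can be sketched as a regression of phenotype on genotype dosage at a single SNP. The example below is deliberately simplified: it is an ordinary least-squares fit with no correction for population structure or relatedness, so it is not a substitute for the mixed models recommended above. The dosages and phenotype values are hypothetical.

```python
import math

def single_marker_test(dosages, phenotypes):
    """Ordinary least-squares slope and t-statistic for one SNP.

    dosages: alternate-allele counts (0, 1, 2) per accession.
    phenotypes: trait values in the same accession order.
    Returns (beta, t) or None if the SNP is monomorphic or n < 3.
    """
    n = len(dosages)
    if n < 3:
        return None
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    beta = sxy / sxx
    rss = sum(
        (y - mean_y - beta * (x - mean_x)) ** 2
        for x, y in zip(dosages, phenotypes)
    )
    se = math.sqrt(rss / (n - 2) / sxx)
    t_stat = beta / se if se > 0 else math.inf
    return beta, t_stat

# Hypothetical data for six accessions.
dosages = [0, 0, 1, 1, 2, 2]
carotenoid = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
beta, t_stat = single_marker_test(dosages, carotenoid)
print(round(beta, 3), round(t_stat, 2))
```

A genome-wide scan repeats this kind of test for every SNP. Mixed-model software adds kinship and structure terms so that group differences are not mistaken for marker effects.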
Step 5 – Biological interpretation and breeding strategy
The real value comes when you overlay the two result sets:
- Regions under selection can be compared with GWAS peaks.
- Candidate genes supported by both analyses gain higher priority.
- Breeders can design marker-assisted selection or genomic selection schemes around these key loci.
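The overlay itself is a simple interval lookup: for each GWAS peak, check whether it falls inside, or close to, a region flagged by the selection scan. The sketch below assumes both result sets have already been reduced to coordinates. The chromosome names, positions, SNP identifiers, and the 100 kb flank are all hypothetical and do not come from the carrot study.

```python
def overlapping_candidates(sweep_regions, gwas_peaks, flank=0):
    """List GWAS peaks that fall within a selection region (plus optional flank).

    sweep_regions: list of (chromosome, start, end).
    gwas_peaks: list of (chromosome, position, snp_id).
    """
    hits = []
    for chrom, pos, snp_id in gwas_peaks:
        for region_chrom, start, end in sweep_regions:
            if chrom == region_chrom and start - flank <= pos <= end + flank:
                hits.append((snp_id, chrom, start, end))
                break  # report each peak once
    return hits

# Hypothetical coordinates for illustration.
sweep_regions = [
    ("chr3", 5_000_000, 5_400_000),
    ("chr7", 33_000_000, 33_600_000),
]
gwas_peaks = [
    ("chr3", 5_210_000, "snp_1"),
    ("chr2", 41_000_000, "snp_2"),
    ("chr7", 33_650_000, "snp_3"),
]
supported = overlapping_candidates(sweep_regions, gwas_peaks, flank=100_000)
print(supported)
```

The flank is a judgment call. A reasonable starting point is the distance over which LD decays in the population being studied.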
Case study: how carrot population genomics uncovered carotenoid genes
Title: Population genomics identifies genetic signatures of carrot domestication and improvement and uncovers the origin of high-carotenoid orange carrots
Journal: Nature Plants
Publication date: September 2023
DOI: 10.1038/s41477-023-01526-6
What biological question did the study address?
Carrot (Daucus carota L.) is a major vegetable crop and a leading source of provitamin A carotenoids in the human diet. Modern breeding has significantly improved both yield and nutritional quality over recent decades.
Global germplasm collections hold:
- wild carrots,
- multiple landrace groups,
- early cultivated types, and
- current improved varieties,
all showing rich diversity in colour, shape, and composition.
Historically, carrots are divided into two broad types:
- Eastern carrots, purple or yellow forms domesticated early in Asia Minor and Central Asia
- Western carrots, mainly orange, which emerged in Europe in the seventeenth century and later dominated worldwide production
The study set out to ask two linked questions:
- How did domestication and breeding reshape carrot genomes?
- Which loci and genes underlie the high carotenoid content of orange carrots?
To answer both, the authors combined large-scale population evolution analysis with GWAS.
What materials and sequencing design were used?
- Panel: 630 carrot accessions, spanning wild populations, different landrace clusters, early cultivars, and modern improved cultivars.
- Sequencing: Whole-genome resequencing for all 630 accessions.
- Variants: After stringent filtering, the team retained 25,375,112 high-quality SNPs across the carrot genome.
This dense variant set supported high-resolution analysis of both evolutionary history and trait associations.
How did the study characterise carrot population evolution?
Using the SNP matrix, the authors performed a series of complementary analyses.
Population structure
Clustering and admixture methods resolved five main genetic groups:
- wild carrots
- landrace group A
- landrace group B
- early cultivated carrots
- improved cultivated carrots
This structure reflects both geographic origins and breeding steps.
Population clustering of carrot germplasm. (Coe et al., 2023, Nat Plants)
Genetic diversity
Measures of nucleotide diversity showed that:
- wild carrots retained the highest genetic diversity
- cultivated groups had markedly lower diversity, revealing strong domestication and improvement bottlenecks
Genetic diversity and demographic analysis of carrot germplasm. (Coe et al., 2023, Nat Plants)
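Nucleotide diversity for a group can be estimated from allele counts as the average number of pairwise differences per site. The sketch below shows that calculation for biallelic SNPs in one genomic window. The allele counts and the window length are hypothetical and are not values from the carrot dataset.

```python
def nucleotide_diversity(alt_counts, n_chromosomes, n_sites):
    """Per-site nucleotide diversity (pi) for one group in one window.

    alt_counts: alternate-allele count at each variable site.
    n_chromosomes: sampled chromosomes (2 x diploid individuals).
    n_sites: total callable sites in the window, including invariant ones.
    """
    n = n_chromosomes
    total = 0.0
    for count in alt_counts:
        p = count / n
        # Unbiased expected pairwise difference at a biallelic site.
        total += 2 * p * (1 - p) * n / (n - 1)
    return total / n_sites

# Hypothetical window of 1,000 callable sites, 10 diploid samples per group.
pi_wild = nucleotide_diversity([10, 8, 12, 5], n_chromosomes=20, n_sites=1000)
pi_cultivated = nucleotide_diversity([2, 0, 1, 0], n_chromosomes=20, n_sites=1000)
print(pi_wild, pi_cultivated, pi_cultivated / pi_wild)
```

A cultivated-to-wild ratio well below 1 across much of the genome is the kind of pattern that is read as a bottleneck. Localised dips against that background can point to selective sweeps.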
Linkage disequilibrium and selection signatures
LD decay curves told a similar story:
- cultivated groups showed slower LD decay than wild carrots
- this pattern is consistent with reduced effective population size and past selective sweeps during domestication and breeding
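LD decay curves are built from pairwise r² values plotted against the physical distance between SNPs. The sketch below computes r² for a single SNP pair, assuming phased haplotypes coded as 0/1. The haplotypes are hypothetical, and real analyses typically rely on dedicated tools and often work from unphased genotypes.

```python
def r_squared(hap_a, hap_b):
    """Squared allelic correlation (r^2) between two SNPs from phased haplotypes.

    hap_a, hap_b: 0/1 alleles at SNP A and SNP B, one entry per chromosome.
    """
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")  # undefined when either SNP is monomorphic
    d = p_ab - p_a * p_b     # linkage disequilibrium coefficient D
    return d * d / denom

# Hypothetical haplotypes on eight chromosomes.
snp_1 = [0, 0, 1, 1, 0, 1, 1, 0]
snp_2 = [0, 0, 1, 1, 0, 1, 0, 0]   # mostly travels with snp_1
snp_3 = [0, 1, 0, 1, 0, 1, 0, 1]   # unrelated pattern
print(r_squared(snp_1, snp_2), r_squared(snp_1, snp_3))
```

Averaging r² within distance bins for each group gives the decay curve. A curve that stays high over longer distances corresponds to the slower decay described for cultivated groups.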
Demographic history and gene flow
Demographic modelling suggested:
- pronounced bottlenecks during domestication
- more recent population expansion driven by modern breeding programmes
Patterns of gene flow among groups clarified how different ancestral pools contributed to the cultivated gene pool.
Taken together, these analyses reconstructed the domestication and breeding history of carrot and highlighted genomic regions that have been shaped by selection.
How did GWAS identify carotenoid-related loci?
To link genotype with carotenoid traits, the authors conducted GWAS using the same SNP dataset.
- Phenotypes:
  - 601 accessions had detailed root colour and quality measurements.
  - 435 accessions had quantitative data for relative carotenoid content.
Mixed linear models, correcting for population structure, revealed significant association signals on chromosomes 2, 3, and 7 for carotenoid accumulation.
Candidate genes for taproot colour and carotenoid concentration identified by association mapping. (Coe et al., 2023, Nat Plants)
Within these regions, two genes stood out as strong candidates:
- DCAR_730022
- DCAR_310369
Variation at these loci correlated closely with carotenoid levels and helped explain the high carotenoid content of orange Western carrots.
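With roughly 25 million SNPs tested, the significance threshold has to account for multiple testing. The source does not state which threshold the authors applied, so the sketch below simply shows what a Bonferroni correction would give for the reported SNP count. The 0.05 family-wise error rate is an assumed convention.

```python
import math

n_snps = 25_375_112   # high-quality SNPs reported for the carrot panel
alpha = 0.05          # assumed family-wise error rate

bonferroni_p = alpha / n_snps
threshold_neg_log10 = -math.log10(bonferroni_p)

print(f"p-value cut-off: {bonferroni_p:.2e}")
print(f"-log10(p) line on a Manhattan plot: {threshold_neg_log10:.2f}")
```

Bonferroni treats every SNP as an independent test, which is conservative when neighbouring SNPs are in LD. Many studies therefore estimate the effective number of independent tests or use permutation or false-discovery-rate approaches instead.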
What do these findings mean for carrot breeding?
By merging results from population evolution analysis and GWAS, the study:
- provided a detailed picture of how domestication and modern breeding sculpted carrot genomes
- pinpointed markers and candidate genes associated with carotenoid accumulation
- shed light on the origin of the high-carotenoid orange carrot types now grown worldwide
For breeding programmes, the key loci and genes offer:
- molecular markers that can be used directly in marker-assisted selection
- targets for genomic selection, gene editing, or introgression, aimed at combining yield, stress tolerance, and nutritional quality in new cultivars
Key points for crop researchers and breeders
- Combining population evolution analysis and GWAS gives both historical context and genetic detail for complex traits.
- In carrot, resequencing 630 accessions exposed domestication bottlenecks, population structure, and strong selection signals.
- GWAS for root traits and carotenoid content highlighted major loci on chromosomes 2, 3, and 7 and candidate genes such as DCAR_730022 and DCAR_310369.
- These insights translate into practical tools for breeding higher-quality, nutrient-rich carrot varieties.
FAQs
Q1. What is population evolution analysis in crop genomics?
Population evolution analysis uses genome-wide variants from many accessions to infer population structure, diversity, linkage disequilibrium, gene flow, and demographic history. In crops, it helps reconstruct domestication routes, breeding steps, and genomic regions shaped by selection.
Q2. Why is it useful to combine population evolution with GWAS?
Population analyses show where selection acted and how populations changed. GWAS then links those regions to measurable traits. Together, they increase confidence in candidate genes and help separate causal loci from nearby passengers.
Q3. How many samples and how much sequencing depth do I need?
For most crops, panels with at least 200 accessions and roughly 10× whole-genome coverage per sample provide a sound starting point. Bigger panels and slightly higher depth improve power for rare variants and fine mapping, but costs also increase.
How CD Genomics supports population evolution and GWAS projects
If you are planning a project similar to the carrot example—combining population evolution analysis with GWAS—CD Genomics can support you from study design to data interpretation through our Population Genetics Sequencing & Bioinformatics Services.
1. Sequencing options tailored to population genomics
We provide multiple sequencing platforms that can be matched to different species, genome sizes, and budgets:
- Whole-genome resequencing and whole-exome sequencing for comprehensive variant discovery
- Reduced-representation approaches such as ddRAD-seq, 2b-RAD, RAD-seq, and GBS for cost-effective SNP genotyping in large panels
- Targeted panels and methylation assays for focusing on specific genomic regions or epigenetic marks in populations
- Specialised services for pan-genome projects, ancient DNA, and other challenging population genomics applications
These solutions make it easier to assemble high-quality variant datasets suitable for both population evolution analyses and GWAS.
2. Population genetics and GWAS bioinformatics
Our bioinformatics team offers modular pipelines that can be combined into a full workflow:
- Population structure and diversity
  - PCA, admixture modelling, phylogenetic trees, genetic differentiation, and runs of homozygosity
- Linkage disequilibrium, gene flow, and demographic history
  - LD decay analysis, F_ST estimation, migration modelling, and demographic reconstructions
- Association and mapping analysis
  - GWAS for quantitative and qualitative traits using appropriate mixed models
  - QTL mapping in family-based populations
  - BSA/BSA-seq for rapid identification of major loci in segregating populations
Where needed, we can also integrate transcriptomics, metabolomics, or other omics layers to support integrative trait mapping.
3. Study design advice and reporting
For each collaboration, we help you:
- refine panel composition, sample size, and sequencing depth according to your species and trait goals
- select the most suitable combination of sequencing platform and analysis modules
- receive publication-ready reports, including population structure plots, LD decay curves, demographic histories, and Manhattan/QQ plots
All CD Genomics solutions are provided for research use only (RUO) and are not intended for clinical diagnosis or individual medical decision-making.
If you are considering a population evolution or GWAS project, you can visit our Population Genetics Sequencing & Bioinformatics Services page to request a quote or discuss a customised study design with our team.
References
- Coe, K.M., Bostan, H., Rolling, W. et al. Population genomics identifies genetic signatures of carrot domestication and improvement and uncovers the origin of high-carotenoid orange carrots. Nature Plants 9, 1643–1658 (2023).
- Guo, Y. & Lu, F. The changing colour of carrot. Nature Plants 9, 1583–1584 (2023).
- Zhao, Y., Feng, M., Paudel, D. et al. Advances in genomics approaches shed light on crop domestication. Plants 10, 1571 (2021).
- Mir, R.R., Reynolds, M., Pinto, F. et al. High-throughput phenotyping for crop improvement in the genomics era. Plant Sci. 282, 60–72 (2019).
- Clauw, P., Ellis, T.J., Liu, H.-J. & Sasaki, E. Beyond the standard GWAS—A guide for plant biologists. Plant Cell Physiol. 66, 431–443 (2025).
- Yuan, X., Jiang, X., Zhang, M. et al. Integrative omics analysis elucidates the genetic basis underlying seed weight and oil content in soybean. Plant Cell 36, 2160–2175 (2024).