LD-Based Ne vs PSMC for Population Dynamics: When to Use Which
For recent demography, should you trust LD-based Ne from many genomes or a PSMC curve from one deep genome? Choosing the wrong path can burn budget and mislead reviewers—especially when coverage is uneven, reference quality is mixed, or sample size is tight. This practical guide shows how to match LD Ne and PSMC to your data and time window so your effective population size estimates stand up to scrutiny. We'll also flag common artifacts and simple cross-checks you can add to your protocol.
The Short Answer: Where Each Method Shines
LD-based Ne (from genome-wide linkage disequilibrium across individuals).
Best for recent Ne—on the order of tens to a few hundred generations—when you have breadth (many unrelated samples) from SNP arrays or whole-genome sequencing (WGS). LD among unlinked markers captures recent drift; with sensible minor-allele frequency (MAF) filtering and relatedness QC, you get fast, population-level estimates.
PSMC (Pairwise Sequentially Markovian Coalescent on one deep genome).
Best for older history—hundreds to thousands of generations—when you have a single, high-quality genome. PSMC recovers broad size changes over long time scales but lacks resolution very close to the present. Deep, uniform coverage and stringent masking are crucial to avoid spurious signals.
Need implementation help? Our Linkage Disequilibrium Analysis service handles LD pipelines (MAF thresholds, LD pruning, recombination maps), while Population Evolution Analysis delivers PSMC with best-practice masking and reviewer-ready figures.
Resolution vs Data Needs: A Head-to-Head Comparison
Time resolution and the questions you can answer
- LD-based Ne is tuned to recent demographic history: LD between loosely linked or unlinked markers reflects recent effective size, while LD between tightly linked markers carries signal from further back. With dense markers and an appropriate recombination map, LD-based approaches can recover trajectories over roughly 1–100+ generations, depending on species and data type. In humans, genome-wide LD has long shown that recent effective population size is far below census numbers.
- PSMC captures deep-time changes in Ne by modeling coalescent events along a single diploid genome. It works well for mid-to-ancient epochs but is weak near the present, where few recombination breakpoints exist to inform the model.
Simulated demographic histories comparing classic PSMC with Beta-PSMC; Beta-PSMC resolves finer fluctuations and improves recent-epoch inference using a single deep genome (Liu J. et al. (2022) BMC Genomics).
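The link between LD, recombination distance, and time window can be sketched in a few lines. This is a minimal illustration, not the procedure of any specific tool: it assumes Sved's approximation E[r²] ≈ 1 / (1 + 4·Ne·c) and the common rule of thumb that LD at recombination distance c (in Morgans) reflects Ne roughly 1 / (2c) generations ago. The function names, the simple 1/n sampling correction, and the example values are hypothetical.

```python
def generations_ago(c_morgans: float) -> float:
    """Approximate age (in generations) reflected by LD at recombination distance c."""
    return 1.0 / (2.0 * c_morgans)


def ne_from_r2(mean_r2: float, c_morgans: float, n_haplotypes: int) -> float:
    """Invert E[r^2] ~ 1 / (1 + 4*Ne*c) after subtracting ~1/n sampling noise.

    The 1/n correction is a simplifying assumption; published tools use
    more refined corrections (phasing, mutation, recombination model).
    """
    r2_adj = mean_r2 - 1.0 / n_haplotypes
    if r2_adj <= 0:
        raise ValueError("adjusted r^2 must be positive; the sample may be too small")
    return (1.0 / r2_adj - 1.0) / (4.0 * c_morgans)


if __name__ == "__main__":
    # Hypothetical (distance in Morgans, mean r^2) bins, for illustration only.
    bins = [(0.05, 0.012), (0.01, 0.030), (0.001, 0.150)]
    for c, r2 in bins:
        ne = ne_from_r2(r2, c, n_haplotypes=200)
        print(f"c = {c} M -> ~{generations_ago(c):.0f} generations ago, Ne ~ {ne:.0f}")
```

Under these assumptions, wider recombination distances map to more recent generations, which is why marker density and the recombination map set how far back an LD-based trajectory can reach.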
Inputs, QC, and typical failure modes
- LD-based Ne
- Inputs: Many unrelated individuals (arrays or WGS). Remove close relatives; balance sampling across groups; harmonize callability across batches.
- QC levers:
- Apply a MAF floor (often 0.02–0.05) to reduce the upward bias that rare alleles can induce in r²-based estimators.
- Prune linked SNPs and use a suitable recombination map to account for residual linkage; physically linked loci can bias estimates downward if left in.
- Failure modes: Small sample sizes inflate variance; very large true Ne with sparse markers yields imprecise estimates; structure and batch effects can inflate LD if not modeled.
- PSMC
- Inputs: One deep, high-quality genome; strict masks for low complexity, repeats, and low mappability; stable mutation rate assumption and a sensible generation time.
- Failure modes: Very recent bins are unreliable; poor masks or uneven coverage can create artifacts; population structure can masquerade as size change if you assume panmixia.
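The MAF floor and r² step from the LD-based QC list can be sketched as follows. This is a toy, standard-library-only illustration under stated assumptions: genotypes are coded 0/1/2, r² is taken as the squared Pearson correlation of genotype dosages (a common proxy for unphased data), and the function names and data are hypothetical. A production pipeline would use a dedicated tool and handle missing genotypes.

```python
from itertools import combinations


def maf(genotypes: list[int]) -> float:
    """Minor-allele frequency from diploid genotypes coded as 0/1/2 allele counts."""
    p = sum(genotypes) / (2 * len(genotypes))
    return min(p, 1.0 - p)


def r2(x: list[int], y: list[int]) -> float:
    """Squared Pearson correlation between two dosage vectors (both must vary)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def mean_r2_after_maf_filter(snps: dict[str, list[int]], maf_floor: float = 0.05) -> float:
    """Drop SNPs below the MAF floor (and monomorphic ones), then average pairwise r^2."""
    kept = [g for g in snps.values() if maf(g) >= maf_floor and maf(g) > 0]
    pairs = list(combinations(kept, 2))
    if not pairs:
        raise ValueError("fewer than two SNPs pass the MAF filter")
    return sum(r2(x, y) for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical genotypes for six individuals at three SNPs.
    toy = {
        "snp1": [0, 1, 2, 1, 0, 2],
        "snp2": [0, 1, 2, 2, 0, 1],
        "rare": [0, 0, 0, 0, 0, 0],  # removed by the filter
    }
    print(mean_r2_after_maf_filter(toy, maf_floor=0.05))
```

Rerunning the same function at two or three MAF floors is a cheap way to show reviewers how sensitive the mean r², and therefore Ne, is to the threshold.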
Robustness to biological confounders
- LD-based Ne tends to be largely unaffected by selection when estimated from inter-marker LD across many neutral loci. Use this stability to anchor the near-present.
Example of LD-based recent Ne trajectories inferred from genome-wide SNP data using SNeP in multiple sheep breeds (Barbato M. et al. (2015) Frontiers in Genetics).
- PSMC and related SMC methods are sensitive to structure: a structured population can look like a bottleneck or expansion if structure is ignored. Plan structure-aware checks when interpreting PSMC.
Historical Ne trajectories inferred from LD (NeLD), Relate, and MSMC across generations; simulations and human data indicate LD-based estimates remain largely unaffected by selection (Novo I. et al. (2022) PLOS Genetics).
Decision Map: Pick by Scenario (and Budget)
Use these real-world patterns to choose quickly, then tailor parameters.
- Many samples, moderate coverage; need contemporary trends (human cohorts, livestock panels, tree crops).
Go with LD-based Ne. You'll estimate recent effective population size with confidence intervals in a fraction of the time a deep-genome pipeline takes. Mind your MAF filter, relatedness pruning, and cross-population callability. If your project relies on arrays or RAD-seq in non-model species, LD-based Ne still performs well with careful SNP filtering.
- Only one excellent genome in a non-model species; want a deep-time backbone.
Choose PSMC to outline long-term size changes. Be explicit about masks, mutation rate, and generation time, and avoid over-interpreting the youngest time bins.
- Need both recent and ancient views (conservation reintroductions, domestication timelines).
Adopt a hybrid plan: LD-based Ne (or IBDNe) for the last 50–200 generations and PSMC for older epochs. Compare trajectories where windows overlap; disagreement often signals structure, selection, or QC issues that merit deeper analysis.
Ancestry-specific Ne trajectories with 95% confidence bands in admixed American populations, illustrating recent bottlenecks and rebounds from IBD-based inference (Browning S.R. et al. (2018) PLOS Genetics).
- Concerned about structure or admixture.
If sampling spans subpopulations or admixture is suspected, complement PSMC with methods that make structure explicit (e.g., allele-frequency graphs or f-statistics) and rely on LD-based Ne for near-present size. Unmodeled structure otherwise risks creating illusory PSMC peaks.
MSMC-IM extends MSMC by fitting time-dependent migration, clarifying separation histories under structure (Wang K. et al. (2020) PLOS Genetics).
Not sure which lane you're in? Share your cohort specs and we'll provide a Ne feasibility note through our Linkage Disequilibrium Analysis or a broader plan via Population Evolution Analysis.
Study Design & Reporting Checklist (That Reviewers Appreciate)
Sampling & relatedness (LD-based Ne).
Aim for unrelated individuals; screen with KING/IBD. Balance per-population counts so allele frequency spectra are comparable. Document how you handled duplicates, close kin, and batch effects.
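One way to document relatedness handling is a small, reproducible pruning step. The sketch below assumes you already have pairwise kinship coefficients (for example, exported from KING) and uses the widely quoted cutoff of 0.0884 to flag second-degree or closer pairs. The greedy strategy, the function name, and the sample IDs are illustrative assumptions, not a prescribed protocol.

```python
def prune_relatives(kinship: dict[tuple[str, str], float], cutoff: float = 0.0884) -> set[str]:
    """Return sample IDs to drop so that no remaining pair has kinship >= cutoff.

    Greedy heuristic: repeatedly drop the sample involved in the most
    related pairs; ties are broken alphabetically for reproducibility.
    """
    related = {pair for pair, phi in kinship.items() if phi >= cutoff}
    dropped: set[str] = set()
    while related:
        counts: dict[str, int] = {}
        for a, b in related:
            counts[a] = counts.get(a, 0) + 1
            counts[b] = counts.get(b, 0) + 1
        worst = max(sorted(counts), key=lambda s: counts[s])
        dropped.add(worst)
        related = {pair for pair in related if worst not in pair}
    return dropped


if __name__ == "__main__":
    # Hypothetical pairwise kinship estimates.
    pairs = {("A", "B"): 0.25, ("A", "C"): 0.12, ("D", "E"): 0.02}
    print(sorted(prune_relatives(pairs)))
```

Dropping the most-connected sample first usually keeps more individuals than removing one member of every related pair at random; reporting the cutoff and the number of samples removed covers what reviewers typically ask about.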
Filtering choices (LD-based Ne).
Declare your MAF threshold and LD pruning strategy; these directly affect r² and thus Ne. Provide recombination map details and justify choices for your species and platform.
Coverage, masks, and parameters (PSMC).
Report depth and uniformity, masking criteria, and software versions. State mutation rate and generation time and show sensitivity to reasonable alternatives. Provide confidence intervals for Ne and annotate the time window where estimates are reliable.
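A sensitivity table for mutation rate and generation time takes only a few lines. The sketch below assumes the scaling commonly documented for PSMC output: N0 = θ0 / (4·μ·s) with bin size s (100 bp by default), time in generations = 2·N0·t_k, and Ne = N0·λ_k. The function name and all input values are hypothetical; check the scaling against the documentation of the PSMC version you run.

```python
def scale_psmc(theta0, t_k, lambda_k, mu, gen_time, bin_size=100):
    """Convert scaled PSMC output to years and effective population size.

    theta0   : scaled mutation rate per bin, as reported by PSMC
    t_k      : scaled interval start times (units of 2*N0 generations)
    lambda_k : relative population sizes per interval
    mu       : assumed mutation rate per site per generation
    gen_time : assumed generation time in years
    """
    n0 = theta0 / (4.0 * mu * bin_size)
    years = [2.0 * n0 * t * gen_time for t in t_k]
    ne = [n0 * lam for lam in lambda_k]
    return years, ne


if __name__ == "__main__":
    # Hypothetical scaled output for three time intervals.
    theta0, t_k, lambda_k = 0.07, [0.05, 0.5, 2.0], [1.5, 1.0, 2.5]
    for mu in (1.25e-8, 2.5e-8):
        for g in (20, 30):
            years, ne = scale_psmc(theta0, t_k, lambda_k, mu, g)
            print(f"mu={mu:.2e}, g={g}: years={[round(y) for y in years]}, "
                  f"Ne={[round(n) for n in ne]}")
```

Under this scaling, changing the mutation rate rescales both axes, while changing the generation time only stretches the time axis in years, which is worth stating explicitly alongside the sensitivity figure.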
Cross-validation across methods.
Where time windows overlap, compare LD-based Ne (or IBDNe) with PSMC trends. Agreement increases confidence; divergence motivates checks for structure, selection, or mapping artifacts.
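A simple numeric summary can accompany the visual comparison. The sketch below treats each trajectory as piecewise-constant (interval start times in generations plus an Ne per interval) and reports the mean absolute log10 ratio over the shared window. The data layout, function names, and choice of metric are assumptions made for illustration, not an established standard.

```python
import math


def step_value(times, values, t):
    """Piecewise-constant lookup: value of the last interval starting at or before t."""
    current = values[0]
    for start, v in zip(times, values):
        if start <= t:
            current = v
        else:
            break
    return current


def overlap_log_ratio(traj_a, traj_b, n_points=20):
    """Mean |log10(Ne_a / Ne_b)| on a log-spaced grid over the shared time window."""
    (ta, na), (tb, nb) = traj_a, traj_b
    lo, hi = max(ta[0], tb[0]), min(ta[-1], tb[-1])
    if lo <= 0 or hi <= lo:
        raise ValueError("trajectories need an overlapping window with positive times")
    grid = [lo * (hi / lo) ** (i / (n_points - 1)) for i in range(n_points)]
    diffs = [abs(math.log10(step_value(ta, na, t) / step_value(tb, nb, t))) for t in grid]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Hypothetical trajectories: (interval starts in generations, Ne per interval).
    ld_ne = ([10, 50, 100, 200], [500, 800, 1000, 1200])
    psmc = ([100, 500, 1000], [1000, 5000, 8000])
    print(overlap_log_ratio(ld_ne, psmc))
```

A value near 0 means the two methods agree in the overlap; a value near 1 means they differ by about an order of magnitude, which is a cue to check structure, selection, or mapping artifacts before stitching the curves.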
Explicit limits and artifacts.
Include a brief "limits" paragraph: LD Ne's precision drops for very large Ne or sparse markers; PSMC's near-present bins are unstable, and its curves can mislead under population structure. Acknowledge these limits and show the steps you took to mitigate them.
Deliverables to standardize
- One-paragraph method summaries for LD-based Ne and PSMC (what it estimates, inputs, time window).
- Figure set: LD-Ne trajectory with CIs; PSMC curve with masked regions/time-bin reliability; overlapping window comparison.
- Sensitivity appendix: MAF thresholds and recombination maps (LD), mutation rate and generation time (PSMC).
Prefer turnkey execution? Pair Linkage Disequilibrium Analysis for LD metrics with Population Evolution Analysis for PSMC and cross-validation. We deliver study-ready text, figures, and a reproducibility record.
Worked Examples: What "Good" Looks Like
Humans (arrays/WGS; recent history).
Genome-wide LD analyses consistently show that recent human Ne is far below census size, clarifying recent expansions and population-specific dynamics. These studies demonstrate how LD-based approaches can quantify contemporary effective population size with realistic uncertainty, given adequate sample sizes and careful filtering.
One deep genome in a non-model species (deep-time history).
PSMC's first applications recovered long-term size changes from a single human genome; the same logic applies to many species where sequencing one exceptional individual is feasible. The key is rigorous masks and an honest discussion of near-present limits.
Hybrid timeline (conservation or domestication).
Use LD-based Ne (or IBDNe) to quantify the last ~50–200 generations, then stitch that to a PSMC curve for older eras. The stitched curve gives stakeholders an end-to-end narrative: recent contraction, historical stability, and ancient expansion—each supported by the method with the right time resolution.
Quick Answers (FAQ)
What is LD-based Ne?
It's an estimate of recent effective population size derived from genome-wide linkage disequilibrium across many individuals. Because LD reflects recent drift, this method is strongest in the near past (roughly tens to a few hundred generations).
What does PSMC estimate?
PSMC models the distribution of coalescent events along one deep, diploid genome to infer long-term changes in Ne. It's powerful for older epochs but weak close to the present, where few recombination events inform the model.
Why does the MAF threshold matter for LD-based Ne?
Including many rare alleles can inflate r² and bias Ne; setting a modest MAF floor reduces bias with little loss of precision. Always report your MAF threshold and justify it.
Can population structure bias both methods?
Yes. Structure can inflate LD (affecting LD Ne) and can make PSMC infer false size changes under panmictic assumptions. Control with thoughtful sampling, structure-aware analyses, and method cross-checks.
Is there another option for recent Ne?
Consider IBDNe, which estimates recent Ne from the length distribution of identity-by-descent segments. It complements LD-based Ne and anchors the recent end of a hybrid timeline.
Ready to Proceed?
Send your array or WGS specs for a Ne feasibility check. We'll confirm whether LD-based Ne, PSMC, or a hybrid plan fits your data, outline expected time resolution, and provide a reviewer-ready reporting template. Start with Linkage Disequilibrium Analysis or request an end-to-end plan via Population Evolution Analysis.
Related reading:
- Measuring Population Dynamics: Ne, Bottlenecks & Migration
- Sampling & Batch Bias in Genomic Population Dynamics Studies
References
- Novo, I., Santiago, E. & Caballero, A. The estimates of effective population size based on linkage disequilibrium are virtually unaffected by natural selection. PLoS Genetics 18(1), e1009764 (2022).
- Liu, J., Ji, X. & Chen, H. Beta-PSMC: uncovering more detailed population history using beta distribution. BMC Genomics 23, 785 (2022). https://doi.org/10.1186/s12864-022-09021-6
- Browning, S. R., et al. Ancestry-specific recent effective population size in the Americas. PLoS Genetics 14(5), e1007385 (2018).
- Pickrell, J. & Pritchard, J. Inference of population splits and mixtures from genome-wide allele frequency data. Nat Prec (2012). https://doi.org/10.1038/npre.2012.6956.1
- Marcus, J., et al. Fast and flexible estimation of effective migration surfaces. eLife 10, e61927 (2021).
- Wang, K., et al. Tracking human population structure through time from whole genome sequences. PLoS Genetics 16(3), e1008552 (2020).
- Barbato, M., et al. SNeP: a tool to estimate trends in recent effective population size trajectories using genome-wide SNP data. Frontiers in Genetics 6, 109 (2015).