For recent demography, should you trust LD-based Ne from many genomes or a PSMC curve from one deep genome? Choosing the wrong path can burn budget and mislead reviewers—especially when coverage is uneven, reference quality is mixed, or sample size is tight. This practical guide shows how to match LD Ne and PSMC to your data and time window so your effective population size estimates stand up to scrutiny. We'll also flag common artifacts and simple cross-checks you can add to your protocol.
LD-based Ne (from genome-wide linkage disequilibrium across individuals).
Best for recent Ne—on the order of tens to a few hundred generations—when you have breadth (many unrelated samples) from SNP arrays or whole-genome sequencing (WGS). LD among unlinked markers captures recent drift; with sensible minor-allele frequency (MAF) filtering and relatedness QC, you get fast, population-level estimates.
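To see how an LD value maps to a size estimate, here is a minimal sketch based on Sved's (1971) approximation, E[r²] ≈ 1/(1 + 4·Ne·c), with a simple 1/n sampling correction and the common convention that markers c Morgans apart reflect Ne roughly 1/(2c) generations ago. The function names, the form of the correction, and the example values are illustrative assumptions; dedicated tools apply further corrections and bin marker pairs by genetic distance.

```python
def ne_from_r2(mean_r2, c_morgans, n_chromosomes):
    """Rough Ne from the mean r2 of marker pairs c Morgans apart.

    Assumes Sved's approximation E[r2] ~ 1 / (1 + 4 * Ne * c) and subtracts
    1 / n_chromosomes as a simple sampling correction (an assumption here;
    the right constant depends on phasing and on the tool).
    """
    if c_morgans <= 0:
        raise ValueError("c_morgans must be positive")
    r2_adj = mean_r2 - 1.0 / n_chromosomes
    if r2_adj <= 0:
        raise ValueError("mean r2 is within sampling noise; Ne is not estimable")
    return (1.0 / r2_adj - 1.0) / (4.0 * c_morgans)


def generations_ago(c_morgans):
    """Approximate age (in generations) reflected by pairs c Morgans apart."""
    return 1.0 / (2.0 * c_morgans)


# Hypothetical bin: pairs 0.5 cM apart, mean r2 of 0.06, 50 diploid samples.
ne = ne_from_r2(mean_r2=0.06, c_morgans=0.005, n_chromosomes=100)
age = generations_ago(0.005)
print(f"Ne ~ {ne:.0f} about {age:.0f} generations ago")
```

Repeating the calculation across distance bins (tightly linked pairs for older generations, loosely linked pairs for recent ones) is what produces a trajectory rather than a single number.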
PSMC (Pairwise Sequentially Markovian Coalescent on one deep genome).
Best for older history, typically from the high hundreds of generations back to tens of thousands or more, when you have a single, high-quality genome. PSMC recovers broad size changes over long time scales but lacks resolution very close to the present. Deep, uniform coverage and stringent masking are crucial to avoid spurious signals.
Need implementation help? Our Linkage Disequilibrium Analysis service handles LD pipelines (MAF thresholds, LD pruning, recombination maps), while Population Evolution Analysis delivers PSMC with best-practice masking and reviewer-ready figures.
Simulated demographic histories comparing classic PSMC with Beta-PSMC; Beta-PSMC resolves finer fluctuations and improves recent-epoch inference using a single deep genome (Liu J. et al. (2022) BMC Genomics).
Example of LD-based recent Ne trajectories inferred from genome-wide SNP data using SNeP in multiple sheep breeds (Barbato M. et al. (2015) Frontiers in Genetics).
Historical Ne trajectories inferred from LD (NeLD), Relate, and MSMC across generations; simulations and human data indicate LD-based estimates remain largely unaffected by selection (Novo I. et al. (2022) PLOS Genetics).
Use these real-world patterns to choose quickly, then tailor parameters.
If you have many unrelated individuals genotyped on arrays or by WGS and your question is about recent history, go with LD-based Ne. You'll estimate recent effective population size with confidence intervals in a fraction of the time a deep-genome pipeline takes. Mind your MAF filter, relatedness pruning, and cross-population callability. If your project relies on arrays or RAD-seq in non-model species, LD-based Ne still performs well with careful SNP filtering.
If you have one deep, high-quality genome and your question is about older history, choose PSMC to outline long-term size changes. Be explicit about masks, mutation rate, and generation time, and avoid over-interpreting the youngest time bins.
If you have both breadth and at least one deep genome, adopt a hybrid plan: LD-based Ne (or IBDNe) for the last 50–200 generations and PSMC for older epochs. Compare trajectories where windows overlap; disagreement often signals structure, selection, or QC issues that merit deeper analysis.
Ancestry-specific Ne trajectories with 95% confidence bands in admixed American populations, illustrating recent bottlenecks and rebounds from IBD-based inference (Browning S.R. et al. (2018) PLOS Genetics).
If sampling spans subpopulations or admixture is suspected, complement PSMC with methods that make structure explicit (e.g., allele-frequency graphs or f-statistics) and rely on LD-based Ne for near-present size. Unmodeled structure otherwise risks creating illusory PSMC peaks.
MSMC-IM extends MSMC by fitting time-dependent migration, clarifying separation histories under structure (Wang K. et al. (2020) PLOS Genetics).
Not sure which lane you're in? Share your cohort specs and we'll provide a Ne feasibility note through our Linkage Disequilibrium Analysis or a broader plan via Population Evolution Analysis.
Sampling & relatedness (LD-based Ne).
Aim for unrelated individuals; screen with KING/IBD. Balance per-population counts so allele frequency spectra are comparable. Document how you handled duplicates, close kin, and batch effects.
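One way to document the kin-handling step is to make it a small, deterministic function. The sketch below greedily drops individuals until no retained pair exceeds a kinship cutoff. The input format (tuples of two IDs and a kinship coefficient) is hypothetical, and the default of 0.0884 is the commonly used KING boundary for second-degree relatives; choose the cutoff that suits your design.

```python
from collections import defaultdict


def prune_relatives(pairs, threshold=0.0884):
    """Return the set of sample IDs to remove so that no retained pair
    has kinship >= threshold.

    pairs: iterable of (id_a, id_b, kinship) tuples (hypothetical format,
    e.g. parsed from a pairwise kinship table).
    """
    related = defaultdict(set)
    for a, b, kinship in pairs:
        if kinship >= threshold:
            related[a].add(b)
            related[b].add(a)

    removed = set()
    while related:
        # Drop the sample involved in the most related pairs; ties are
        # broken by ID so the result is reproducible.
        worst = max(related, key=lambda s: (len(related[s]), s))
        for other in related.pop(worst):
            related[other].discard(worst)
            if not related[other]:
                del related[other]
        removed.add(worst)
    return removed


# Hypothetical kinship table: A is related to both B and C; D and E are close kin.
example = [("A", "B", 0.25), ("A", "C", 0.12), ("B", "C", 0.02), ("D", "E", 0.30)]
print(sorted(prune_relatives(example)))
```

A greedy pass like this is not guaranteed to retain the largest possible unrelated set, but it is easy to describe in a methods section and gives the same answer on every run.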
Filtering choices (LD-based Ne).
Declare your MAF threshold and LD pruning strategy; these directly affect r² and thus Ne. Provide recombination map details and justify choices for your species and platform.
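To make the link between the MAF threshold and r² concrete, the toy sketch below computes minor-allele frequency from 0/1/2 genotype dosages, applies a MAF floor, and averages the squared genotype correlation over the remaining SNP pairs. The data layout (a dict of SNP ID to dosage list), the function names, and the default threshold are assumptions for illustration; missing genotypes are not handled, and production pipelines would use dedicated software.

```python
from itertools import combinations


def minor_allele_freq(dosages):
    """MAF from 0/1/2 alternate-allele dosages in diploid individuals."""
    p = sum(dosages) / (2 * len(dosages))
    return min(p, 1 - p)


def genotype_r2(x, y):
    """Squared Pearson correlation between two dosage vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("monomorphic SNP: r2 is undefined")
    return sxy * sxy / (sxx * syy)


def mean_r2_after_maf(genotypes, maf_min=0.05):
    """Return (number of SNPs kept, mean pairwise r2) after the MAF filter."""
    kept = [g for g in genotypes.values() if minor_allele_freq(g) >= maf_min]
    pairs = list(combinations(kept, 2))
    if not pairs:
        raise ValueError("fewer than two SNPs pass the MAF filter")
    return len(kept), sum(genotype_r2(a, b) for a, b in pairs) / len(pairs)
```

Rerunning `mean_r2_after_maf` at two or three thresholds and reporting how the mean r² moves is a cheap way to show reviewers that your Ne estimate does not hinge on one filter setting.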
Coverage, masks, and parameters (PSMC).
Report depth and uniformity, masking criteria, and software versions. State mutation rate and generation time and show sensitivity to reasonable alternatives. Provide confidence intervals for Ne and annotate the time window where estimates are reliable.
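A sensitivity table is straightforward to produce because mutation rate and generation time only rescale the curve. The sketch below follows the scaling described in the PSMC documentation as we understand it: N0 = θ0 / (4·μ·s), where s is the bin size, time in generations = 2·N0·t_k, and size = N0·λ_k. The function names and all numeric inputs are placeholders, not recommendations for any species.

```python
def scale_psmc(theta0, t_k, lambda_k, mu, gen_time, bin_size=100):
    """Convert scaled PSMC output to years and effective sizes.

    theta0   : per-bin scaled mutation rate reported by PSMC
    t_k      : scaled time boundaries
    lambda_k : relative population sizes
    mu       : mutation rate per site per generation (assumed by the user)
    gen_time : generation time in years (assumed by the user)
    """
    n0 = theta0 / (4.0 * mu * bin_size)
    years = [2.0 * n0 * t * gen_time for t in t_k]
    ne = [n0 * lam for lam in lambda_k]
    return years, ne


def sensitivity(theta0, t_k, lambda_k, mus, gen_times):
    """Rescale the same curve under each (mu, gen_time) combination."""
    return {
        (mu, g): scale_psmc(theta0, t_k, lambda_k, mu, g)
        for mu in mus
        for g in gen_times
    }


# Placeholder inputs for illustration only.
grid = sensitivity(0.1, [0.1, 1.0], [2.0, 0.5],
                   mus=[1.25e-8, 2.5e-8], gen_times=[20, 25])
for (mu, g), (years, ne) in sorted(grid.items()):
    print(mu, g, [round(y) for y in years], [round(n) for n in ne])
```

Halving the assumed mutation rate doubles both the time axis and the Ne axis, so the shape of the curve is unchanged while every date and size shifts; stating this explicitly helps readers separate robust features from scaling choices.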
Cross-validation across methods.
Where time windows overlap, compare LD-based Ne (or IBDNe) with PSMC trends. Agreement increases confidence; divergence motivates checks for structure, selection, or mapping artifacts.
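A simple way to quantify agreement is to interpolate both trajectories onto shared time points in the overlapping window and report the log2 ratio of the two Ne estimates. The sketch below assumes both curves are given in generations (PSMC output would first be converted from years), with strictly increasing, positive time points; the function names and the number of comparison points are illustrative choices.

```python
import math


def interp_loglog(x, xs, ys):
    """Linear interpolation in log-log space; xs must be increasing and > 0."""
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            w = (math.log(x) - math.log(x0)) / (math.log(x1) - math.log(x0))
            return math.exp(
                math.log(ys[i]) + w * (math.log(ys[i + 1]) - math.log(ys[i]))
            )
    raise ValueError("x lies outside the trajectory")


def compare_overlap(gens_a, ne_a, gens_b, ne_b, n_points=5):
    """Return (generation, log2(Ne_a / Ne_b)) at log-spaced points in the overlap."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    lo = max(gens_a[0], gens_b[0])
    hi = min(gens_a[-1], gens_b[-1])
    if lo >= hi:
        raise ValueError("trajectories do not overlap in time")
    rows = []
    for i in range(n_points):
        g = math.exp(math.log(lo) + (math.log(hi) - math.log(lo)) * i / (n_points - 1))
        g = min(max(g, lo), hi)  # guard against floating-point drift at the ends
        ratio = interp_loglog(g, gens_a, ne_a) / interp_loglog(g, gens_b, ne_b)
        rows.append((g, math.log2(ratio)))
    return rows
```

A log2 ratio near zero means the methods agree; a sustained offset of one unit means a two-fold disagreement, which is the kind of divergence worth tracing back to structure, selection, or mapping artifacts.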
Explicit limits and artifacts.
Include a brief "limits" paragraph: LD-based Ne loses precision when Ne is very large or markers are sparse; PSMC's near-present bins are unstable, and its whole curve can mislead under population structure. Acknowledge these limits and show the steps you took to mitigate them.
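The loss of precision at large Ne follows from simple arithmetic. For unlinked loci, a standard first-order approximation is that drift contributes about 1/(3·Ne) to the expected r², while sampling S individuals contributes about 1/S. The sketch below computes the share of r² that comes from drift; it omits the bias corrections that estimation software applies, and the function name is our own.

```python
def drift_signal_fraction(ne, n_samples):
    """Share of expected r2 (unlinked loci) attributable to drift.

    Uses the first-order terms 1 / (3 * Ne) for drift and 1 / S for
    sampling; bias corrections used in practice are deliberately omitted.
    """
    drift = 1.0 / (3.0 * ne)
    sampling = 1.0 / n_samples
    return drift / (drift + sampling)


# With 50 sampled individuals, the drift signal shrinks quickly as Ne grows.
for ne in (100, 1000, 10000):
    print(ne, round(drift_signal_fraction(ne, 50), 4))
```

With 50 individuals, drift accounts for about one seventh of the expected r² at Ne = 100 but well under one percent at Ne = 10,000, which is why large populations need larger samples or denser markers before an LD-based estimate becomes informative.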
Prefer turnkey execution? Pair Linkage Disequilibrium Analysis for LD metrics with Population Evolution Analysis for PSMC and cross-validation. We deliver study-ready text, figures, and a reproducibility record.
Humans (arrays/WGS; recent history).
Genome-wide LD analyses consistently show that recent human Ne is far below census size, clarifying recent expansions and population-specific dynamics. These studies demonstrate how LD-based approaches can quantify contemporary effective population size with realistic uncertainty, given adequate sample sizes and careful filtering.
One deep genome in a non-model species (deep-time history).
PSMC's first applications recovered long-term size changes from a single human genome; the same logic applies to many species where sequencing one exceptional individual is feasible. The key is rigorous masks and an honest discussion of near-present limits.
Hybrid timeline (conservation or domestication).
Use LD-based Ne (or IBDNe) to quantify the last ~50–200 generations, then stitch that to a PSMC curve for older eras. The stitched curve gives stakeholders an end-to-end narrative: recent contraction, historical stability, and ancient expansion—each supported by the method with the right time resolution.
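Stitching can be as simple as choosing a handoff generation and labeling each point with the method that produced it, so the figure shows which segment rests on which evidence. The sketch below assumes both trajectories are lists of (generations ago, Ne) pairs; the handoff value, the helper that converts years to generations, and the labels are illustrative.

```python
def years_to_generations(points, gen_time):
    """Convert (years_ago, ne) pairs to (generations_ago, ne) pairs."""
    return [(years / gen_time, ne) for years, ne in points]


def stitch(recent, ancient, handoff_generation):
    """Combine two trajectories at a handoff point, keeping method labels.

    recent  : (generations_ago, ne) pairs from LD-based Ne or IBDNe
    ancient : (generations_ago, ne) pairs from PSMC, already in generations
    """
    stitched = [(g, ne, "LD") for g, ne in recent if g <= handoff_generation]
    stitched += [(g, ne, "PSMC") for g, ne in ancient if g > handoff_generation]
    return sorted(stitched)


# Hypothetical values for illustration only.
ld_points = [(10, 500), (100, 800), (300, 900)]
psmc_points = years_to_generations([(3750, 1000), (125000, 20000)], gen_time=25)
for row in stitch(ld_points, psmc_points, handoff_generation=200):
    print(row)
```

Points that each method estimates on the far side of the handoff are dropped from the stitched curve, so it is worth reporting them separately as the overlap check rather than discarding them silently.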
LD-based Ne is an estimate of recent effective population size derived from genome-wide linkage disequilibrium across many individuals. Because LD reflects recent drift, this method is strongest in the near past (roughly tens to a few hundred generations).
PSMC models how the coalescence time between the two haplotypes varies along one deeply sequenced diploid genome, and from that distribution infers long-term changes in Ne. It's powerful for older epochs but weak close to the present, because very few segments coalesce that recently and so little data informs the youngest bins.
Including many rare alleles can inflate r² and bias Ne; setting a modest MAF floor reduces bias with little loss of precision. Always report your MAF threshold and justify it.
Population structure can bias both methods. It can inflate LD (affecting LD-based Ne) and can make PSMC, which assumes panmixia, infer size changes that never happened. Control for it with thoughtful sampling, structure-aware analyses, and method cross-checks.
Consider IBDNe, which estimates recent Ne from the length distribution of identity-by-descent segments. It complements LD-based Ne and anchors the recent end of a hybrid timeline.
Send your array or WGS specs for a Ne feasibility check. We'll confirm whether LD-based Ne, PSMC, or a hybrid plan fits your data, outline expected time resolution, and provide a reviewer-ready reporting template. Start with Linkage Disequilibrium Analysis or request an end-to-end plan via Population Evolution Analysis.