How to Read a Genotyping Array QC Report: Call Rate, Missingness, Heterozygosity, and Concordance

Genotyping array projects rarely fail in dramatic ways. They fail in quiet, expensive ways: a weak plate that drags down a whole batch, a handful of mixed samples that look "plausible" until they break a merge, or a replicate set that doesn't agree.

If you own QC for breeding genotyping, you're not asking, "Did we get genotype calls?" You're asking a decision question:

Can this dataset be trusted—and can we approve it for downstream use without paying for surprises later?

This guide is written for that sign‑off moment. It explains how to read four high-signal sections you'll see in almost every genotyping array QC report—call rate, missingness, heterozygosity, and concordance—and how to translate them into acceptance actions: approve, exclude, rerun, or escalate for clarification.

Key Takeaway: Don't approve "data delivered." Approve "data you can explain, reuse, and merge."

TL;DR (Sign-off actions)

Approve when call rate and missingness look strong, batch distributions are stable, and replicate concordance supports repeatability.
Exclude when failures are sparse, well-explained, and documented with sample/marker IDs and reasons.
Rerun when failures are likely fixable (e.g., sample quality) and the sample is decision-critical.
Escalate for clarification when methods/thresholds are undocumented, batch exceptions exist, or concordance outliers raise identity/contamination concerns.

Why a Genotyping Array QC Report Matters Before You Trust the Data

A genotyping array QC report is the fastest way to judge whether delivered genotype data are usable, repeatable, and safe to move into downstream breeding workflows.

Why "Data Delivered" Does Not Mean "Data Ready"

Downstream pipelines amplify QC weaknesses. A small amount of instability at the calling stage can turn into weeks of cleanup when you try to merge cohorts, run population analysis, prepare GWAS inputs, or set baselines for genomic selection.

So the first job of the QC report is not to look impressive. It's to make an acceptance decision defensible.

What a QC-Focused Reader Needs to Know First

A practical QC reader needs the report to answer four acceptance questions:

Usability: Are most samples and markers strong enough to use?
Repeatability: Do duplicates/replicates agree?
Mergeability: Can this cohort be combined with prior/future batches without hidden incompatibilities?
Traceability: Can you list what was excluded, why, and how thresholds were applied?

How QC Reports Reduce Risk Before Downstream Analysis

A good report reduces risk by turning performance into documented decisions:

what passed
what failed
what was rerun
what was excluded
what caveats follow the dataset into downstream work

That transparency matters because QC is not only "filtering." It's defining what the dataset means and what it can safely be used for.

Industry and international characterization guidance consistently emphasize that high-quality, standardized genotype data are most valuable when they remain reusable, interpretable, and comparable across cohorts rather than functioning as isolated single-batch outputs.

What a Good QC Report Should Show at a Glance

A useful QC report should let a project owner quickly see sample pass rates, marker performance, repeatability signals, exclusions, and any caveats that affect downstream use.

The Minimum Elements a Practical QC Report Should Include

A QC report that is actually acceptance-ready usually includes:

Project summary (array/panel version, batches/plates)
Counts (received, processed, passed, failed, rerun)
Sample-level QC (call rate/missingness, heterozygosity, outliers)
Marker-level QC (marker missingness/call rate, failed/weak markers)
Replicate/duplicate checks (concordance + any outliers)
Exclusions (sample IDs / marker IDs + reasons)
Methods notes (thresholds and how they were applied)

If exclusions and methods notes are missing, you can still read the report—but you cannot reliably reuse the data later.

Sample-Level Metrics vs Marker-Level Metrics

Keep these separate when you interpret.

Sample-level metrics: "Is this sample's genotype profile stable enough to trust?"
Marker-level metrics: "Does this marker behave well enough in this cohort to keep?"

Sample-level QC protects your cohort integrity. Marker-level QC protects your downstream resolution and comparability.

Why Exclusion Notes and Methods Notes Matter

Acceptance disputes usually come from missing logic, not missing numbers.

If you can't answer "Which samples were removed and why?" or "Was that threshold applied before or after other filters?" the dataset becomes hard to hand off and harder to merge.

Mock QC report overview dashboard showing sample intake, pass/fail, call rate, missingness, heterozygosity outlier flags, concordance, exclusions, and methods notes.

If you're evaluating deliverables from a service provider, this "at-a-glance" clarity is what you should expect from livestock SNP array services with clear QC reporting and crop genotyping array services with analysis-ready outputs.

How to Interpret Call Rate Without Overreacting to One Number

Quick Reference: Red Flags → Recommended Actions

Red flag pattern in the QC report	Why it matters	Recommended action
An entire plate/batch shifts down in call rate (not just a few outliers)	Suggests a systematic process drift; downstream merges may inherit a batch effect	Quarantine that batch; request methods notes; align on rerun/exclusion before approving merges
Missingness increases broadly and is batch-structured	Indicates widespread instability, not localized sample failure	Investigate batch conditions; consider rerun policy; avoid "filter-only" acceptance
High missingness + high heterozygosity outliers cluster together	Often reflects unstable calling or mixed material/contamination signals	Verify sample identity/handling; prioritize reruns or exclusions with documentation
Concordance drops even when mean call rate looks good	Repeatability is compromised; mergeability and long-term comparability at risk	Treat as acceptance-critical; inspect replicate pairs; escalate for identity/mix-up checks
A marker subset underperforms across many samples	Can reduce resolution and complicate cross-cohort comparability	Filter markers transparently; ship marker QC table + exclusion reasons

Use this table as a sign-off shortcut: it maps patterns (not single averages) to defensible acceptance actions.

Methods & Threshold Notes (How to Make QC Decisions Audit-Ready)

QC reports are only "sign-off ready" when you can explain how metrics were computed, when thresholds were applied, and what changed between batches.

What to Request (or Document) in the Methods Notes

At minimum, the report should state:

Calling software and version (and any key settings that affect clustering/calling)
When filtering happened (pre-calling vs post-calling; before vs after sample/marker exclusions)
Threshold policy (default thresholds and any batch-specific exceptions)
How outliers were defined (e.g., heterozygosity outliers by distribution-based rules rather than a universal fixed cut-off)
How concordance was computed (which genotypes counted, and whether "no-calls" were excluded)
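The order of filtering is worth documenting because it can change which samples survive. The sketch below illustrates this on a toy matrix, assuming genotypes are coded as allele counts (0, 1, 2) with None for a no-call. The sample and marker IDs, the thresholds (0.3 and 0.5), and the function names are hypothetical and chosen only to make the effect visible.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Matrix = Dict[str, Sequence[Optional[int]]]  # sample_id -> calls; None = no-call (assumed coding)


def keep_samples(matrix: Matrix, samples: List[str], markers: List[int], max_miss: float) -> List[str]:
    """Keep samples whose missingness over the retained markers is within max_miss."""
    if not markers:
        return []
    return [s for s in samples
            if sum(matrix[s][j] is None for j in markers) / len(markers) <= max_miss]


def keep_markers(matrix: Matrix, samples: List[str], markers: List[int], max_miss: float) -> List[int]:
    """Keep markers whose missingness over the retained samples is within max_miss."""
    if not samples:
        return []
    return [j for j in markers
            if sum(matrix[s][j] is None for s in samples) / len(samples) <= max_miss]


def apply_filters(matrix: Matrix, marker_ids: Sequence[str], max_sample_miss: float,
                  max_marker_miss: float, markers_first: bool) -> Tuple[List[str], List[str]]:
    """Apply both filters in the stated order and return (sample IDs, marker IDs) retained."""
    samples, markers = list(matrix), list(range(len(marker_ids)))
    if markers_first:
        markers = keep_markers(matrix, samples, markers, max_marker_miss)
        samples = keep_samples(matrix, samples, markers, max_sample_miss)
    else:
        samples = keep_samples(matrix, samples, markers, max_sample_miss)
        markers = keep_markers(matrix, samples, markers, max_marker_miss)
    return samples, [marker_ids[j] for j in markers]


# Hypothetical data: marker M3 fails in every sample; S4 has one extra gap.
marker_ids = ["M1", "M2", "M3", "M4", "M5"]
matrix = {
    "S1": [0, 1, None, 1, 2],
    "S2": [0, 1, None, 2, 2],
    "S3": [1, 1, None, 1, 0],
    "S4": [0, None, None, 1, 1],
}
markers_then_samples = apply_filters(matrix, marker_ids, 0.3, 0.5, markers_first=True)
samples_then_markers = apply_filters(matrix, marker_ids, 0.3, 0.5, markers_first=False)
```

In this toy case, removing the dead marker first keeps S4, while filtering samples first drops it, because the dead marker inflates S4's missingness. Neither order is presented here as the correct one; the point is that the report should say which one was used.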

A Practical Way to Set Thresholds Without Using One Universal Number

Instead of treating thresholds as absolute, define them as a documented policy tied to downstream risk:

Start from downstream intent (merge across years? GWAS? genomic selection baselines?)
Check cohort distributions and batch structure (avoid approving a shifted plate just because the overall mean is high)
Use outlier logic for sample behavior signals (heterozygosity is most actionable as an outlier detector)
Write the decision rules down (approve/exclude/rerun/escalate) so the same logic can be repeated in the next batch

If the report cannot provide these notes, you can still interpret the numbers—but your sign-off becomes harder to defend later.

Call rate is useful only when it is read in context, because one failed sample means something very different from a weak batch or a broadly unstable run.

What Sample Call Rate Actually Tells You

Sample call rate is the fraction of markers that received a genotype call for a given sample.

In QC terms, it's a condensed signal of whether the sample produced an interpretable profile under the project's calling conditions.

What to look for in the report:

distribution (not only mean)
outliers (how many, how extreme)
whether failures cluster by batch/plate
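As a minimal sketch of the definition above, sample call rate can be computed as follows. It assumes genotypes are coded as allele counts (0, 1, 2) with None for a no-call, which is a coding chosen for illustration; your export may use a different one. The sample IDs and values are hypothetical.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # assumed coding: 0, 1, 2 = allele count; None = no-call


def sample_call_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of markers that received a genotype call for one sample."""
    if not calls:
        raise ValueError("sample has no markers")
    called = sum(1 for g in calls if g is not None)
    return called / len(calls)


# Hypothetical toy cohort: sample_id -> calls at the same five markers.
cohort = {
    "S1": [0, 1, 2, 0, 1],
    "S2": [0, None, 2, 0, 1],
    "S3": [None, None, 2, None, 1],
}
rates = {sample_id: sample_call_rate(calls) for sample_id, calls in cohort.items()}
```

Keeping the per-sample values, rather than only their mean, is what makes it possible to look at the distribution, the outliers, and the batch structure afterwards.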

What Marker Call Rate Adds to the Picture

Marker call rate asks a different question: "Does this marker behave consistently across the cohort?"

Marker call rate helps you separate:

a few weak samples (local)
a weak marker subset (panel transferability, clustering ambiguity)
a systematic run problem (batch)

When Low Call Rate Points to Sample Problems

A typical "local" pattern is a tight high-performing cloud plus a small tail of failures.

In that case, your acceptance decision is usually: approve the dataset with exclusions (and possibly reruns for high-value samples), while preserving the failure list and reasons for auditability.

When Low Call Rate Suggests a Workflow or Batch Issue

A typical "systemic" pattern is a whole batch shifting down (or widening) relative to other batches.

That's a different class of risk: your downstream pipeline may inherit a batch effect that filtering does not fully remove.
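One way to separate a shifted batch from a few local outliers is to compare per-batch medians rather than means. The sketch below is an illustration only: the batch IDs, call rates, and the 0.02 tolerance are hypothetical, and the reference point (the median of batch medians) is an assumption, not a rule from any particular platform.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple


def batch_medians(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Median sample call rate per batch; records are (batch_id, call_rate) pairs."""
    by_batch = defaultdict(list)
    for batch_id, call_rate in records:
        by_batch[batch_id].append(call_rate)
    return {batch_id: median(values) for batch_id, values in by_batch.items()}


def shifted_batches(records: Iterable[Tuple[str, float]], max_drop: float = 0.02) -> List[str]:
    """Batches whose median sits more than max_drop below the median of batch medians."""
    medians = batch_medians(records)
    reference = median(medians.values())
    return sorted(b for b, m in medians.items() if reference - m > max_drop)


# Hypothetical data: batch A has one failed sample (local), batch C is shifted (systemic).
records = [
    ("A", 0.99), ("A", 0.98), ("A", 0.99), ("A", 0.60),
    ("B", 0.99), ("B", 0.97), ("B", 0.98),
    ("C", 0.93), ("C", 0.94), ("C", 0.92),
]
flagged = shifted_batches(records)
```

Because medians are robust to a single failed sample, batch A is not flagged even though it contains the worst individual value, while batch C is.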

Sample call rate distribution across batches showing one batch with lower call rates and a few low-quality outliers.

For large cohorts—like projects that require strict quality-control gates in large maize cohorts—batch plots are more decision‑relevant than a single average.

Platform-level genotyping workflows commonly begin with preliminary sample quality review, because failed or suboptimal samples can affect clustering, call performance, and the reliability of downstream interpretation.

How Missingness Helps You Spot Weak Samples, Weak Markers, and Weak Batches

Missingness becomes actionable when it shows where data gaps come from, whether that is a few poor samples, a subset of weak markers, or a broader batch problem.

Sample Missingness vs Marker Missingness

Missingness is the proportion of expected genotype calls that are absent ("data gaps"). You'll usually see it reported either directly as missingness or indirectly as (1 − call rate).

Sample missingness: gaps concentrated in specific samples
Marker missingness: gaps concentrated in specific markers

The report should make both visible, because they imply different remediation steps.
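The two views are row and column summaries of the same genotype matrix. A minimal sketch, assuming a samples-by-markers matrix with None for a no-call (the coding and the toy values are assumptions for illustration):

```python
from typing import List, Optional, Sequence, Tuple


def missingness_by_axis(
    matrix: Sequence[Sequence[Optional[int]]],
) -> Tuple[List[float], List[float]]:
    """Return (per-sample missingness, per-marker missingness) for a samples x markers matrix."""
    if not matrix or not matrix[0]:
        raise ValueError("matrix must have at least one sample and one marker")
    n_samples, n_markers = len(matrix), len(matrix[0])
    if any(len(row) != n_markers for row in matrix):
        raise ValueError("all samples must have the same number of markers")
    # Row summary: share of markers with no call in each sample.
    sample_miss = [sum(g is None for g in row) / n_markers for row in matrix]
    # Column summary: share of samples with no call at each marker.
    marker_miss = [sum(row[j] is None for row in matrix) / n_samples for j in range(n_markers)]
    return sample_miss, marker_miss


# Hypothetical toy matrix: the third marker is weak, the second sample is weak.
toy = [
    [0, 1, None],
    [0, None, None],
    [2, 1, None],
    [1, 1, 0],
]
sample_miss, marker_miss = missingness_by_axis(toy)
```

Here the overall missingness is the same number whichever way you slice it, but only the two separate vectors show that most gaps trace back to one marker.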

Why Missingness Patterns Matter More Than a Single Average

Average missingness can look acceptable while hiding a problematic shape.

Actionable patterns to distinguish:

few bad samples (exclude/rerun)
few bad markers (filter with documentation)
one problematic batch (investigate process drift)
project-wide weakness (question overall reliability)

How Missingness Affects Downstream Merging and Analysis

Missingness impacts day‑to‑day downstream work:

how many samples survive after standard filters
how many markers remain comparable across cohorts
whether merges inflate missingness unevenly across batches
whether GWAS or genomic selection (GS) inputs become unstable or biased

Missingness is not an abstract QC statistic; it's a predictor of downstream friction.

When Missing Data Can Be Managed and When It Signals a QC Issue

Missingness is manageable when it is localized and your post-filtering dataset still meets the intended analysis goal.

Missingness is a QC issue when it is batch-structured, widespread, or coupled with other anomaly signals (especially concordance drops).

If you work in complex genomes where certain regions are inherently harder to call, evaluate whether the report documents marker behavior and filtering logic clearly. This kind of transparency is central to scopes like high-precision genotyping in complex wheat genomes and filtering homologous interference in complex crop genotyping.

How to Read Heterozygosity Without Confusing Biology and Noise

Heterozygosity is most useful as an outlier signal, because unusual values can point to contamination, unusual sample composition, or technical artifacts rather than normal project variation.

What Heterozygosity Means in Practical QC Terms

In a QC report, heterozygosity is a sample‑level "behavior signal." It's not a single threshold that applies across all breeding designs.

Use it to find samples that behave unlike the cohort.

Why Outliers Matter More Than the Mean

The cohort mean can shift because of real population structure, breeding design, or panel characteristics.

So focus on outliers—especially those that also show high missingness.

Common Reasons for High or Low Heterozygosity

Common QC interpretations (without over‑diagnosing):

high heterozygosity: possible mixed material/contamination or unstable calling
low heterozygosity: possible highly inbred material, unusual sample profile, or depressed heterozygous calls due to artifacts

The report should flag outliers and show enough context (distribution + batch structure) for you to decide what to review.
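A distribution-based outlier rule can be sketched as follows. This is one possible approach, not the method of any specific report: it assumes genotypes coded as allele counts (0, 1, 2; None = no-call), defines heterozygosity as the share of called genotypes that are heterozygous, and flags samples more than k scaled median absolute deviations (MAD) from the cohort median. The value k = 3 and the sample values are hypothetical.

```python
from statistics import median
from typing import Dict, List, Optional, Sequence


def heterozygosity(calls: Sequence[Optional[int]]) -> Optional[float]:
    """Share of called genotypes that are heterozygous (coded 1); None if nothing was called."""
    called = [g for g in calls if g is not None]
    if not called:
        return None
    return sum(1 for g in called if g == 1) / len(called)


def mad_outliers(values: Dict[str, float], k: float = 3.0) -> List[str]:
    """Sample IDs whose value lies more than k scaled MADs from the median."""
    data = list(values.values())
    center = median(data)
    mad = median(abs(v - center) for v in data)
    if mad == 0:
        # No spread to scale against; this rule cannot rank samples, so review manually.
        return []
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation under normality
    return sorted(sid for sid, v in values.items() if abs(v - center) > k * scale)


# Hypothetical per-sample heterozygosity values for one breeding population.
het = {"S1": 0.30, "S2": 0.31, "S3": 0.29, "S4": 0.32, "S5": 0.30, "S6": 0.55}
flagged = mad_outliers(het)
```

If the cohort spans several populations, running the same rule within each group avoids flagging legitimate between-population differences as outliers.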

Why Population Context Still Matters

If your cohort spans multiple breeding populations, legitimate heterozygosity structure is expected.

That's why the best reports either show group‑wise distributions or clearly separate outlier logic from population differences. A cross‑population framing like rice genotyping across diverse breeding populations is typically more useful than a single global mean.

Scatter plot of sample missingness vs heterozygosity showing main cloud and labeled outliers for possible contamination, low-quality DNA, or unusual sample profile.

Why Concordance Is One of the Strongest Signals of Trustworthy Deliverables

Concordance shows whether repeated measurements agree, making it one of the clearest indicators that genotyping results are reproducible rather than merely high-throughput.

What Concordance Means in a Vendor QC Context

Concordance is the percent agreement between genotype calls across duplicates, replicates, or repeat runs.

It answers: If we genotype the same sample twice, do we get the same result?

For acceptance, concordance is powerful because it tests reproducibility directly.
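How no-calls are handled changes the number, so it helps to see both conventions side by side. The sketch below assumes genotypes coded as allele counts (0, 1, 2) with None for a no-call. The function name and the rule that a no-call on either side counts as a disagreement when no-calls are included are illustrative choices, not a standard.

```python
from typing import Optional, Sequence, Tuple


def concordance(
    a: Sequence[Optional[int]],
    b: Sequence[Optional[int]],
    exclude_no_calls: bool = True,
) -> Tuple[Optional[float], int]:
    """Return (fraction of compared markers with identical calls, number of markers compared)."""
    if len(a) != len(b):
        raise ValueError("replicates must cover the same markers in the same order")
    matches = 0
    compared = 0
    for ga, gb in zip(a, b):
        if ga is None or gb is None:
            if not exclude_no_calls:
                compared += 1  # assumed policy: a no-call on either side is a disagreement
            continue
        compared += 1
        if ga == gb:
            matches += 1
    return (matches / compared if compared else None), compared


# Hypothetical replicate pair over five markers.
rep1 = [0, 1, 2, None, 1]
rep2 = [0, 1, 1, 2, 1]
called_only = concordance(rep1, rep2)                              # (0.75, 4)
with_no_calls = concordance(rep1, rep2, exclude_no_calls=False)    # (0.6, 5)
```

Reporting the number of markers compared next to the concordance value makes it clear how much evidence each pair's figure rests on.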

Replicates, Duplicates, and Repeat Runs

Before you interpret concordance, confirm the report states:

what counts as a replicate/duplicate in this project
how many replicate pairs exist
how concordance was computed (and after what filters)
whether any pairs are flagged and what follow‑up policy applies

What Low Concordance Usually Signals

Low concordance often indicates a failure mode that affects trust, not only completeness:

sample identity issues (swap/mislabelling)
contamination/mixing
calling instability
batch-specific drift

Even if average call rate looks strong, poor concordance can make the dataset risky to merge or reuse.

Why Concordance Matters for Long-Term and Multi-Batch Projects

Breeding programs are almost always multi‑batch. If your measurement process is not repeatable, your "long‑term cohort" becomes a set of incompatible snapshots.

That's why repeatability signals are especially relevant for long‑run deliverables like repeatable sheep SNP genotypes across cohorts.

In agrigenomics workflows, reproducibility is treated as a core performance expectation rather than a secondary metric, which is why concordance remains one of the most practical signals that deliverables are trustworthy enough for reuse and long-term cohort comparison.

Dashboard-style concordance summary showing replicate pairs with one flagged low-concordance pair and a short interpretation box.

How to Tell Whether a QC Problem Is Local, Fixable, or Project-Wide

The most practical QC skill is distinguishing isolated sample problems from systematic project problems, because that determines whether you should remove a few samples, review one batch, or question the full dataset.

Single-Sample Outliers

Local failures are common and often manageable.

Typical signature: a small number of samples are extreme outliers for call rate/missingness, and batch distributions remain stable.

Practical next step: validate sample identity and metadata, rerun if warranted, otherwise exclude with clear documentation.

Plate- or Batch-Level Signals

Batch‑level signals are acceptance‑critical.

Typical signature: an entire plate/batch shifts, variance inflates, and replicate concordance may drop in the same batch.

Practical next step: quarantine that batch, request methods notes, and align on rerun/exclusion policy before approving downstream merges.

Marker-Specific Underperformance

Marker weaknesses can be technical or biological.

Typical signature: the same marker subset underperforms across many samples.

Practical next step: filter with transparency (what was removed, what remained, and how that impacts downstream resolution). This is also where "analysis‑ready" packaging matters. If your downstream team expects consistent QC fields in the final outputs, align deliverable structure early—especially when the workflow expects analysis-ready VCF deliverables for tomato breeding.

When to Ask for Reruns, Exclusions, or Clarified Methods

A defensible QC escalation decision usually follows this logic:

Rerun when the problem is likely fixable and the sample is decision‑critical.
Exclude when failures are sparse and well explained.
Clarify methods when acceptance depends on undocumented thresholds, opaque calling decisions, or batch exceptions.
Question the dataset when multiple independent signals align (batch‑structured call rate drop + broad missingness + outlier clusters + weak concordance).
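Writing these rules down as code is one way to make sure the same logic is applied to the next batch. The sketch below is a simplified illustration of the four rules above. The field names, the order in which the rules are checked, and the choice of two aligned signals as the trigger for questioning the dataset are assumptions to adapt to your own policy.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SampleFinding:
    failed_qc: bool          # sample fell outside the documented policy
    likely_fixable: bool     # e.g. a sample quality problem that a rerun could address
    decision_critical: bool  # the sample matters for a downstream decision
    documented_reason: bool  # the report explains the failure


def sample_action(finding: SampleFinding) -> str:
    """Map one sample-level finding to approve / rerun / exclude / escalate."""
    if not finding.failed_qc:
        return "approve"
    if finding.likely_fixable and finding.decision_critical:
        return "rerun"
    if finding.documented_reason:
        return "exclude"
    return "escalate"  # failure without a documented explanation: ask for methods notes


def question_dataset(signals: Dict[str, bool], min_signals: int = 2) -> bool:
    """True when at least min_signals independent warning signals are present."""
    return sum(bool(v) for v in signals.values()) >= min_signals


# Hypothetical batch-level signals taken from a QC report.
signals = {
    "batch_call_rate_drop": True,
    "broad_missingness": True,
    "outlier_cluster": False,
    "weak_concordance": False,
}
needs_review = question_dataset(signals)
```

The value of a sketch like this is less the code itself than the fact that the decision rules become explicit and reviewable.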

A Practical Acceptance Checklist Before You Approve the Deliverables

A genotyping project should be signed off only when the QC report, exclusions, file structure, and methods notes make the dataset understandable, reusable, and fit for the intended downstream workflow.

Questions to Ask Before Signing Off

Can you answer these with evidence from the report?

Are pass/fail criteria stated clearly?
Are exclusions documented with sample IDs/marker IDs and reasons?
Are batch/plate distributions shown (not only averages)?
Is concordance reported for duplicates/replicates, and are outliers explained?
Are QC metrics delivered alongside genotype files so downstream filtering is reproducible?

What Files Should Arrive with the QC Report

Suggested QC Tables (Example Fields)

To keep filtering reproducible downstream, ask for QC tables that travel with the genotype calls.

sample_qc.tsv: sample_id, batch_id/plate_id, call_rate, missingness, heterozygosity, pass_fail, notes
marker_qc.tsv: marker_id, call_rate, missingness, flag, notes
replicate_pairs.tsv (if applicable): sample_id_1, sample_id_2, replicate_type, concordance, n_compared, flag, notes

Even if your downstream team applies different cutoffs later, these fields preserve traceability and make merges auditable.
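As an illustration of what such a table could look like on disk, the sketch below writes a sample QC table with the fields listed above using Python's standard csv module. The column order, the use of a single batch_id column, and the example row are assumptions; the file layout your vendor delivers may differ.

```python
import csv
import io
from typing import Dict, Iterable

SAMPLE_QC_FIELDS = [
    "sample_id", "batch_id", "call_rate", "missingness",
    "heterozygosity", "pass_fail", "notes",
]


def sample_qc_tsv(rows: Iterable[Dict[str, object]]) -> str:
    """Serialize sample QC rows as tab-separated text with a fixed header.

    DictWriter raises ValueError for unexpected keys, so a row with an
    unknown field is rejected instead of being silently dropped.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=SAMPLE_QC_FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical example row.
text = sample_qc_tsv([
    {
        "sample_id": "S1", "batch_id": "P01", "call_rate": 0.99,
        "missingness": 0.01, "heterozygosity": 0.3,
        "pass_fail": "pass", "notes": "",
    },
])
```

A fixed header that is identical across batches is what lets downstream teams concatenate QC tables without per-batch adjustments.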

A practical deliverable package typically includes:

genotype calls in the agreed format
a sample QC table (IDs, batch, key QC fields, pass/fail)
a marker QC table (missingness/call rate, flags)
exclusions list
methods/parameter notes

If these are missing, the dataset becomes harder to hand off and harder to defend later.

Which Caveats Must Be Documented

Insist on explicit notes when:

thresholds differ by batch
calling/clustering parameters changed
reruns occurred and replaced results
panel annotation/version differences affect long‑term merging

What "Analysis-Ready" Should Mean in Practice

"Analysis‑ready" means downstream teams can merge and filter without reverse‑engineering how QC was done.

If a dataset is "analysis‑ready," it is also "audit‑ready."

What to Ask a Genotyping Vendor Before the Project Starts

The easiest way to avoid QC disputes later is to align on pass criteria, rerun policy, file structure, and repeatability expectations before samples are processed.

QC Thresholds and Exclusion Logic

Align up front on how call rate/missingness are calculated, how heterozygosity outliers are defined, and how marker filtering will be documented.

Sample Failure and Rerun Policy

Agree on what happens when samples fail: rerun criteria, how reruns are represented, and what gets excluded when reruns are not possible.

Deliverable Format and Metadata Alignment

Align on file formats and metadata stability: sample IDs, batch IDs, marker IDs, and annotation versions.

If you're outsourcing array genotyping as a research-use-only (RUO) service, this alignment is what prevents downstream pipeline friction, especially for teams expecting documented outputs like bovine SNP array outputs with QC reporting and documented porcine genotype deliverables.

Cross-Batch Consistency and Repeatability Documentation

Ask directly whether the QC report will include batch‑level plots, replicate concordance summaries, and methods notes for any batch exceptions.

FAQ

Q1: What Does Low Call Rate Mean in a Genotyping Array QC Report?
A: A low call rate means that a sample received genotype calls at a smaller share of markers than expected (or, for a marker, in a smaller share of samples), but the number only becomes interpretable when you see the pattern. If a small number of samples are low while most samples cluster tightly at high call rate, you're usually looking at local sample problems and the project can often be accepted with exclusions or reruns. If an entire plate or batch shifts downward, low call rate is more likely a workflow or batch issue, and the risk is systematic bias that filtering may not fix. Treat low call rate as a trigger to check distribution and batch structure, not as a single-number verdict.

Q2: What Is an Acceptable Sample Call Rate for a Crop or Livestock SNP Array Project?
A: There isn't one universal call rate threshold that is "acceptable" across species, panels, and downstream goals. QC teams generally aim high because low call rates reduce power and increase rework, but the right decision depends on sample type variability, panel transferability to your germplasm, and whether your downstream workflow tolerates exclusions. If your project depends on cross-batch merging and long-term comparability, you should be more conservative because small losses in one cohort can become larger losses after merging. Use thresholds as policy, but interpret outcomes through distributions and batch patterns.

Q3: How Should Missingness and Heterozygosity Be Read Together?
A: Missingness shows where the data gaps are; heterozygosity helps you judge whether an unusual sample is behaving like biology or like noise. High missingness paired with high heterozygosity often points to unstable calling or mixed material where genotype assignment is inconsistent. High missingness with otherwise normal heterozygosity can look more like weak DNA or assay performance without clear mixing. Low heterozygosity with normal missingness can be legitimate in inbred materials, but extreme low values still warrant identity and metadata review. Read them together because the combination is more diagnostic than either metric alone.

Q4: Why Does Concordance Matter if the Average Call Rate Already Looks Good?
A: Average call rate can look excellent even when repeatability is compromised, because call rate measures how much got called, not whether the same thing would be called the same way twice. Concordance tests reproducibility directly by comparing duplicates, replicates, or repeat runs. If concordance is weak for any subset, it raises the possibility of sample identity problems, contamination, or batch-specific calling instability, all of which can damage downstream merges and long-term breeding decisions. For multi-batch and multi-year programs, concordance is one of the strongest signals that deliverables are trustworthy.

Q5: What Should I Ask a Vendor Before Accepting Genotyping Deliverables?
A: Ask for acceptance logic, not only summary numbers. You need clear pass/fail criteria, explicit exclusion lists with reasons, and methods notes that describe how thresholds were applied. Confirm that sample IDs and batch IDs are stable and traceable, and that QC fields ship with the final genotype files so downstream teams can reproduce filtering. Finally, ask for repeatability evidence: how many duplicates or replicates exist, what concordance looks like, and what rerun policy applies when a repeat fails. Those items prevent most downstream disputes.

Q6: How Do I Keep Genotyping Array Data Mergeable Across Batches or Array Versions?
A: Mergeability depends on consistent identifiers and consistent QC documentation. Confirm stable sample IDs and batch IDs, keep a documented list of excluded samples/markers, and record array annotation/version (and any changes) in the methods notes. When versions differ, plan a compatibility strategy (common marker intersection, harmonized strand/allele coding, and consistent filtering rules) before approving long-term merges.
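The first step of such a compatibility strategy, finding the common marker set and spotting allele-coding conflicts, can be sketched as below. It assumes each array version is described by a manifest mapping marker ID to an (allele A, allele B) pair. That structure, the function name, and the marker IDs are hypothetical, and real strand harmonization needs more than an equality check.

```python
from typing import Dict, List, Tuple

Manifest = Dict[str, Tuple[str, str]]  # marker_id -> (allele_a, allele_b); assumed structure


def common_markers(a: Manifest, b: Manifest) -> Tuple[List[str], List[str]]:
    """Return (shared markers with identical allele coding, shared markers whose coding differs)."""
    shared = a.keys() & b.keys()
    consistent = sorted(m for m in shared if a[m] == b[m])
    conflicting = sorted(m for m in shared if a[m] != b[m])
    return consistent, conflicting


# Hypothetical manifests for two array versions.
version_1 = {"M1": ("A", "G"), "M2": ("C", "T"), "M3": ("A", "C")}
version_2 = {"M1": ("A", "G"), "M2": ("T", "C"), "M4": ("G", "T")}
consistent, conflicting = common_markers(version_1, version_2)
```

Markers in the conflicting list are candidates for harmonization or exclusion before the merge, and the decision for each belongs in the methods notes.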

Q7: Should Concordance Be Computed Before or After QC Filtering?
A: Ask the report to state exactly when concordance was computed. Pre-filter concordance can reveal raw instability (including no-call behavior), while post-filter concordance can show repeatability among retained genotypes. For acceptance decisions, it's often most defensible to review both (or at least require a clearly defined approach) so you can explain what was compared and why.

Prepared by the CD Genomics Agrigenomics Team

Reviewed for technical accuracy in breeding genotyping QC and downstream data delivery.

The guidance in this article is based on common decision points in array-based genotyping QC review, together with published resources on genotype QC, concordance assessment, and reproducible downstream data handling.

References

Guo, M., et al. "Quality and concordance of genotyping array data of TCGA samples processed in two centers." Genomics, Proteomics & Bioinformatics, 2018.
Marees, A. T., et al. "A tutorial on conducting genome‐wide association studies: Quality control and statistical analysis." International Journal of Methods in Psychiatric Research, 2018.
Ritchie, M. D., et al. "Strategies for processing and quality control of Illumina genotyping arrays." Briefings in Bioinformatics, 2018.
Zhao, S., et al. "GTQC: Automated Genotyping Array Quality Control and Report." Bioinformatics Advances, 2022.

For research purposes only, not intended for clinical diagnosis, treatment, or individual health assessments.
