How to Interpret STR Analysis Reports: A Guide for QA and Research Teams

You've received the PDF. Peaks, tables, a tidy "match %," and a comparison summary. Now what? This guide shows QA managers and research teams how to move from raw STR data to a defensible decision about identity, traceability, and risk—consistent with ANSI/ATCC ASN‑0002‑2022 and aligned with ATCC and ICLAC guidance.

What an STR report usually contains

  • Electropherogram(s): channel-by-channel capillary electrophoresis traces with labeled peaks.
  • Allele table: a locus-by-locus list of allele calls (and often peak heights/quality flags).
  • Similarity or match metric: a percentage derived from allele concordance across loci.
  • Database/reference comparison summary: notes against ATCC/DSMZ/JCRB or an internal reference.

Disclosure: This guide is published by CD Genomics. Where relevant, we link to public service and resource pages to illustrate standard research-use workflows without promotion.


Quick-Start Workflow for QA: What to Check First

Before deep-diving into every peak, make three rapid checks to orient the review and avoid rework.

1) Read the conclusion and context

  • Interpretation category: Identify the stated category (e.g., match/consistent, partial/indeterminate, mismatch/exclusion). Under ASN‑0002, decisions are made by allele concordance across shared loci and documented comparison rules, not by the "match %" alone. See ATCC's description of the standard and database comparison process in their service documentation for context.
  • Reference profile: Confirm the exact reference used (internal banked profile, an ATCC/DSMZ record, or a historical in‑house profile). If the report compares to a specific lot or bank record, that should be named.
  • IDs and traceability: Verify sample identifiers, chain-of-custody fields, extraction date, and any notes about passage number or culture history. Those details become critical if you later defend a decision in a manuscript or grant package.

2) Verify the report includes the core evidence

  • Electropherogram(s): Raw and/or annotated plots for each dye channel.
  • Allele table: Locus names and allele calls; ideally with peak heights (RFU), flags, and comments.
  • Match calculation method and thresholds used: The method (e.g., similarity based on shared alleles) and any rule statements applied in the interpretation.
  • Database/reference comparison notes: Which database/tool was used (e.g., ATCC's public search) and how the result was interpreted.

3) Decide next actions based on clarity of evidence

  • Clear and internally consistent profile: If the allele table, plots, and comparison align with a standard interpretation rule, archive the package with a short interpretation note.
  • Unclear or conflicting signals: If mixture indicators, locus dropout, or dye artifacts complicate the call, proceed to the troubleshooting sections below and consider a retest or additional verification (e.g., repeat extraction). For ordering or logistics of a retest, see your internal SOP or check the provider's submission instructions. For example, CD Genomics provides research‑use sample requirements and submission steps on its sample submission guidelines (PDF) and Ordering pages.

The Anatomy of an Electropherogram: How to Read the Signal

Capillary electrophoresis (CE) traces are where decisions often begin. Think of each channel as a color-coded lane that separates amplified fragments by size. The x‑axis is fragment size in base pairs (bp); the y‑axis is fluorescence or peak height in relative fluorescence units (RFU). Internal size standards and allelic ladders anchor the calling software's sizing and allele assignments.

Axes and what you're looking at

  • X‑axis (bp): A calibrated scale built from an internal size standard; alleles are defined by repeat count that maps to size windows.
  • Y‑axis (RFU): Peak height/area proportional to signal strength. Labs establish analytical and stochastic thresholds based on negative controls and validation runs, which they document in their SOPs.
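As a minimal sketch of how those thresholds gate peak calling (the RFU values below are placeholders for illustration only; real values come from each lab's negative controls and validation runs, per its SOP):

```python
# Illustrative thresholds only -- each lab derives its own values from
# negative controls and validation runs, documented in its SOP.
ANALYTICAL_THRESHOLD = 50    # RFU: below this, treat the peak as baseline noise
STOCHASTIC_THRESHOLD = 200   # RFU: below this, allele dropout remains plausible

def classify_peak(rfu: float) -> str:
    """Bucket a peak height against lab-validated thresholds."""
    if rfu < ANALYTICAL_THRESHOLD:
        return "noise"
    if rfu < STOCHASTIC_THRESHOLD:
        return "low-signal"   # call it, but flag dropout risk at heterozygous loci
    return "reportable"
```

A peak at 120 RFU under these placeholder values would be called but flagged, prompting the heterozygote-balance and dropout checks discussed later.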

True alleles vs. stutter peaks vs. artifacts

  • True allele: Usually the dominant, well-shaped peak in an expected size bin for the locus.
  • Stutter: A smaller peak—commonly one repeat unit shorter than the main allele—caused by polymerase slippage during PCR. In mixtures, stutter can be confused with low-level second contributors; interpretation needs caution and, when necessary, repeat runs or alternative checks. Guidance on mixtures highlights how stutter complicates interpretation in multi-source profiles, as discussed in the 2024 supplemental document from NIST.
  • Artifacts/noise: Baseline spikes or spectral issues (pull‑up/bleed‑through) from very strong peaks or suboptimal spectral calibration. Manufacturer technical notes recommend dilution and recalibration when pull‑up is suspected and stress setting analytical thresholds empirically to separate noise from signal.

Recognizing mixture or contamination patterns

Consider "mixture" when you see:

  • More than two allele peaks across multiple loci (beyond biological cases like aneuploidy in tumor lines).
  • Inconsistent peak height ratios across loci that don't track with expected heterozygote balance.
  • A recurring set of unexpected peaks across the profile that cannot be explained by stutter alone.
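A hedged sketch of the first indicator, counting loci that carry more than two alleles, might look like this. The 30% locus fraction is an arbitrary illustration, not a validated rule; aneuploid tumor lines can legitimately exceed two alleles at some loci.

```python
def mixture_suspect(profile, max_alleles=2, locus_fraction=0.3):
    """Flag a profile when an unusually large share of loci carry extra alleles.

    profile: dict mapping locus name -> set of allele calls.
    Returns (is_suspect, flagged_loci). Thresholds are illustrative only;
    biology (e.g., aneuploidy in tumor lines) and your SOP set the real rule.
    """
    flagged = sorted(loc for loc, alleles in profile.items()
                     if len(alleles) > max_alleles)
    return len(flagged) / len(profile) >= locus_fraction, flagged
```

A profile with three-allele calls at 4 of 10 loci would trip this illustrative rule and route the review to the troubleshooting steps below.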

CE precision matters for interpretation

High-quality allele calls depend on the precision of the capillary electrophoresis platform: sizing drift, poor spectral calibration, or inconsistent injection conditions propagate directly into allele assignments and similarity scores.


Reading the Allele Table for STR Profile Interpretation: Locus-by-Locus

The allele table is the structured counterpart to the plots. It is where concordance is counted.

What each field typically means

  • Locus name: The STR marker (e.g., D5S818, D13S317) defined in the human authentication panel; ASN‑0002 specifies a core set of autosomal loci plus Amelogenin for standard practice.
  • Allele calls: Integer (and sometimes microvariant) calls that represent repeat counts at each locus—the core data used to read an STR profile.
  • Optional support fields: Peak heights (RFU), flags (off‑ladder, OL; microvariant, e.g., 14.2), and free‑text comments.

Common patterns to watch for

  • Missing loci/allele dropout: Weak or absent peaks due to low template, inhibition, or degradation. Cross‑check RFU values against your lab's analytical/stochastic thresholds and consider a repeat with adjusted input.
  • Multiple alleles at a locus: Potential mixture. Examine whether extra peaks align with known stutter positions; if not, investigate contamination or cross‑mixing.
  • Systematic shifts across several loci: Could reflect genetic instability (e.g., MSI) or technical sizing issues. Compare to earlier in‑house references if available and review instrument QC.
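To make the dropout check concrete, a small comparison helper (locus names and calls below are hypothetical) can list which reference alleles are absent from the sample profile. Whether a gap is true dropout or just low signal is then decided against RFU thresholds and the plots, not by this listing alone.

```python
def dropout_report(query, reference):
    """For each reference locus, list alleles the query profile lacks.

    query/reference: dict mapping locus name -> set of allele calls.
    A locus missing from the query entirely reports all of its reference
    alleles. Interpreting a gap (dropout vs. low input vs. instability)
    still requires the RFU values and electropherograms.
    """
    return {
        locus: sorted(ref_alleles - query.get(locus, set()))
        for locus, ref_alleles in reference.items()
        if ref_alleles - query.get(locus, set())
    }
```

An empty dictionary means every reference allele was observed; any entry is a locus to cross-check against thresholds before concluding discordance.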

Handling microvariants and off‑ladder (OL) calls

Microvariants (e.g., 14.2) are alleles that fall between the main ladder bins by a fraction of a repeat. Off‑ladder calls are peaks that do not match an allelic ladder designation. In either case:

  • Verify sizing precision against the internal size standard and, if needed, re‑run to confirm reproducibility.
  • Consult authoritative resources on microvariants and OL handling; document any rule applied (e.g., treating a consistent microvariant as a valid allele with precise sizing notes).
  • When in doubt, note the observation in the interpretation memo and consider an additional confirmation run.

If instability is suspected in tumor-derived lines, consider dedicated assays for microsatellite instability. For background on research‑use MSI testing methodologies, see Microsatellite Instability Analysis.


Match Percentages and Similarity: What They Mean (and Don't)

Most reports include a similarity or "match %." It's helpful—but it isn't the final arbiter under ASN‑0002. Here's how to read it responsibly.

The Tanabe algorithm (simplified)

ATCC's public STR search tools describe similarity calculations that compare allele concordance across shared loci and output a percentage. The widely referenced Tanabe similarity treats each locus as a set of alleles and scores overlap across the profile. The exact implementation can differ between tools; ATCC's tutorial also mentions filtering results by similarity bands (e.g., ≥80%). Use the tool's documentation to understand how partial profiles and microvariants are handled.

A worked mini‑example (hypothetical)

Imagine your sample and reference share 12 loci. At 10 loci, all alleles match exactly; at 2 loci, one allele differs by one repeat while the other matches. A simple set‑based similarity would credit each locus for overlapping alleles and divide by the total alleles across shared loci to produce a percentage. If that yields ~92%, your initial reading may be "generally consistent." But you'd still ask: Are those discordant loci known to drift in this line? Do the plots show clean peaks with no OL flags? Did you analyze enough loci? The decision rests on the concordance evidence and documented rule, not the percentage alone.
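The set-based arithmetic in that mini-example can be sketched as follows. This is a simplified Tanabe-style score; real tools differ in how they weight partial profiles and microvariants, so treat it as an illustration, not a reimplementation of any specific database's algorithm. The locus names and allele calls in the comments are hypothetical.

```python
def tanabe_similarity(query, reference):
    """Tanabe-style score: 2 * shared alleles / (query alleles + reference
    alleles), counted over loci typed in both profiles, as a percentage.

    query/reference: dict mapping locus name -> set of allele calls.
    """
    shared_loci = set(query) & set(reference)
    if not shared_loci:
        raise ValueError("no shared loci to compare")
    shared = sum(len(query[loc] & reference[loc]) for loc in shared_loci)
    total = sum(len(query[loc]) + len(reference[loc]) for loc in shared_loci)
    return 100.0 * 2 * shared / total

# Hypothetical 12-locus comparison: 10 heterozygous loci match fully, and
# 2 heterozygous loci each share one of two alleles.
# Shared alleles = 10*2 + 1 + 1 = 22; total = 24 + 24 = 48;
# score = 2*22/48 = ~91.7%, matching the "~92%" reading in the text.
```

Note that the score alone cannot tell you whether the two discordant loci reflect drift, a microvariant, or a genuine mismatch; that judgment comes from the allele table and plots.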

Typical ranges in practice (as an interpretation aid)

  • 100%: Strongly consistent with the reference profile used—then verify the underlying evidence (full loci coverage, clean plots).
  • Around 90%: Generally consistent; inspect discordant loci, off‑ladder notes, and whether instability could explain differences.
  • Below ~80%: Often treated as inconsistent or requiring deeper investigation depending on loci coverage and the quality of the reference.

These practical bands are discussed in the literature on match criteria for human cell line authentication; for example, Capes‑Davis and colleagues analyzed large profile sets to propose where to "draw the line" for relatedness. Treat such bands as evidence‑based guidance, not as a replacement for your lab's documented interpretation rules under ASN‑0002.
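Encoded as a first-pass triage aid, the bands above might look like the sketch below. The band edges are the approximate values from the literature discussion, not a substitute for your lab's documented interpretation rule under ASN‑0002.

```python
def triage_band(similarity_pct: float) -> str:
    """Map a similarity percentage to a first-pass triage label.

    Band edges (~80%, 100%) follow the practical ranges discussed above;
    the final call always rests on concordance evidence, not this label.
    """
    if similarity_pct >= 100.0:
        return "consistent -- verify loci coverage and plot quality"
    if similarity_pct >= 80.0:
        return "generally consistent -- inspect discordant loci"
    return "inconsistent or indeterminate -- investigate further"
```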

Important caution

Similarity is a helpful triage metric—not standalone proof. Under ANSI/ATCC ASN‑0002‑2022, the defensible decision comes from allele concordance across shared loci, loci completeness, and the quality of electropherograms, with transparent documentation of the rules used. Always interpret the percentage together with the allele table, plots, and context (passage history, source record). After all, what matters most is this: can you defend your STR profile interpretation to an auditor or reviewer tomorrow?


Cross‑Referencing Databases: Validating Provenance and Traceability

Why database comparison matters

  • Detects known misidentifications and common cross‑contaminations.
  • Strengthens traceability for QA archives, manuscript submissions, and grant packages (e.g., confirming that a line is not on a misidentification registry and that its profile is consistent with the origin record).

ICLAC (the International Cell Line Authentication Committee) explicitly recommends routine authentication of established human cell lines using STR profiling, checking the Registry of Misidentified Cell Lines, and documenting methods, results, and RRIDs in manuscripts and grants.

What to document for traceability

  • Database/reference source used: ATCC, DSMZ, JCRB, or an internal bank record.
  • Version/date if provided; the match method and interpretation rule used.
  • Caveats: Note if the profile is partial (e.g., due to FFPE or low input), if mixture indicators exist, or if instability is suspected.

Series interlink (databases deep dive)

Cross-reference profiles using major STR analysis databases like DSMZ and ATCC to support identity verification and traceability.

If you maintain an internal catalog or use a third‑party service for authentication, document the exact reference (catalog ID, lot, passage) you compared against. For an example of a neutral, research-use service description, see Cell Line Identification and Authentication via STR Profiling.


Troubleshooting "Messy" Peaks: Root Causes and Next Steps

When the report's evidence conflicts or feels borderline, step through likely causes. Here's a practical triage.

Biological causes: genetic instability (including MSI)

What it looks like

  • Locus-to-locus variability; allele shifts; a trend of partial discordance that accumulates over passage.

What to do next

  • Verify passage history and compare to earlier in‑house references. If your line is tumor-derived or PDX‑adapted, instability may be expected. Consider dedicated research‑use tests for MSI or repeat profiling at a future passage to monitor drift. For a methodological overview, see Microsatellite Instability Analysis.

Sample-related causes: low template, degradation, or mixed input

Indicators

  • Weak signals, allele dropout at some loci, heterozygote imbalance, and unexpected multi‑allele patterns.

Technical/analysis causes: baseline noise, pull‑up, thresholds

Indicators

  • Elevated baseline, non‑allelic spikes, dye channel overlap (pull‑up), or off‑scale peaks.

Actions

  • Review raw plots and instrument QC. Consider dilution and updated spectral calibration if pull‑up is suspected. Confirm that analytical and stochastic thresholds used in calling align with your lab's validation. As a rule of thumb, if peaks routinely hit the detector ceiling, dilute and re‑inject to prevent saturation artifacts; if negative controls show frequent spikes above your analytical threshold, revisit maintenance and recalibration schedules.


Decision Guide: When to Escalate for Expert Review

Escalate the review (e.g., to a method lead or an external specialist) when:

  • The similarity falls in an indeterminate range and the electropherogram suggests a mixture or instability.
  • Multiple loci show more than two alleles or repeated unexpected peaks.
  • The profile conflicts with expected provenance (cell bank record or historical in‑house profile).
  • You need a documented interpretation statement for a grant or manuscript evidence package (include plots, allele table, methodology, thresholds, and database comparison notes in that package).

If you are evaluating service partners for authentication as part of your QA program, you may find it useful to review decision criteria such as turnaround time, traceability practices, and batch handling capacity. See the companion guide, Choosing the Right STR Profiling Partner: A Checklist for Biotech & Pharma, for a structured approach to vendor selection within research-use contexts.

For sample handling tips that reduce the chance of repeat runs (e.g., dealing with gDNA, FFPE, or cell pellets), see Optimizing Sample Submission: From gDNA to FFPE and Cell Pellets.

Also, when method accuracy and throughput are a focus, read the technical deep dive on multiplexing and CE precision: Technical Deep Dive: Multiplex PCR and Capillary Electrophoresis Accuracy.


Conclusion: Build a Defensible STR Profile Interpretation Package

A defensible STR profile interpretation combines four pillars:

  • Allele concordance across shared loci under a documented interpretation rule (ANSI/ATCC ASN‑0002‑2022 framework).
  • Completeness and quality of the electropherogram evidence (clear peaks, validated thresholds, clean sizing).
  • A similarity/match metric used appropriately as supporting evidence—never as the sole decision maker.
  • Database cross‑checks and provenance documentation (source record, registry checks, and comparison method/notes).

Archive a complete STR evidence pack: the PDF report, raw/annotated CE plots, allele table, database comparison summary, and your interpretation note referencing the rule you applied. If you suspect instability, mixtures, or technical artifacts, document your rationale and the follow‑up actions (re‑extraction/re‑run, MSI testing, or escalation review). That record will serve you in audits, manuscript submission, and internal reproducibility reviews.



Author

Yang H. — Senior Scientist, CD Genomics; University of Florida.

Yang is a genomics researcher with over 10 years of research experience in genetics, molecular and cellular biology, sequencing workflows, and bioinformatic analysis. Skilled in both laboratory techniques and data interpretation, Yang supports RUO study design and NGS‑based projects.


References:

  1. ATCC. Human Cell STR Testing (confirms use of ANSI/ATCC ASN‑0002‑2022 and database comparison). 2026. Available at: ATCC human cell STR testing.
  2. ATCC. STR Profiling Analysis (overview of interpretation and database comparison process). Available at: ATCC STR Profiling Analysis.
  3. ATCC. ATCC STR Database Tutorial (Tanabe/Masters algorithm options; filtering by similarity). Available at: ATCC STR Database Tutorial PDF.
  4. ICLAC. Guide to Human Cell Line Authentication (STR focus and documentation guidance). 2023. Available at: ICLAC Guide.
  5. ICLAC. Cell Line Checklist for Manuscripts and Grant Applications. 2023. Available at: ICLAC Checklist.
  6. NIST. Supplemental Document to DNA Mixture Interpretation: Issues and Initiatives (notes on stutter and mixture interpretation). NIST IR 8351sup1; 2024. DOI: 10.6028/NIST.IR.8351sup1.
  7. NIJ. STR Data Analysis and Interpretation: Microvariants and Off‑Ladder Alleles (training module). Available at: NIJ training page.
  8. Capes‑Davis A, Reid YA, Kline MC, et al. Match criteria for human cell line authentication: where do we draw the line? International Journal of Cancer. 2013;132(11):2510‑2519. DOI: 10.1002/ijc.27931.
  9. Fan H, Chu JY. A brief review of short tandem repeat mutation. International Journal of Biochemistry & Cell Biology. 2007. DOI: 10.1016/j.biocel.2007.10.005.
  10. Eltonsy N, et al. Detection Algorithm for the Validation of Human Cell Lines. 2012. Available at: PubMed Central article.

Note: This article follows ANSI/ATCC ASN‑0002‑2022 and aligns with ATCC and ICLAC guidance; it does not cite any additional standards frameworks. All content is presented for research use only (RUO).
