Resolving Ambiguous HLA Calls: Why Phase, Heterozygosity, and Rare Alleles Matter

Figure: Four common sources of ambiguous HLA calls in cell therapy research

Ambiguous HLA calls can be frustrating when a research team needs allele-level clarity for cell line documentation, cohort analysis, immune assay interpretation, or gene editing support. But ambiguity does not automatically mean a run failed. In many cases, it reflects the biology and the evidence boundary: highly polymorphic HLA genes, heterozygous allele combinations, cis–trans phase uncertainty, variants outside the sequenced region, or imperfect matching to reference databases. The key question is not whether ambiguity exists, but whether it changes a decision in your study. This guide helps you classify common ambiguity types, estimate project risk, and decide when to ignore, review, confirm, or plan a higher-resolution strategy.

Disclosure: This educational content may link to CD Genomics service pages for readers who want operational support. Those links are provided for convenience and do not replace independent method selection, validation, or expert review.

Key takeaways for ambiguous HLA calls

Ambiguous HLA calls often occur because multiple allele combinations can explain the same sequence evidence.
Heterozygosity, phase uncertainty, limited region coverage, and rare or novel alleles are common sources of ambiguity.
The importance of ambiguity depends on whether it changes a research decision, not on whether it appears in the report.
Sanger confirmation can help answer targeted sequence questions, but it may not resolve long-range phase ambiguity.
Long-read sequencing is most useful when phasing or full-length HLA context is central to the project.
The most efficient strategy is to define critical samples and confirmation rules before sequencing begins.

Why HLA typing becomes a project question in cell therapy research

For many cell therapy research teams, high-resolution HLA typing is a way to turn immune background into a documented, comparable project variable. It helps reduce unknowns that otherwise surface later as irreproducible assays, confusing donor-to-donor differences, or hard-to-interpret engineered clones.

The practical question behind the search

Most teams are trying to answer a practical question: will HLA typing change what we do next? Common contexts include allogeneic CAR-T research models, iPSC-derived cell models, donor-derived primary cell studies, engineered (including HLA-modified) cell lines, and immune recognition or antigen presentation assays.

Across these scenarios, HLA typing typically supports sample characterization, model selection, editing planning, immune assay interpretation, and risk reduction across the project.

What "high-resolution" means in this article

HLA results are often reported at 2-field, 3-field, or 4-field resolution. You do not need a nomenclature tutorial to make a decision, but one rule of thumb helps: a 2-field result distinguishes alleles that encode different protein sequences, which is often enough for consistent sample annotation, while 3-field and 4-field results add synonymous coding and non-coding differences that can matter when allele-level context affects comparisons, tracking, or experimental design.
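To make the field levels concrete, here is a minimal Python sketch that parses an allele name and truncates it to a chosen number of fields. It assumes the standard colon-separated naming pattern (for example HLA-A*02:01:01:01, with an optional expression suffix such as N or L). The function names are hypothetical, and dropping the suffix when fields are removed is a simplifying assumption, not a reporting rule from this article.

```python
import re

# Matches names such as "HLA-A*02:01:01:01" or "A*24:02:01:02L".
ALLELE_RE = re.compile(
    r"^(?:HLA-)?(?P<gene>[A-Z0-9]+)\*"
    r"(?P<fields>\d+(?::\d+){0,3})"
    r"(?P<suffix>[NLSCAQ]?)$"
)


def parse_allele(name):
    """Split an allele name into gene, list of fields, and expression suffix."""
    match = ALLELE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognizable HLA allele name: {name!r}")
    return match.group("gene"), match.group("fields").split(":"), match.group("suffix")


def truncate_allele(name, n_fields=2):
    """Return the allele name reduced to at most n_fields fields."""
    if n_fields < 1:
        raise ValueError("n_fields must be at least 1")
    gene, fields, suffix = parse_allele(name)
    kept = fields[:n_fields]
    # Assumption: the suffix is kept only when no field was dropped.
    tail = suffix if len(kept) == len(fields) else ""
    return f"HLA-{gene}*{':'.join(kept)}{tail}"


print(truncate_allele("HLA-A*02:01:01:01"))     # 2-field view of a 4-field call
print(truncate_allele("HLA-DRB1*15:01:01", 4))  # already below 4 fields, unchanged
```

Normalizing every call to the same field level before comparison avoids treating two reports of the same allele at different resolutions as a mismatch.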

Cell therapy research creates HLA questions earlier than many teams expect

In many programs, the most important time to define HLA background is before immune assays and before editing, while your sample set is still flexible.

Donor-derived cells need traceable HLA backgrounds

Donor-derived primary cells are easy to treat as interchangeable until a co-culture readout becomes difficult to reproduce. If HLA background is not documented, differences in immune recognition context can become a hidden variable. The point is not clinical matching. It is to preserve interpretability when you compare donors, batches, or study sites.

iPSC-derived cell lines add another layer of HLA planning

iPSC-derived models can behave like a resource you will reuse, not a one-off sample. That makes traceable HLA background more valuable: ambiguous calls that are tolerable for an exploratory assay can become expensive later when you want the same line to function as a reference material.

A 2024 review of haplobank initiatives summarizes how HLA haplotype coverage is treated as a design variable when building iPSC resources (see: Current Landscape of iPSC Haplobanks (2024)). For research teams, the transferable insight is that HLA-informed planning often moves earlier as lines become long-lived resources.

Engineered cells need a baseline before editing

If a project involves class I or class II modification, B2M-related edits, HLA-E/HLA-G expression studies, or immune-evasive cell model design, baseline typing helps you avoid designing against an assumed sequence and helps you interpret edited clones consistently.

If your workflow includes edit verification, downstream validation via targeted sequencing such as CRISPR Validation Sequencing is easier to scope when baseline context is clear.

What high-resolution HLA typing helps project teams decide

The value of high-resolution typing is clearest when you can point to a decision it makes easier. This is also where HLA typing ambiguity becomes meaningful: you are deciding what evidence is good enough for that decision, not chasing perfect certainty everywhere.

Which cell lines or donor materials are worth prioritizing

When you have multiple donor samples, primary cell batches, or cell line candidates, HLA typing supports structured comparison across loci such as HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1, and HLA-DPB1. It can also help you recognize patterns such as apparent homozygosity, partial shared haplotypes, or class I versus class II differences that matter for certain assay contexts.
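A structured comparison of this kind can be sketched in a few lines of Python. The typing results below are made-up illustration data for two hypothetical candidates, and the function names are not from any particular tool. A locus is flagged as "apparent" homozygous because allele dropout can produce the same pattern.

```python
# Made-up typing results: two reported alleles per locus for each candidate.
TYPING = {
    "cand_01": {
        "A": ("A*02:01", "A*02:01"),
        "B": ("B*07:02", "B*44:02"),
        "C": ("C*07:02", "C*05:01"),
    },
    "cand_02": {
        "A": ("A*02:01", "A*24:02"),
        "B": ("B*07:02", "B*35:01"),
        "C": ("C*07:02", "C*04:01"),
    },
}


def apparent_homozygous_loci(sample):
    """Loci where both reported alleles are identical."""
    return sorted(locus for locus, (a1, a2) in sample.items() if a1 == a2)


def shared_alleles(sample_a, sample_b):
    """Alleles reported in both samples, grouped by locus."""
    shared = {}
    for locus in sorted(set(sample_a) & set(sample_b)):
        common = set(sample_a[locus]) & set(sample_b[locus])
        if common:
            shared[locus] = sorted(common)
    return shared


print(apparent_homozygous_loci(TYPING["cand_01"]))
print(shared_alleles(TYPING["cand_01"], TYPING["cand_02"]))
```

The comparison is only meaningful when both samples are reported at the same resolution, so calls should be normalized to a common field level first.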

Whether a lower-resolution result is enough

Not every study needs maximal resolution. A practical mapping is:

Research situation	HLA typing need
Basic sample annotation	2-field may be enough
Donor or cell line comparison	High-resolution typing is preferred
HLA editing or allele-specific studies	Allele-level information is often needed
Rare allele context or persistent ambiguity	Consider orthogonal confirmation

How HLA background shapes assay interpretation

HLA background can change antigen presentation context and immune recognition, which can shape T cell co-culture assay interpretation. That is why many teams treat HLA typing as part of immune assay metadata, alongside readouts such as receptor sequencing or transcriptomics. Related resources for multi-assay designs include BCR and TCR Sequencing and Single Cell RNA Sequencing.

Why HLA is technically difficult to type accurately

Typing difficulty is not specific to any single lab or platform. It arises when extreme polymorphism and complex heterozygosity collide with read-length limits, sequence homology, and incomplete reference characterization.

The HLA region is one of the most polymorphic parts of the human genome

A reference database is not optional for HLA typing, and the database itself continues to expand. A Nucleic Acids Research update reported that the IPD-IMGT/HLA database contains more than 43,000 unique alleles (see: IPD-IMGT/HLA database: recent developments… (2025)). In practice, this scale is one reason rare HLA alleles and database-matching edge cases can produce ambiguous calls.

In addition, full-length characterization is not uniform across alleles. Work to resolve unknown nucleotides in full-length sequences highlights why some calls can remain hard to finalize in rare or poorly characterized contexts (see: Resolving unknown nucleotides in the IPD-IMGT/HLA Database (2024)).

Homology and phase can create ambiguous calls

An ambiguous call often appears when two allele combinations fit the same short-read evidence because distant polymorphisms are not connected on a single read pair. In HLA, that can present as a cis–trans phase ambiguity problem. In practical terms, HLA phasing is the missing link: without it, you may know which variants are present, but not which ones belong on the same allele.
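The toy Python sketch below illustrates why phase matters. It uses four hypothetical alleles described only by two distant polymorphic positions (names and bases are invented for illustration). Two different allele pairs produce the same unphased genotype, and a single observation that links both positions on one molecule is enough to separate them.

```python
from itertools import combinations_with_replacement

# Hypothetical alleles: (base at position 1, base at distant position 2).
ALLELES = {
    "toy*01": ("A", "G"),
    "toy*02": ("C", "T"),
    "toy*03": ("A", "T"),
    "toy*04": ("C", "G"),
}


def unphased_genotype(allele_a, allele_b):
    """Bases seen at each position, with no information about linkage."""
    return tuple(frozenset(bases) for bases in zip(ALLELES[allele_a], ALLELES[allele_b]))


def pairs_matching(observed):
    """All allele pairs whose unphased genotype equals the observed one."""
    return [
        pair
        for pair in combinations_with_replacement(sorted(ALLELES), 2)
        if unphased_genotype(*pair) == observed
    ]


def pairs_supported_by_long_read(pairs, read):
    """Keep pairs containing an allele that matches a read spanning both positions."""
    return [pair for pair in pairs if any(ALLELES[a] == read for a in pair)]


observed = unphased_genotype("toy*01", "toy*02")
candidates = pairs_matching(observed)
print(candidates)  # two pairs explain the same short-read evidence
print(pairs_supported_by_long_read(candidates, ("A", "G")))  # phase picks one
```

Real callers work with many positions and read-level likelihoods, so this is only a picture of the logic, not of any specific typing algorithm.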

Figure: Cis–trans phase ambiguity diagram for HLA typing

Long-read approaches can help because they provide longer-range context that supports phasing and full-length allele resolution. For example, a 2025 benchmark describes long-read HLA inference performance on PacBio HiFi and Oxford Nanopore datasets (see: Streaming long-read sequence alignments for HLA… (2025)).

Why project teams should think about ambiguity before the study starts

Ambiguity is easier to handle when it is planned for. The more critical the material (a master cell bank candidate, a key engineered iPSC clone, a rare donor-derived sample), the more valuable it is to define a confirmation rule up front.

When cell therapy teams should choose high-resolution HLA typing

High-resolution typing tends to pay off when decisions depend on allele-level background.

Scenario 1: Allogeneic cell model or donor material comparison

When you compare multiple donors or build internal panels, higher resolution reduces hidden variability and makes repeated immune assay comparisons more interpretable.

Scenario 2: iPSC-derived cell line research

If iPSC-derived lines are intended to be reused or shared across internal studies, the bar for traceability is higher. HLA-informed planning is increasingly discussed in the context of iPSC resource design (see the same 2024 review cited earlier).

Scenario 3: HLA editing or immune-evasive cell engineering

If the research goal involves HLA knockout, retention, replacement, or expression modulation, baseline typing helps reduce avoidable rework in guide design and clone interpretation.

If you are planning downstream review, CRISPR Off-Target Validation Sequencing can be positioned as a separate validation step after baseline documentation.

Scenario 4: Antigen presentation or T cell recognition studies

HLA genotype defines the peptide presentation context in which immune recognition happens. For co-culture studies and antigen presentation models, HLA typing can be a key interpretation variable.

What loci and resolution should be considered

The right loci and resolution set should follow the research question.

Class I loci: HLA-A, HLA-B, and HLA-C

Class I loci often matter most for T cell-related assays, donor-derived immune cell studies, and class I editing contexts.

Class II loci: HLA-DRB1, HLA-DQB1, HLA-DPB1, and related loci

Class II loci often matter more in antigen-presenting cell models, CD4+ T cell-related research, and immune activation contexts.

2-field vs 4-field: a practical decision rule

A pragmatic approach is to set different resolution floors for different decisions: early sample annotation can often start at 2-field, while comparison studies, engineered cell line development, and long-lived reference resources benefit from higher-field typing. When a rare allele signal or persistent ambiguity is likely to change decisions, consider full-gene context or orthogonal confirmation.

How to triage ambiguous HLA calls

Ambiguity should be treated as a decision point, not an automatic trigger for the most complex sequencing option.

Figure: Decision tree for triaging ambiguous HLA calls

A useful triage sequence is:

Does the ambiguity change the research decision? If not, document it and move on.
Is the sample critical or irreplaceable? If yes, your tolerance for ambiguity is lower.
What is the likely source of ambiguity? Phase, limited region coverage, homology, or a rare/novel allele context point to different next steps.
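The three questions above can be sketched as a small Python function. The action labels ("document", "review", "confirm") and the rule that a critical sample escalates from review to confirmation are assumptions made for illustration. The next-step hints paraphrase the ambiguity table in this section, and the source keys ("phase", "region", "repeat", "rare_allele") are hypothetical.

```python
from dataclasses import dataclass

# Hints paraphrased from the ambiguity-type table; keys are hypothetical labels.
NEXT_STEP_HINTS = {
    "phase": "long-read typing for phasing",
    "region": "re-run with expanded locus or amplicon coverage",
    "repeat": "RRA-aware labeling and orthogonal confirmation of the repeat region",
    "rare_allele": "update the database and re-call, then confirm critical positions",
}


@dataclass(frozen=True)
class AmbiguousCall:
    locus: str
    changes_decision: bool
    sample_critical: bool
    likely_source: str


def triage(call):
    """Return (action, next-step hint) for one ambiguous call."""
    # Question 1: if the decision is unaffected, document it and move on.
    if not call.changes_decision:
        return ("document", None)
    # Question 2: critical or irreplaceable samples get a lower tolerance.
    action = "confirm" if call.sample_critical else "review"
    # Question 3: the likely source points to the next step.
    hint = NEXT_STEP_HINTS.get(
        call.likely_source, "classify the source before choosing a method"
    )
    return (action, hint)


print(triage(AmbiguousCall("HLA-B", True, True, "phase")))
```

Writing the rule down, even this crudely, makes it possible to agree on it before sequencing begins rather than case by case afterward.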

Ambiguity type	What it often looks like in a report	When it can change a decision	Common next step
Phase-limited (cis–trans unresolved)	Two (or more) allele pairs fit the same short-read evidence; alternatives differ at distant polymorphisms	When you need allele-specific interpretation, haplotype tracking, or edit design against a specific allele	Long-read typing for phasing, or targeted long-range confirmation if available
Region-limited coverage	Multiple candidate alleles share identical sequence in the sequenced region; discriminating variants lie outside captured exons/amplicons	When 3-field or 4-field resolution is required for comparison, traceability, or downstream annotations	Re-run with expanded locus/amplicon coverage or a panel designed for higher resolution
Repeat region ambiguity (RRA)/homopolymer-like ambiguity	Ambiguity clusters around repeat regions; call strings group several alleles with equivalent evidence	When a repeat-associated difference affects your downstream grouping, tracking, or allele-frequency reporting	Apply RRA-aware labeling and consider orthogonal confirmation for the specific repeat region
Rare/novel allele or database-edge case	Low-confidence assignment; best matches depend strongly on database version; "closest allele" behavior	When you are building a long-lived reference material, or when "rare vs common" changes inclusion/exclusion criteria	Update database and re-call; consider long-read full-gene context; confirm critical positions with targeted sequencing
Apparent homozygosity vs allele dropout	One allele dominates; secondary allele evidence is weak or missing	When sample provenance, QC flags, or downstream immunology assumptions depend on true zygosity	Review coverage/QC; repeat library prep or confirm with an orthogonal method before concluding homozygosity
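
The region-limited row in the table above can be illustrated with a toy Python example. Alleles and sequences are hypothetical. The function groups alleles that are identical across the regions an assay actually covers, so each group with more than one member is an ambiguity that only wider coverage can split.

```python
from collections import defaultdict

# Hypothetical alleles with invented per-region sequences.
TOY_ALLELES = {
    "toy*01:01": {"exon2": "ACGT", "exon3": "GGCA", "exon4": "TTAG"},
    "toy*01:02": {"exon2": "ACGT", "exon3": "GGCA", "exon4": "TTGG"},
    "toy*02:01": {"exon2": "ACGA", "exon3": "GGCA", "exon4": "TTAG"},
}


def ambiguity_groups(alleles, covered_regions):
    """Group alleles that cannot be told apart using only the covered regions."""
    groups = defaultdict(list)
    for name, regions in alleles.items():
        key = tuple(regions[region] for region in covered_regions)
        groups[key].append(name)
    return sorted(sorted(group) for group in groups.values())


# Covering exons 2 and 3 only leaves two alleles indistinguishable.
print(ambiguity_groups(TOY_ALLELES, ["exon2", "exon3"]))
# Adding exon 4 separates all three.
print(ambiguity_groups(TOY_ALLELES, ["exon2", "exon3", "exon4"]))
```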

Some ambiguity categories are also worth labeling explicitly. A 2024 commentary discusses repeat region ambiguities (RRAs) in HLA typing and proposes clustering related alleles into short RRA strings to prevent misleading downstream interpretation (see: Identification of Repeat Region Ambiguities in HLA typing… (2024)).

Figure: Research impact matrix for ambiguous HLA calls

How to build HLA typing into a cell therapy research workflow

The biggest usability gains come from planning how results will be used.

Step 1: Define the research decision

Write down what decision the HLA data must support: comparing donor materials, selecting iPSC candidates, designing edits, interpreting immune recognition assays, or building a reference panel.

Step 2: Choose the sample and metadata set

Plan the sample list and metadata as if you will need to reanalyze the project later. Minimal metadata that prevents confusion includes sample source, donor/clone identifiers, passage information, editing status, target loci, and assay context. When projects include high-dimensional data layers, consistent tracking becomes even more important.
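One way to keep that minimal metadata consistent is a small record type with a completeness check. The Python sketch below uses the fields listed above; the class name, field names, and example values are hypothetical and would need to match your own sample-tracking system.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class SampleRecord:
    sample_source: str
    donor_or_clone_id: str
    passage: str
    editing_status: str
    target_loci: list = field(default_factory=list)
    assay_context: str = ""


def missing_fields(record):
    """Names of fields that are still empty and should be filled before typing."""
    return sorted(
        name for name, value in asdict(record).items() if value in ("", [], None)
    )


record = SampleRecord(
    sample_source="iPSC line",
    donor_or_clone_id="clone_07",
    passage="P12",
    editing_status="unedited",
    target_loci=["HLA-A", "HLA-B", "HLA-C"],
)
print(missing_fields(record))  # assay context has not been recorded yet
```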

Step 3: Select a typing strategy

Short-read NGS is a strong fit for many high-resolution typing programs. Sanger confirmation is best used for targeted sequence questions. Long-read sequencing is most useful when full-length context and phasing are core requirements.

A practical reason for this division is read length. A 2023 review summarizes typical read-length differences between short- and long-read approaches (see: Next-Generation Sequencing Technology: Current Trends… (2023)). In HLA typing, read length determines whether polymorphisms that lie far apart in a gene can be observed on the same molecule and therefore linked confidently.

Step 4: Plan how results will be used

If you want the data to reduce rework, plan outputs that plug into your project system, such as a sample annotation sheet, a candidate comparison table, and an edit design record. For teams integrating multiple modalities, it helps to use the same sample identifiers and metadata across related data layers such as Single Cell Genome Sequencing.

A practical example: HLA typing before selecting iPSC-derived cell candidates

This is a hypothetical planning example.

Project situation

A biotech research team has 12 iPSC-derived cell candidates and wants to select 3–4 for downstream immune recognition assays and a gene editing feasibility study. Baseline QC exists, but HLA background is incomplete.

HLA typing questions

They want to know which candidates share similar class I background, whether any show apparent homozygosity, which results contain ambiguity likely to change selection or editing design, and which candidates are suitable as internal reference materials.

How results change the next step

After typing, the team builds a comparison table across class I and class II loci and triages candidates into three groups: "Priority A", "Priority B", and "Requires confirmation". Candidates marked "Requires confirmation" are those where ambiguity is likely to change selection, interpretation, or editing design. The follow-on action is chosen based on why the ambiguity exists: targeted confirmation for narrow questions, and long-read support when phase or full-length context is the limiting factor.
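The bucketing step in this hypothetical example could be sketched as follows. The rule for "Requires confirmation" follows the text. The split between "Priority A" (no ambiguity at all) and "Priority B" (ambiguity that does not change a decision) is an assumption made for illustration, since the example does not define it. Candidate names and data are invented.

```python
# Invented data: each candidate maps to its list of ambiguous calls.
CANDIDATES = {
    "iPSC_01": [],
    "iPSC_02": [{"locus": "HLA-C", "changes_decision": False}],
    "iPSC_03": [{"locus": "HLA-B", "changes_decision": True}],
}


def classify_candidate(ambiguities):
    """Assign one candidate to a bucket based on its ambiguous calls."""
    if any(call["changes_decision"] for call in ambiguities):
        return "Requires confirmation"
    # Assumed rule: tolerable ambiguity lowers priority but does not block use.
    if ambiguities:
        return "Priority B"
    return "Priority A"


def triage_candidates(candidates):
    """Group candidate names by bucket, in a stable order."""
    buckets = {"Priority A": [], "Priority B": [], "Requires confirmation": []}
    for name in sorted(candidates):
        buckets[classify_candidate(candidates[name])].append(name)
    return buckets


print(triage_candidates(CANDIDATES))
```

In practice a team would likely add other selection criteria, such as shared class I background, on top of the ambiguity status.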

Mini case: When a phase ambiguity does matter

A research group was selecting a small set of iPSC-derived candidates to serve as long-lived reference materials for repeated immune recognition assays. Initial short-read HLA typing returned a clean 2-field call at most loci, but one class I locus produced an ambiguous allele-level result where two alternative allele pairs explained the same evidence. For exploratory assays, the team could have documented the ambiguity and moved on.

However, because the line was intended to be reused across studies and referenced in internal documentation, the ambiguity would have made future cross-batch comparisons and allele-specific annotations inconsistent. The team therefore treated this as a decision-changing ambiguity and escalated to a phasing-capable approach to connect distant polymorphisms on the same haplotype. Once the phase was resolved, they froze the final annotation rule for that line (what resolution must be reported, what triggers re-review, and how any future database updates would be handled), reducing downstream rework when new assays were added.

What to discuss with a sequencing partner before starting

Before outsourcing HLA typing, it is more efficient to define scope in decision terms.

The minimum information to prepare

Prepare a concise checklist: research goal, cell type/model, sample type, DNA availability, target loci, desired resolution, number of samples, whether editing is involved, whether ambiguity resolution may be required, and how results will be used downstream.

Questions to ask before ordering

Ask what loci are covered, what resolution can be reported per locus, how ambiguous calls are labeled and handled, what confirmation options exist, and what files and summary tables will be delivered.

What a useful report should enable

A useful report should support comparison, documentation, editing design records, integration with immune assay metadata, and reanalysis later if the project direction changes.

How CD Genomics supports cell therapy research teams

CD Genomics can help research teams plan high-resolution HLA typing projects around cell source, target loci, resolution needs, and downstream use of the data. Services are intended for research use only.

Where the service fits in the research workflow

Common research scenarios include donor-derived cell characterization, iPSC-derived cell line research, engineered cell line baseline typing, HLA editing support, and immune recognition study support. A starting point is the HLA Typing Sequencing service page.

When to request method guidance

If you are deciding between short-read NGS, targeted confirmation, and long-read support, it helps to share sample information, target loci, required resolution, number of samples, the project decision the data must support, and whether editing or immune assay interpretation is involved.

FAQ

Do all cell therapy research projects need high-resolution HLA typing?

No. Early exploratory experiments may only require basic sample annotation. High-resolution typing becomes more valuable when the project relies on donor comparison, engineered cell lines, iPSC-derived models, editing plans, or immune recognition assays where allele-level background changes how results should be interpreted.

Which HLA loci matter most for cell therapy research?

In many T cell-focused assay contexts, class I loci (HLA-A, HLA-B, and HLA-C) are the core loci because they define class I antigen presentation background. Class II loci such as HLA-DRB1, HLA-DQB1, and HLA-DPB1 can become more relevant in antigen-presenting cell models, CD4+ T cell-related studies, or co-culture systems where class II presentation is a meaningful variable.

Is NGS enough for HLA typing in cell therapy research?

Often, yes. Short-read NGS can generate high-resolution calls for many projects. But if your sample shows complex heterozygosity, rare allele signals, or phase ambiguity that affects the study decision, then targeted confirmation or long-read sequencing can be useful because it adds evidence that short reads cannot always provide.

When should HLA typing be done before CRISPR editing?

If you are editing HLA loci or a related immune recognition pathway, typing is most useful before guide design and clone selection. Baseline allele-level context reduces the chance of designing against an assumed sequence and helps you interpret edited clones in a consistent way across the project.

Can HLA typing be combined with immune repertoire or RNA-seq data?

Yes. In multi-omics designs, HLA typing can be treated as sample metadata that supports interpretation of immune receptor data, immune recognition assays, antigen presentation context, and transcriptomic readouts. The practical benefit is not that HLA typing explains all variability, but that it removes a major unknown when integrating results across assays.

For research purposes only, not intended for clinical diagnosis, treatment, or individual health assessments.
