HLA Typing Results and Data Delivery

Researchers often put most of their attention on sequencing chemistry and read length. But the practical value of HLA typing lives downstream: how allele calls are represented, what "resolution" means in the report, what gets flagged as ambiguous or no-call, and what supporting evidence/QC is available for review.

This guide is written for research use only (RUO). It explains common reporting conventions for HLA typing results and how to interpret them in a research setting. It does not provide clinical donor matching, transplant eligibility, or patient-care guidance.

Quick answer: what is the final output of HLA typing sequencing?

In most research HLA typing workflows, the final output centers on allele calls per locus (your HLA genotype) and the resolution level those calls support (for example, 2-field versus 4-field nomenclature). Many providers also include supporting elements—such as QC summaries, ambiguity/no-call notes, database/software versioning, and brief method notes—but the exact contents and file formats depend on the service scope and should be confirmed during quotation.

In practice, you can think of the deliverable as three layers:

  1. Primary result layer (always expected): locus list + allele calls + achieved resolution
  2. Interpretation layer (strongly recommended): ambiguity reporting, phasing notes, and what the call means/doesn't mean
  3. Evidence layer (scope-dependent): QC metrics, coverage summaries, workflow/software versions, and (sometimes) access to raw/processed sequencing data

If you're requesting a quote for a large cohort (dozens to thousands of samples), also ask how results are organized across batches and whether a sample manifest (sample IDs, lanes/runs/batches, and per-sample status) is included.

HLA typing results: the core fields you should look for

How to read an HLA allele call

An HLA "allele call" is a standardized name assigned to a specific HLA sequence variant at a specific locus (for example HLA-A, HLA-B, HLA-DRB1). Modern naming is maintained through official nomenclature and reference databases; for human HLA alleles, the IPD-IMGT/HLA database (EMBL-EBI) is the canonical public reference used widely for nomenclature and sequence definitions in research contexts (see the IPD-IMGT/HLA Database, EMBL-EBI).

A typical allele call looks like this:

  • HLA-A*02:01:01:01

It contains:

  • Locus (HLA-A)
  • An asterisk (*)
  • Fields separated by colons (02:01:01:01)

Those fields correspond to progressively finer distinctions in the allele definition. A clear way to remember it is:

  • Earlier fields: what protein variant it encodes
  • Later fields: which synonymous or non-coding variants distinguish it from other alleles that look the same at the protein level

Many HLA reports also carry suffix letters that indicate expression status (for example, N for a null allele or L for low cell-surface expression). When present, those suffixes are part of the allele name and should be retained during downstream analysis and reporting.
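
The name structure described above can be illustrated with a short parser. This is a minimal Python sketch, not an official or complete implementation: the regular expression, the `AlleleName` type, and the `parse_allele` function are hypothetical names introduced here, and the pattern assumes names written with one to four numeric fields and an optional expression suffix letter (N, L, S, C, A, or Q).

```python
import re
from typing import NamedTuple, Optional, Tuple

# Illustrative pattern (an assumption, not an official grammar):
# optional "HLA-" prefix, locus, asterisk, 1-4 colon-separated numeric
# fields, and an optional expression-status suffix letter.
ALLELE_RE = re.compile(
    r"^(?:HLA-)?(?P<locus>[A-Z0-9]+)\*"
    r"(?P<fields>\d+(?::\d+){0,3})"
    r"(?P<suffix>[NLSCAQ])?$"
)


class AlleleName(NamedTuple):
    locus: str
    fields: Tuple[str, ...]
    suffix: Optional[str]


def parse_allele(name: str) -> AlleleName:
    """Split an allele name into locus, fields, and optional suffix."""
    match = ALLELE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognized allele name: {name!r}")
    fields = tuple(match["fields"].split(":"))
    return AlleleName(match["locus"], fields, match["suffix"])
```

For example, `parse_allele("HLA-A*02:01:01:01")` returns locus `A`, the four fields `02`, `01`, `01`, `01`, and no suffix. Keeping the suffix as its own attribute makes it harder to lose during downstream reformatting.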

Key Takeaway: Treat an HLA allele call as a versioned label that depends on both your observed sequence and the reference database used to name it.

What 2-field, 3-field, and 4-field results mean

Most readers first encounter HLA "resolution" as a shorthand like "2-field" or "4-field." In current naming conventions, the number of fields describes how many colon-separated blocks are being specified in the allele name.

A commonly cited explanation of the four-field HLA allele nomenclature is summarized in peer-reviewed tutorials and methods papers (for example, a 2023 statistical genetics tutorial, available on PMC, that describes the field structure and its meaning).

Table: HLA field level vs interpretation

| Field level reported | Example format | What it usually distinguishes | What it does not guarantee |
| --- | --- | --- | --- |
| 1-field | HLA-A*02 | Broad allele group/family | Protein-level uniqueness; phase; full gene coverage |
| 2-field | HLA-A*02:01 | Typically distinguishes amino-acid (protein) differences | Non-coding differences; all synonymous differences; full-length phasing |
| 3-field | HLA-A*02:01:01 | Adds synonymous coding differences (same protein) | Intronic/UTR differences; complete full-gene identity |
| 4-field | HLA-A*02:01:01:01 | Adds non-coding differences (introns/UTRs), depending on what was sequenced | That every sample/locus will reach 4-field; that there is zero ambiguity |

How to interpret this table in practice: field level is a statement about what was confidently resolved, not a promise about what always can be resolved. Two samples can be processed with the same experimental design yet achieve different effective resolution at different loci because of coverage distribution, allelic imbalance, or locus-specific complexity.
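
Because achieved resolution can differ between samples and loci, two calls are only directly comparable at the field level they share. The sketch below shows one way to do that in Python; the function names are hypothetical, and the approach (compare only the leading fields both names report) is an assumption for illustration rather than a standard algorithm.

```python
from typing import List, Tuple


def split_allele(name: str) -> Tuple[str, List[str]]:
    """Split 'HLA-A*02:01:01' into ('HLA-A', ['02', '01', '01'])."""
    locus, sep, fields = name.partition("*")
    if not sep or not fields:
        raise ValueError(f"missing '*' or fields in {name!r}")
    # Drop a trailing expression-suffix letter (e.g. 'N') for this
    # comparison only; keep the full name in your stored results.
    if fields[-1].isalpha():
        fields = fields[:-1]
    return locus, fields.split(":")


def field_level(name: str) -> int:
    """Number of colon-separated fields reported in the name."""
    return len(split_allele(name)[1])


def agree_at_shared_level(a: str, b: str) -> bool:
    """True if two calls match on every field that both of them report."""
    locus_a, fields_a = split_allele(a)
    locus_b, fields_b = split_allele(b)
    shared = min(len(fields_a), len(fields_b))
    return locus_a == locus_b and fields_a[:shared] == fields_b[:shared]
```

Agreement at the shared level does not show that two samples carry the same allele; it only shows that the coarser call cannot distinguish them.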

A note on "4-digit" and "high-resolution" language

Older shorthand such as "2-digit" or "4-digit" is still used in some contexts, but it can be confusing—especially when you're comparing deliverables across studies that used different typing technologies.

  • "2-digit" often maps to something like 1-field (broad group)
  • "4-digit" often maps to something like 2-field (protein-level allele)

However, these digit labels are not a substitute for stating the field level actually reported and the sequence scope (exons-only vs full gene). If you're comparing vendors or methods, ask them to confirm resolution explicitly as field-based reporting (for example, "2-field" or "3-field"), plus how ambiguous results are represented.

Why "more fields" can matter in research

Higher-field resolution can matter when:

  • you're tracking fine-grained population genetics or allele frequency differences
  • you need reproducible mapping between cohorts typed at different times
  • you're building a reference panel or a dataset intended for re-use, where future re-analysis may depend on knowing what was (and wasn't) sequenced

But higher-field reporting is not automatically "better" for every research question. The right target is the one that matches your downstream analysis and your tolerance for ambiguity.

Database versions: why two correct reports can look different

A point that's easy to miss during procurement is that allele naming is database-dependent. For human HLA, most workflows reference IPD-IMGT/HLA releases for allele definitions and naming. IPD-IMGT/HLA is maintained as a versioned database, with ongoing updates and releases by EMBL-EBI/IMGT.

What that means for your lab notebook and data management:

  • Two projects can both be "correct" yet use slightly different allele name strings if they used different reference releases.
  • When you compare cohorts across time (or across providers), the safest practice is to record the reference database name and release/version in the deliverables and in your internal metadata.

If a quote doesn't explicitly state whether the report will include reference database versioning, it's worth asking for it up front—because adding it later is harder than capturing it at the moment results are generated.
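
One lightweight way to capture this in internal metadata is to store the naming basis next to every result set and check it before merging cohorts. The following is a minimal Python sketch under stated assumptions: the `NamingBasis` class, its attribute names, and the placeholder release and workflow strings are hypothetical, and real release identifiers should be copied exactly as the provider reports them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NamingBasis:
    database: str          # e.g. "IPD-IMGT/HLA"
    release: str           # release string exactly as reported by the provider
    workflow_version: str  # tool or pipeline identifier, as reported


def naming_mismatches(a: NamingBasis, b: NamingBasis) -> List[str]:
    """Return the names of the attributes that differ between two result sets."""
    return [
        name
        for name in ("database", "release", "workflow_version")
        if getattr(a, name) != getattr(b, name)
    ]


# Placeholder values for illustration only.
cohort_1 = NamingBasis("IPD-IMGT/HLA", "release-X", "pipeline-1.0")
cohort_2 = NamingBasis("IPD-IMGT/HLA", "release-Y", "pipeline-1.0")
differences = naming_mismatches(cohort_1, cohort_2)
```

A non-empty result (here, a difference in `release`) is a prompt to check whether allele names need harmonizing before the cohorts are compared.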

For background on how IPD-IMGT/HLA is structured and maintained, see the IPD-IMGT/HLA "Database and Genomics" help page.

What may be included in an HLA typing report?

A research HLA typing report is usually a structured summary intended to be reviewable by scientists and (in large programs) by procurement/compliance stakeholders. That said, there is no single universal template, and exact contents and file formats vary by provider and project scope.

If you're managing a large study (for example, a screening cohort or multi-site collection), "final output form" often means more than one file or table. Beyond per-sample allele calls, you may need:

  • a sample manifest that ties your sample IDs to run/batch identifiers and result status
  • consistent, machine-readable columns for locus coverage and call status (called / ambiguous / partial / no-call)
  • a place to capture notes that matter for downstream filtering (for example, "low coverage at locus X")

Because these cohort-scale logistics vary widely, it's reasonable to ask for a short example of the column headers or a minimal schema during quotation—without assuming a specific file type.
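
As a starting point for that conversation, the sketch below writes a small per-sample, per-locus table with explicit call-status values. It is an illustration only: the column names, the sample and batch identifiers, and the choice of CSV are assumptions made here, not a provider format, and the actual schema and file type should be confirmed during quotation.

```python
import csv
import io

# Hypothetical column names for a minimal per-sample, per-locus table.
COLUMNS = [
    "sample_id", "batch_id", "locus", "allele_1", "allele_2",
    "field_level", "call_status", "notes",
]
CALL_STATUSES = {"called", "ambiguous", "partial", "no-call"}


def write_manifest(rows) -> str:
    """Serialize rows to CSV text, rejecting unknown call-status values."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        if row["call_status"] not in CALL_STATUSES:
            raise ValueError(f"unknown call_status: {row['call_status']!r}")
        writer.writerow(row)
    return buffer.getvalue()


# Made-up example rows.
example = write_manifest([
    {"sample_id": "S001", "batch_id": "B01", "locus": "HLA-A",
     "allele_1": "HLA-A*02:01", "allele_2": "HLA-A*24:02",
     "field_level": 2, "call_status": "called", "notes": ""},
    {"sample_id": "S001", "batch_id": "B01", "locus": "HLA-DRB1",
     "allele_1": "", "allele_2": "", "field_level": "",
     "call_status": "no-call", "notes": "low coverage at locus"},
])
```

Using a fixed vocabulary for `call_status` lets downstream filters separate "attempted but not called" from "called" without parsing free-text notes.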

Below is a practical way to think about "report elements"—the fields that commonly appear, what they mean, and why they matter when you're interpreting allele calls.

Figure: Long-read HLA typing report elements, including allele calls, field resolution, and QC notes. HLA typing reports should make loci, allele calls, resolution, and interpretation notes easy to review.

Table: Report element vs what it means vs why it matters

| Report element (typical) | What it means | Why it matters for interpretation |
| --- | --- | --- |
| Sample identifier(s) | Your project's sample IDs, sometimes plus internal run IDs | Prevents mix-ups; critical for cohort-scale studies and re-contact |
| Loci typed | Which HLA genes were attempted/reported (e.g., A, B, C, DRB1, DQB1) | Avoids false assumptions ("no result" vs "not targeted") |
| Allele calls per locus | The called alleles (usually two for diploid loci) | The primary research result |
| Resolution / field level | The reported nomenclature detail level per locus | Tells you what downstream comparisons are valid |
| Ambiguity notes / alternative alleles | Whether multiple allele explanations fit the observed data | Prevents over-interpretation; guides follow-up strategy |
| No-call / partial call flags | Locus attempted but not confidently called | Useful for planning re-sequencing or excluding loci in analysis |
| Phasing status (scope-dependent) | Whether variants are assigned to haplotypes (cis/trans resolved) | Reduces genotype ambiguity; helps explain why some calls remain ambiguous |
| QC summary (scope-dependent) | Read depth/coverage/quality indicators used to accept results | Lets you interpret confidence and troubleshoot low-performing samples |
| Database/reference version | Which HLA reference database release was used for allele naming | Makes results reproducible across time and prevents naming drift |
| Software/workflow version | Tool and pipeline version used for calling | Important because algorithm differences affect ambiguity handling |
| Method notes | Target region (exons vs full gene), sequencing design, key limitations | Explains expected resolution and common failure modes |

How to interpret this table in practice: A "good" report is not only a list of alleles—it's a record of what was tested, what level of certainty was reached, and what assumptions the allele naming depends on (database + workflow). If any of those are missing, you'll spend time reverse-engineering what the allele call actually represents.
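
A quick completeness check can surface missing elements before analysis starts. The sketch below is a hypothetical Python example: the key names in `EXPECTED_ELEMENTS` are assumptions chosen to mirror the table above, and a real report would need to be mapped onto them first.

```python
from typing import Dict, List

# Hypothetical keys mirroring the report elements in the table above.
EXPECTED_ELEMENTS = (
    "sample_id",
    "loci_typed",
    "allele_calls",
    "field_level",
    "database_release",
    "workflow_version",
)


def missing_elements(report: Dict[str, object]) -> List[str]:
    """List expected elements that are absent or empty in a report record."""
    return [key for key in EXPECTED_ELEMENTS if not report.get(key)]


# Made-up record that lacks versioning information.
record = {
    "sample_id": "S001",
    "loci_typed": ["HLA-A", "HLA-B"],
    "allele_calls": {"HLA-A": ["HLA-A*02:01", "HLA-A*24:02"]},
    "field_level": 2,
}
gaps = missing_elements(record)
```

In this example `gaps` lists the two versioning elements, which are the ones that would otherwise have to be reverse-engineered later.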

Why long-read sequencing can help result interpretation

HLA loci are among the most polymorphic regions in the human genome, and they create well-known challenges for sequence-based typing: high diversity, closely related alleles, and ambiguity when a tested region doesn't contain enough distinguishing information.

Long-read sequencing can help by increasing the chance that the data spans multiple informative variants on the same molecule, which supports phasing: determining which variants co-occur on the same haplotype. A peer-reviewed 2020 review on nanopore/long-read HLA typing describes how long reads can improve high-resolution typing and reduce certain ambiguity classes, while also emphasizing ongoing limitations such as platform error profiles.

Here's what that means in report terms:

  • Fewer "either/or" allele sets in some cases, because phase and full-length context can disambiguate allele combinations that look identical in short segments
  • More interpretable "why this is ambiguous" notes, because ambiguity often maps to specific unsequenced regions or unresolved phase blocks
  • Better traceability for full-gene differences, which can matter when two alleles are identical in peptide-binding exons but differ elsewhere
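
The phasing idea behind these points can be shown with a toy example. This Python sketch is a deliberate simplification: it assumes error-free reads and just two heterozygous sites, the positions and bases are made up, and `phase_two_sites` is a hypothetical helper rather than how production HLA callers work.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Each read maps a position to the base observed there (made-up data).
Read = Dict[int, str]


def phase_two_sites(
    reads: List[Read], pos_a: int, pos_b: int
) -> Optional[List[Tuple[str, str]]]:
    """Return up to two haplotypes supported by reads spanning both sites.

    Returns None when no read covers both positions, in which case the
    phase between the two sites cannot be determined from these reads.
    """
    spanning = Counter(
        (read[pos_a], read[pos_b])
        for read in reads
        if pos_a in read and pos_b in read
    )
    if not spanning:
        return None
    return sorted(hap for hap, _ in spanning.most_common(2))


# Short reads: each covers only one site, so phase stays unresolved.
short_reads = [{100: "A"}, {100: "C"}, {900: "G"}, {900: "T"}]
# Long reads: each spans both sites, so the cis combinations are visible.
long_reads = [{100: "A", 900: "G"}, {100: "A", 900: "G"}, {100: "C", 900: "T"}]

unphased = phase_two_sites(short_reads, 100, 900)
phased = phase_two_sites(long_reads, 100, 900)
```

With the short reads the function returns `None`, which corresponds to an "either/or" allele set in a report; with the long reads it returns the two haplotypes A-G and C-T.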

At the same time, long reads do not turn HLA typing into a magic, ambiguity-free output. Remaining limitations can include:

  • uneven locus coverage and allele imbalance
  • platform-specific error patterns (for example, homopolymer sensitivity)
  • reference/database limitations (naming depends on what is known and curated)

Pro Tip: If your downstream analysis depends on specific ambiguity being resolved, ask in advance how ambiguous calls will be represented (alternative alleles vs groups) and what follow-up options exist within your project's scope.
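
When a provider reports ambiguity as a list of alternative alleles, one option for downstream analysis is to fall back to the highest field level at which all alternatives agree. The Python sketch below illustrates that idea only; `shared_prefix_call` is a hypothetical helper, it is not an implementation of any standardized allele grouping, and it leaves any expression suffix attached to the last field.

```python
from typing import List, Optional


def shared_prefix_call(alternatives: List[str]) -> Optional[str]:
    """Return the longest leading run of fields shared by all alternatives.

    Returns None if the alternatives already differ in the first field.
    A suffix letter stays attached to the final field, so '01N' and '01'
    are treated as different values.
    """
    loci = {name.partition("*")[0] for name in alternatives}
    if len(loci) != 1:
        raise ValueError("alternatives must come from exactly one locus")
    field_lists = [name.partition("*")[2].split(":") for name in alternatives]
    shared = []
    for column in zip(*field_lists):
        if len(set(column)) != 1:
            break
        shared.append(column[0])
    if not shared:
        return None
    return f"{loci.pop()}*{':'.join(shared)}"
```

For instance, the alternatives `HLA-A*02:01:01:01` and `HLA-A*02:01:01:02` collapse to `HLA-A*02:01:01`. Whether this kind of collapsing is acceptable depends on the analysis, so the original alternatives should be kept alongside the collapsed call.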

What to clarify before ordering

When someone asks, "What will the final output form be?" they're usually trying to avoid one of two failure modes:

  1. The project finishes, but the "deliverable" doesn't contain the fields needed for downstream analysis.
  2. The project finishes, but internal reviewers (PI, bioinformatics lead, compliance, procurement) can't audit how calls were made.

This is especially common in large-sample requests, where you need to know not only "what each sample looks like," but also how results are delivered across batches.

Table: Questions to ask before ordering HLA typing data delivery

| Question to confirm | What a clear answer looks like | Why it matters |
| --- | --- | --- |
| Which loci will be reported? | A locus list (class I and/or class II) and whether any are optional | Prevents gaps and mismatched expectations |
| What resolution target do you support for each locus? | Field-based targets (e.g., 2-field, 3-field, 4-field) and when it may drop | Avoids "guaranteed 4-field" assumptions |
| How are ambiguous calls represented? | Alternative allele strings and/or standardized groups; explicit no-call policy | Downstream pipelines must handle ambiguity |
| Will you report database/reference version? | Named database + release/version included in results | Enables reproducibility across time |
| Will you report software/workflow version? | Tool/pipeline name + version or workflow identifier | Helps interpret differences across batches or reanalysis |
| What QC is included (minimum)? | At least per-sample pass/fail notes; optionally depth/coverage summaries | Explains confidence and failure modes |
| Do you provide raw sequencing data? | Clear yes/no; if yes, specify scope (raw reads vs processed) during quotation | Needed for internal reanalysis or method validation |
| How will results be organized for large cohorts? | Sample manifest, batch/run identifiers, and consistent sample naming rules | Prevents operational chaos at scale |
| What metadata do you need from us? | Sample type, extraction method, concentration/quality metrics, and manifest fields | Prevents avoidable failures and delays |

How to interpret this table in practice: if you can answer these questions before ordering, you can usually draft a one-page internal "deliverables spec" that keeps the project aligned across wet lab, bioinformatics, and procurement.

For project context, see the CD Genomics service page for long-read HLA typing. (Final deliverable contents should still be confirmed during quotation because project scope and options vary.)

How to talk about "raw data" without assuming file types

Many quote requests ask: "Will we get FASTQ/BAM?" If you don't have confirmed details yet, it's safer to ask for raw reads or processed alignment/consensus outputs by category, then confirm the exact file formats in the quote.

Example wording for a quote request:

  • "Please confirm whether raw sequencing reads and any processed alignment/consensus outputs are included, and provide the deliverable list for the report and any accompanying data tables."

Limitations and RUO note

This article is for research-use-only interpretation of typical HLA typing outputs.

  • It does not provide clinical donor-recipient matching guidance, transplant eligibility decisions, or diagnostic interpretation.
  • HLA report contents and data delivery formats can vary by project scope, loci, and resolution targets.
  • If your study requires a specific reporting level or a specific set of output fields, confirm those items during quotation and document them in the statement of work.

FAQ

What is the final output of HLA typing?

The final output is typically a per-sample, per-locus table of allele calls (your HLA genotype) plus an explicit indication of the resolution level achieved (for example, 2-field versus 4-field naming). Many research reports also include supporting notes—such as which loci were attempted, whether any results are ambiguous or no-call, and what reference database version was used for naming. The exact report layout and accompanying data files vary by provider and project scope, so the safest approach is to request a deliverables list during quotation and confirm how ambiguity and QC will be documented.

What does 4-field HLA typing mean?

"4-field" refers to reporting an allele name with four colon-separated fields (for example, HLA-A*02:01:01:01). In general, earlier fields tend to capture protein-level differences, while later fields distinguish synonymous coding changes and non-coding differences (such as introns or UTRs), depending on what was sequenced and how the allele was assigned. In research projects, a key point is that "4-field supported" does not automatically mean every locus in every sample will reach that resolution—coverage, locus complexity, and ambiguity can reduce effective resolution for specific calls. Ask providers to state resolution per locus.

Will I receive raw sequencing data?

It depends on the service scope and what is agreed in the quotation. Some projects deliver only a final report/table of allele calls, while others can include raw reads and/or processed data outputs for internal reanalysis and validation. If raw data is important for your study, request it explicitly and ask what categories of data are included (raw reads versus processed alignments/consensus sequences) and how results are organized for large cohorts. Because file formats and transfer methods can differ by project and compliance requirements, confirm the exact data package and delivery method during quotation.

What happens if an allele call is ambiguous?

An ambiguous call means the observed sequencing evidence matches more than one possible allele explanation at the requested resolution. In a report, ambiguity may be represented as a set of alternative alleles, a grouped representation, or a locus flagged for partial/no-call depending on project policy. Ambiguity can occur when the sequenced region lacks discriminating positions, when phase cannot be resolved, or when coverage/quality is insufficient for a unique call. If your downstream analysis needs a single allele per locus, clarify in advance how ambiguity will be handled, whether follow-up sequencing/analysis options exist, and how ambiguity is encoded in the final table.

Does long-read sequencing remove all HLA ambiguity?

No. Long reads can reduce some common ambiguity types by improving phasing and providing longer contiguous sequence context, which helps separate variants into the correct haplotypes. However, ambiguity can still occur due to uneven coverage, allelic imbalance, platform-specific error patterns, and reference database limits (allele naming depends on curated sequences). A well-designed long-read HLA workflow should therefore still report ambiguity/no-call conditions transparently rather than implying a perfect "always 4-field, always unique" outcome. If ambiguity resolution is critical for your study endpoints, confirm the expected ambiguity policy during quotation.

Can HLA typing results be used clinically?

No. Results from a research-use-only service, and the guidance in this article, are not suitable for clinical use. Clinical decisions (such as donor matching, transplant eligibility, or diagnostic interpretation) require validated clinical workflows, regulated reporting, and clinical context that is outside the scope of RUO sequencing deliverables. Research HLA typing outputs are valuable for study design, mechanistic research, cohort stratification, and assay development, but they should not be treated as a substitute for clinical-grade testing. If your project has any clinical implications, discuss requirements with qualified clinical laboratories and confirm the appropriate intended-use and regulatory framework.

Next steps

If you're preparing a quote request, keep it simple and specific:

Share your requested loci, resolution target, sample number, and desired report/data format to discuss long-read HLA typing deliverables.

Author

Dr. Yang H., Senior Scientist at CD Genomics

Dr. Yang's work focuses on long-read sequencing study design, research-grade HLA typing workflows, and interpreting technical outputs (allele calls, resolution, QC, and ambiguity notes) for reproducible downstream analysis.

For Research Use Only. Not for use in diagnostic procedures.