Decoding HLA Typing Sequencing Results: From Allele Nomenclature to Report Interpretation

The Human Leukocyte Antigen (HLA) system is the core component of the major histocompatibility complex (MHC), and its accurate interpretation underpins organ transplantation matching, disease association research, and immunotherapy. With high-throughput sequencing now widely used for HLA typing, understanding the basic structure, naming rules, and clinical significance of HLA typing results has become an essential skill for researchers and clinicians.

This article systematically introduces how to interpret HLA typing and sequencing results, and provides guidance from allele naming to report interpretation for organ transplantation matching and disease association research.

Basic Composition of HLA Typing Results

HLA typing results are a key basis for immunogenetic research and clinical application. Built on gene loci and standardized allele names, they form a molecular map of an individual's immune characteristics, which lays a foundation for analyzing immune mechanisms and making clinical decisions.

Structural and Functional Classification of Gene Loci

The HLA system is located on the short arm of human chromosome 6 (6p21.3) and contains more than 200 gene loci. According to function and polymorphism, HLA genes can be divided into two categories: classical and non-classical. Classical HLA genes play a key role in the immune response and include the class I (HLA-A, HLA-B, HLA-C) and class II (HLA-DR, HLA-DQ, HLA-DP) loci. The molecules encoded by these genes present antigens and are highly polymorphic, which enables different individuals to recognize and respond to a wide variety of antigens. Non-classical genes, such as HLA-E and HLA-F, have relatively low polymorphism and mainly play a role in immune regulation.

In HLA typing results, the classical loci are the main focus of clinical attention, especially HLA-A, HLA-B, HLA-C, HLA-DRB1, and HLA-DQB1. Polymorphism at these loci is directly related to the probability of transplant rejection and to disease susceptibility.

Result of HLA typing (Lee et al., 2024)

International Standard System for Allele Naming

The naming of HLA alleles follows the standardized rules formulated by the WHO Nomenclature Committee for Factors of the HLA System, and allele names are updated regularly in the IMGT/HLA database (now IPD-IMGT/HLA) to keep them accurate and current. The rules adopt a hierarchical structure that presents allele information in a clear and orderly way, mainly including the following elements.

  • Locus: Indicates the HLA gene locus, such as HLA-A, HLA-B, HLA-DRB1, etc.
  • Locus group (field 1): Represents the allele group, written as digits after the asterisk, such as HLA-A*01 for group 01 of the A locus.
  • Allele (field 2): Distinguishes alleles within the same group that encode different proteins, such as HLA-A*01:01, the first allele in the A*01 group.
  • Subtype (fields 3 and 4): Further distinguishes sequence differences between alleles. For example, HLA-A*01:01:01 is a subtype of A*01:01 that differs only by synonymous substitutions in the coding region, and a fourth field marks differences in non-coding regions.
  • Modifier: An optional suffix letter that expresses special characteristics of an allele, such as N (null, not expressed) or L (low cell-surface expression).
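
The hierarchy above maps naturally onto a small parser. The following is a minimal sketch, not a validated implementation: the names `HlaAllele` and `parse_allele` are hypothetical, and the pattern only covers the common `HLA-<locus>*<field>:<field>...<suffix>` layout with the suffix letters N, L, S, C, A, and Q.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Matches names such as "HLA-A*01:01:01:01" or "DRB1*04:01".
# The "HLA-" prefix is optional; up to four colon-separated fields are accepted.
_ALLELE_RE = re.compile(
    r"^(?:HLA-)?(?P<locus>[A-Z]+[0-9]*)\*"
    r"(?P<fields>\d{2,}(?::\d{2,}){0,3})"
    r"(?P<suffix>[NLSCAQ])?$"
)


class HlaAllele(NamedTuple):
    locus: str                 # e.g. "A" or "DRB1"
    fields: Tuple[str, ...]    # e.g. ("01", "01", "01")
    suffix: Optional[str]      # expression modifier such as "N" or "L", if any


def parse_allele(name: str) -> HlaAllele:
    """Split an allele name into locus, fields, and optional modifier."""
    match = _ALLELE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognised HLA allele name: {name!r}")
    fields = tuple(match.group("fields").split(":"))
    return HlaAllele(match.group("locus"), fields, match.group("suffix"))


if __name__ == "__main__":
    print(parse_allele("HLA-A*01:01:01"))
    print(parse_allele("HLA-DRB1*04:01"))
```

A name without the asterisk separator (for example a serological designation such as HLA-B44) is rejected on purpose, because it does not carry allele-level fields.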

HLA nomenclature (Xie et al., 2010)

How to Analyse an HLA Typing Report

An HLA typing report is the key basis for clinical transplantation matching and disease association research, and it contains multi-dimensional information such as gene loci and alleles. Because reports are structurally complex and dense with technical terms, a systematic reading strategy is needed to extract key information accurately and guide decision-making.

Resolution Hierarchy and Data Reliability Identification

The resolution of the HLA typing report directly determines its clinical application value, which is usually divided into three levels:

  • Serological level: Only the antigen group (such as HLA-B44) can be determined; obtained by traditional serological methods or low-resolution genotyping.
  • Gene level (2-digit): Such as HLA-B*44, obtained by PCR-SSP or short-read sequencing.
  • Sequence level (4-digit and above): Such as HLA-B*44:02, which depends on high-throughput sequencing and bioinformatics analysis.
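
As a rough illustration of the three levels, the resolution of a reported value can be read off the number of colon-separated fields after the asterisk. This is a sketch under simple assumptions: the function names `field_count`, `resolution_level`, and `truncate` are hypothetical, and a value without an asterisk is treated as serological.

```python
def field_count(name: str) -> int:
    """Number of fields after '*'; 0 when there is no '*' (serological value)."""
    if "*" not in name:
        return 0
    fields = name.split("*", 1)[1].rstrip("NLSCAQ")  # drop an expression suffix
    return len(fields.split(":"))


def resolution_level(name: str) -> str:
    """Map a reported value to one of the three levels described above."""
    n = field_count(name)
    if n == 0:
        return "serological level"
    if n == 1:
        return "gene level (2-digit)"
    return "sequence level (4-digit and above)"


def truncate(name: str, n_fields: int) -> str:
    """Reduce an allele name to its first n_fields fields.

    Note: this also drops any expression suffix (for example N), so it is
    only suitable for comparing typings, not for reporting.
    """
    prefix, rest = name.split("*", 1)
    fields = rest.rstrip("NLSCAQ").split(":")
    return prefix + "*" + ":".join(fields[:n_fields])


if __name__ == "__main__":
    for value in ("HLA-B44", "HLA-B*44", "HLA-B*44:02"):
        print(value, "->", resolution_level(value))
    print(truncate("HLA-B*44:02:01:01", 2))
```

Truncating both typings to a common number of fields before comparing them avoids reporting a spurious mismatch that is only a difference in resolution.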

The resolution and data quality parameters to check in the report are as follows:

  • Coverage: A depth of ≥200× is recommended for class I loci and ≥300× for class II loci; a depth below 100× may lead to ambiguous typing.
  • Base quality: The average Phred score of hypervariable regions (such as HLA-A exon 2) should be ≥30; regions scoring below 20 should be checked for sequencing errors.
  • Allele support: Each allele should be supported by ≥10 reads, and the read ratio between the two alleles of a heterozygote should fall within 0.5-2.0. Deviation from this range may indicate an abnormal sample.
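
The thresholds listed above can be collected into a simple per-locus check. The sketch below is illustrative only: the `LocusQC` record and `qc_flags` function are hypothetical names, the input values are assumed to have been extracted from the report already, and the cut-offs are the ones quoted in this section rather than a universal standard.

```python
from dataclasses import dataclass
from typing import List, Tuple

CLASS_I_LOCI = {"A", "B", "C"}


@dataclass
class LocusQC:
    locus: str                     # e.g. "A" or "DRB1"
    mean_depth: float              # mean coverage depth over the locus
    mean_phred: float              # mean Phred score in hypervariable exons
    allele_reads: Tuple[int, ...]  # supporting reads per called allele


def qc_flags(qc: LocusQC) -> List[str]:
    """Return human-readable warnings; an empty list means all checks passed."""
    flags = []
    recommended = 200 if qc.locus in CLASS_I_LOCI else 300
    if qc.mean_depth < 100:
        flags.append("depth below 100x: ambiguous typing is likely")
    elif qc.mean_depth < recommended:
        flags.append(f"depth below the recommended {recommended}x")
    if qc.mean_phred < 20:
        flags.append("Phred below 20: possible sequencing errors")
    elif qc.mean_phred < 30:
        flags.append("Phred below the recommended 30")
    if any(reads < 10 for reads in qc.allele_reads):
        flags.append("an allele is supported by fewer than 10 reads")
    if len(qc.allele_reads) == 2 and min(qc.allele_reads) > 0:
        ratio = qc.allele_reads[0] / qc.allele_reads[1]
        if not 0.5 <= ratio <= 2.0:
            flags.append(f"heterozygote read ratio {ratio:.2f} outside 0.5-2.0")
    return flags


if __name__ == "__main__":
    print(qc_flags(LocusQC("A", 250, 35, (120, 110))))
    print(qc_flags(LocusQC("B", 80, 18, (30, 5))))
```

Returning a list of warnings rather than a single pass/fail value keeps the reason for each flag visible to whoever reviews the report.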

Analysis of typing accuracy as a function of coverage depth (Ka et al., 2017)

Interpretation Process of Clinical Decision-oriented Report

In organ transplantation matching, report interpretation should follow the logic below.

  • Priority of core loci: First, confirm the 4-digit typing results of HLA-A, HLA-B, HLA-C, and HLA-DRB1. Mismatches at these four loci are the main cause of transplant rejection.
  • Resolution verification: Check whether the key loci meet the clinically required resolution (for example, 4 digits for kidney transplantation and 6 digits for hematopoietic stem cell transplantation); if they do not, the reason should be stated.
  • Mismatch analysis: Calculate the number of amino acid mismatches between donor and recipient, focusing on the antigen-binding groove region (such as amino acids 67-74 of HLA-DRB1), where a mismatch is more likely to trigger an immune response.
  • Ambiguous result handling: For loci marked as Ambiguous, evaluate how the ambiguity affects matching.
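
A first pass over the core loci can be sketched as an allele-level comparison. This is a simplified illustration, not the amino-acid mismatch calculation described above (which needs the protein sequences from the allele database): the function name `count_mismatches` is hypothetical, the typings are made-up examples, and both typings are assumed to be at the same 4-digit resolution.

```python
from typing import Dict, Sequence, Tuple

CORE_LOCI: Tuple[str, ...] = ("A", "B", "C", "DRB1")

Typing = Dict[str, Sequence[str]]  # locus -> the two reported alleles


def count_mismatches(donor: Typing, recipient: Typing,
                     loci: Sequence[str] = CORE_LOCI) -> Dict[str, int]:
    """Count distinct donor alleles that the recipient does not carry.

    This is the host-versus-graft direction: a homozygous donor allele that
    the recipient also carries contributes no mismatch.
    """
    return {
        locus: len(set(donor[locus]) - set(recipient[locus]))
        for locus in loci
    }


if __name__ == "__main__":
    donor = {
        "A": ("A*01:01", "A*01:01"),
        "B": ("B*08:01", "B*44:02"),
        "C": ("C*07:01", "C*05:01"),
        "DRB1": ("DRB1*03:01", "DRB1*04:01"),
    }
    recipient = {
        "A": ("A*01:01", "A*02:01"),
        "B": ("B*08:01", "B*07:02"),
        "C": ("C*07:01", "C*07:02"),
        "DRB1": ("DRB1*03:01", "DRB1*04:01"),
    }
    print(count_mismatches(donor, recipient))
```

Swapping the two arguments gives the graft-versus-host direction, which is the more relevant count in hematopoietic stem cell transplantation.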

In disease association studies, the focus of interpretation shifts to allele frequency and population genetics analysis.

  • Compare the HLA allele frequencies of the disease group and the control group, and evaluate statistical significance with the chi-square test or Fisher's exact test.
  • Pay attention to the relationship between "shared epitopes" (such as the QKRAA sequence of HLA-DRB1*04:01) and autoimmune diseases; these motifs often take part in key interactions during antigen presentation.
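
The case-control comparison can be illustrated with a two-sided Fisher's exact test on a 2×2 table of allele carriers. The sketch below uses only the standard library; the function names are hypothetical and the counts in the example are invented for illustration. In practice an established statistics package would normally be used, with correction for the number of alleles tested.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Row 1: cases (carriers, non-carriers); row 2: controls.
    The p-value sums the probabilities of all tables with the same margins
    that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)  # tolerance for float rounding
    )
    return min(1.0, p_value)


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Sample odds ratio; infinite when b * c is zero."""
    return (a * d) / (b * c) if b * c else float("inf")


if __name__ == "__main__":
    # Invented counts: 30 of 100 cases and 12 of 100 controls carry the allele.
    print(fisher_exact_two_sided(30, 70, 12, 88))
    print(odds_ratio(30, 70, 12, 88))
```

Fisher's exact test is the safer choice when any cell of the table is small, which is common for rare HLA alleles; the chi-square test is an approximation that works well for larger counts.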

For tumor samples, compare the HLA typing of tumor and normal tissue to identify regions with loss of heterozygosity (LOH), which may point to a mechanism of tumor immune escape.

Association of all imputed four-digit HLA alleles with PheWAS phenotypes by HLA locus (Karnes et al., 2017)

Identification and Explanation of HLA Homozygote and Heterozygote

The HLA system is one of the most polymorphic genetic regions in the human genome, and identifying its special genotypes is very important for analyzing immune response, disease susceptibility, and transplantation suitability. These genotypes, including homozygotes, rare allele combinations, and specifically linked haplotypes, carry distinct biological significance in population evolution, disease occurrence, and immune regulation, and they link genetic variation to phenotypic differences.

Molecular Characteristics and Clinical Effects of Homozygotes and Heterozygotes

HLA homozygotes (individuals carrying two identical alleles at a locus) occur in about 5%-15% of the population, and identifying them requires combining sequencing data with statistical analysis.

  • Molecular characteristics: Homozygotes show high coverage of a single allele, while heterozygotes show reads distributed evenly between the two alleles (ideal ratio of 1:1). When the heterozygote read ratio falls outside the range of 0.8-1.2, allele deletion or tumor LOH should be suspected.
  • Clinical risk: Homozygous individuals have reduced diversity of immune recognition and increased susceptibility to infectious diseases. Studies have reported that the clearance efficiency of HLA-B*15 homozygotes is 40% lower than that of heterozygotes. In organ transplantation, however, homozygous donors may reduce the number of mismatches for recipients and lower the risk of rejection.
  • Evolutionary significance: Some homozygous haplotypes (such as HLA-A*01-B*08-DRB1*03 in the Nordic population) are related to historical survival advantages and may have been preserved through population bottleneck effects; their high frequency suggests a link to resistance against specific pathogens.
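
The read-ratio rule described above can be sketched as a small classifier for one locus. The function name `assess_locus` and the returned labels are hypothetical, and the read counts in the example are invented. A single supported allele is labelled an apparent homozygote only, because read counts alone cannot separate true homozygosity from loss of the second allele.

```python
from typing import Dict


def assess_locus(read_counts: Dict[str, int],
                 low: float = 0.8, high: float = 1.2) -> str:
    """Classify a locus from the reads supporting each called allele.

    read_counts maps allele name -> supporting reads. The ratio is taken as
    first allele / second allele in insertion order.
    """
    supported = {allele: n for allele, n in read_counts.items() if n > 0}
    if len(supported) == 1:
        return "apparent homozygote: confirm before reporting"
    if len(supported) != 2:
        raise ValueError("expected one or two supported alleles")
    first, second = supported.values()
    ratio = first / second
    if low <= ratio <= high:
        return "balanced heterozygote"
    return "imbalanced heterozygote: check for allele deletion or LOH"


if __name__ == "__main__":
    print(assess_locus({"B*15:01": 210, "B*44:02": 195}))
    print(assess_locus({"B*15:01": 300, "B*44:02": 90}))
    print(assess_locus({"B*15:01": 400}))
```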

Haplotype Inference and Linkage Disequilibrium Analysis

Linkage disequilibrium among HLA genes makes haplotype inference (a haplotype being the combination of genes on the same chromosome) key to interpretation.

  • Technical method: Long-read sequencing (such as PacBio HiFi) can read complete haplotypes directly, while short-read data must be phased by inference from linkage disequilibrium using algorithms such as PHASE and fastPHASE.
  • Clinical application: In hematopoietic stem cell transplantation, the degree of HLA haplotype matching between donor and recipient is more important than single-locus matching, and complete haplotype matching can significantly reduce the incidence of graft-versus-host disease (GVHD).
  • Population genetics: Haplotype frequency analysis can trace the migration history of a population. For example, the HLA-A*33:03-B*58:01 haplotype is common in East Asian populations but rare in European populations. This difference in distribution can serve as a genetic marker of population differentiation.
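
Linkage disequilibrium between two alleles can be quantified from the haplotype frequency and the two allele frequencies with the standard measures D, D', and r². The sketch below is a worked example with invented frequencies; the function name is hypothetical, and the inputs are assumed to be population frequencies strictly between 0 and 1.

```python
from typing import Tuple


def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float
                           ) -> Tuple[float, float, float]:
    """Return (D, |D'|, r^2) for alleles A and B.

    p_ab: frequency of the haplotype carrying both alleles
    p_a, p_b: frequencies of the individual alleles (0 < p < 1)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared


if __name__ == "__main__":
    # Invented example: allele frequencies 0.06 and 0.07, haplotype 0.05.
    d, d_prime, r2 = linkage_disequilibrium(0.05, 0.06, 0.07)
    print(f"D = {d:.4f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

A D of zero means the haplotype occurs as often as expected from independent assortment; values of D' close to 1 indicate that the two alleles are almost always inherited together.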

Validation of the HLA-Upgrade module in the European (EUR, left) and African-American (AFA, right) populations, with the post-probability threshold set at 0% (Geffard et al., 2020)

How to Deal with Ambiguous HLA Typing Results

Ambiguity in HLA typing is a key challenge that affects data accuracy and clinical decision-making. Ambiguous results are caused by the limitations of sequencing and the high similarity between alleles. If not handled properly, they may lead to transplantation matching errors or bias in disease association research, so a systematic handling strategy is essential.

Generation Mechanism and Classification of Ambiguous Results

Ambiguous results in HLA typing mainly arise from three mechanisms: limitations of the sequencing data, high sequence similarity between alleles, and inherent bias in the algorithm model. According to the degree and cause of the ambiguity, they can be divided into the following categories.

  • Insufficient-coverage ambiguity: Because of insufficient sequencing depth or low capture efficiency in the target region, key polymorphic sites are not covered.
  • Sequence-similarity ambiguity: The two alleles have identical sequences in the sequenced region and differ only in regions that were not covered.
  • Heterozygote ambiguity: In a heterozygous sample, the read distributions of the two alleles are similar, and several allele combinations are consistent with the observed distribution.
  • Algorithm-limitation ambiguity: The assumptions of the algorithm model (such as Hardy-Weinberg equilibrium) do not match the actual sample, so the genotype inference is wrong.

Hierarchical Solution Strategy of Ambiguous Results

To deal with ambiguous typing results, adopt a hierarchical strategy that works step by step from the data level to the algorithm level and, finally, to experimental verification.

  • A. Data level optimization
    • a) Increase the sequencing depth: Perform targeted resequencing of ambiguous loci to raise coverage above 500×, paying particular attention to hypervariable regions (such as exons 2-3 of HLA class I genes and exon 2 of HLA class II genes).
    • b) Switch to long-read sequencing: For key samples, use PacBio HiFi or Oxford Nanopore to read the complete allele sequence directly and avoid the assembly ambiguity of short reads.
    • c) Single-molecule amplification: Use single-cell PCR or molecular cloning to isolate and sequence the HLA gene from a single chromosome, which yields haplotype information directly.
  • B. Algorithm level optimization
    • a) Multi-tool cross-validation: Run OptiType, Kourami, and HLA*LA on the same data and compare the results. For example, if OptiType and Kourami agree at a locus but HLA*LA differs, give priority to the result shared by the first two.
    • b) Custom reference database: According to the research needs, build a custom database containing specific alleles. When studying a specific population, add the high-frequency alleles of this population that are not included in the public database.
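    • c) Adjust algorithm parameters: Adjust the parameters of typing tools for ambiguous loci. For example, OptiType can be asked to report several candidate genotype combinations instead of only the single best solution (its "--enumerate" option; check the documentation of the installed version for the exact flag).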
  • C. Experimental verification
    • a) Sanger sequencing: Sanger sequencing of ambiguous regions reads the DNA sequence directly and is the gold standard for verifying HLA typing.
    • b) Allele-specific PCR: Design specific primers for the ambiguous alleles and verify the presence of the target allele by PCR amplification.
    • c) Molecular cloning and sequencing: Clone the PCR product into a vector and determine the sequence of each allele by sequencing individual clones.
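
The multi-tool comparison in step B can be sketched as a majority vote per locus. This is a minimal illustration: `consensus_call` and `concordance` are hypothetical helper names, the calls are invented, and the outputs of the different tools are assumed to have been converted to the same naming format and resolution beforehand. It does not call any of the tools themselves.

```python
from collections import Counter
from typing import Dict, Optional, Tuple

Genotype = Tuple[str, str]


def normalise(genotype: Genotype) -> Genotype:
    """Sort the two alleles so that allele order does not affect comparison."""
    first, second = sorted(genotype)
    return (first, second)


def consensus_call(calls: Dict[str, Genotype]) -> Optional[Genotype]:
    """Return the genotype reported by a strict majority of tools, else None.

    calls maps tool name -> genotype at one locus.
    """
    votes = Counter(normalise(g) for g in calls.values())
    genotype, count = votes.most_common(1)[0]
    return genotype if 2 * count > len(calls) else None


def concordance(tool_a: Dict[str, Genotype],
                tool_b: Dict[str, Genotype]) -> float:
    """Fraction of shared loci at which two tools report the same genotype."""
    shared = sorted(set(tool_a) & set(tool_b))
    if not shared:
        raise ValueError("no loci in common")
    agree = sum(normalise(tool_a[l]) == normalise(tool_b[l]) for l in shared)
    return agree / len(shared)


if __name__ == "__main__":
    calls = {
        "OptiType": ("A*02:01", "A*01:01"),
        "Kourami": ("A*01:01", "A*02:01"),
        "HLA*LA": ("A*01:01", "A*02:06"),
    }
    print(consensus_call(calls))
```

A return value of None marks the locus as unresolved at the algorithm level, so it can be passed on to the experimental verification step.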

Reduction of ambiguous allele combinations with the AVITA Plus Assay for the HLA-B locus (Park et al., 2014)

Reliability Verification Method of Analysis Results

The accuracy of HLA typing results is very important for clinical diagnosis, treatment, and scientific research. If the typing results are unreliable, organ transplantation matching may fail and the conclusions of disease association research may be wrong. It is therefore important to establish a sound reliability verification method, as discussed below.

  • Multi-tool cross-validation: Analyze the data with at least two typing tools, such as OptiType and Kourami, and compare the results. If the consistency of the two tools is ≥95%, the results can be considered highly reliable. For inconsistent loci, check the original data carefully or verify further by long-read sequencing to determine the correct typing result.
  • Verification by population genetics: Look up the frequency of the target allele in the tested population in the Allele Frequency Net Database (AFND). If the frequency of an allele is less than 0.001 and there is no literature report, suspect either a typing error or a newly discovered allele. In this case, confirm the result by Sanger sequencing or other methods.
  • Family linkage analysis: In studies of genetic diseases, the HLA haplotype linkage relationships within a family are an effective way to verify typing results. Because HLA haplotypes are inherited according to Mendel's laws, the haplotypes of parents and children should be consistent with them. An inconsistency suggests a typing error or a sample mix-up, which should be investigated and corrected promptly.
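
The family check can be sketched at the level of single loci: a child's genotype is consistent when one allele can come from the mother and the other from the father. This is a simplified illustration with invented typings and hypothetical function names; it tests allele transmission per locus and does not reconstruct full haplotype segregation across loci.

```python
from typing import Dict, List, Sequence

Typing = Dict[str, Sequence[str]]  # locus -> the two reported alleles


def mendelian_consistent(child: Sequence[str], mother: Sequence[str],
                         father: Sequence[str]) -> bool:
    """True if one child allele is found in each parent at this locus."""
    first, second = child
    return ((first in mother and second in father)
            or (second in mother and first in father))


def inconsistent_loci(child: Typing, mother: Typing,
                      father: Typing) -> List[str]:
    """List the loci at which the trio violates Mendelian inheritance."""
    return [
        locus for locus in sorted(child)
        if not mendelian_consistent(child[locus], mother[locus], father[locus])
    ]


if __name__ == "__main__":
    mother = {"A": ("A*01:01", "A*03:01"), "B": ("B*08:01", "B*07:02")}
    father = {"A": ("A*02:01", "A*24:02"), "B": ("B*44:02", "B*15:01")}
    child = {"A": ("A*01:01", "A*02:01"), "B": ("B*08:01", "B*35:01")}
    print(inconsistent_loci(child, mother, father))
```

A flagged locus does not say which of the three samples is wrong, so the raw data of all family members should be reviewed before any result is corrected.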

Histogram of HLA-check distance (Jeanmougin et al., 2017)

Conclusion

The interpretation of HLA typing results has developed into an interdisciplinary system integrating molecular genetics, immunobiology, and clinical decision-making. With the development of single-cell sequencing and AI-assisted typing algorithms, future interpretation systems will move towards automation and intelligence, but standardized quality control and multi-dimensional verification remain the cornerstones of reliable results. Researchers should keep track of updates to the IMGT/HLA database and international typing guidelines, and establish a dynamic, continually optimized interpretation process in practice, so that HLA typing data can truly become a bridge between gene polymorphism and clinical phenotype.

References

  1. Lee MY, You E. "The Importance of Confirming False Homozygosity in Pretransplant HLA Typing Results of Patients with Hematologic Malignancies." Int J Med Sci. 2024 21(13): 2430-2436 https://doi.org/10.7150/ijms.99883
  2. Xie M, Li J, Jiang T. "Accurate HLA type inference using a weighted similarity graph." BMC Bioinformatics. 2010 11 Suppl 11(Suppl 11): S10 https://doi.org/10.1186/1471-2105-11-s11-s10
  3. Ka S, Lee S., et al. "HLAscan: genotyping of the HLA region using next-generation sequencing data." BMC Bioinformatics. 2017 18(1): 258 https://doi.org/10.1186/s12859-017-1671-3
  4. Karnes JH, Bastarache L., et al. "Phenome-wide scanning identifies multiple diseases and disease severity phenotypes associated with HLA variants." Sci Transl Med. 2017 9(389): eaai8708 https://doi.org/10.1126/scitranslmed.aai8708
  5. Geffard E, Limou S., et al. "Easy-HLA: a validated web application suite to reveal the full details of HLA typing." Bioinformatics. 2020 36(7): 2157-2164 https://doi.org/10.1093/bioinformatics/btz875
  6. Park Y, Yoon CE., et al. "Resolution of ambiguous HLA genotyping in Korean by multi-group-specific sequence-based typing." Yonsei Med J. 2014 55(4): 1005-13 https://doi.org/10.3349/ymj.2014.55.4.1005
  7. Jeanmougin M, Noirel J., et al. "HLA-check: evaluating HLA data from SNP information." BMC Bioinformatics. 2017 18(1): 334 https://doi.org/10.1186/s12859-017-1746-1
For research purposes only, not intended for clinical diagnosis, treatment, or individual health assessments.




Copyright © CD Genomics. All Rights Reserved.