Rare Disease Research with Whole Exome Sequencing

Rare diseases are those with extremely low incidence rates. There are approximately 7,000 rare diseases worldwide, many of which are genetic. Researching these diseases faces significant challenges due to their diverse symptoms, complex pathogenic mechanisms, and lack of effective treatments. In recent years, whole-exome sequencing (WES), a high-throughput genome sequencing technology, has become an important tool for rare disease research. This article will explore the applications and advantages of whole-exome sequencing in rare disease research.

I. Technical Background and Core Advantages

WES uses high-throughput sequencing to target the exon regions (DNA fragments encoding proteins), which make up approximately 1% of the genome, and thereby rapidly identifies pathogenic mutations. Compared to whole genome sequencing (WGS), WES is less expensive (approximately 1/10 the cost of WGS), offers more efficient data analysis, and covers over 85% of known pathogenic mutations. Its core advantages are:

  • Highly efficient location of pathogenic genes: By comparing exon variations between patients and healthy individuals, combined with bioinformatics filtering (such as excluding common variations and screening for functionally damaging mutations), genes associated with rare diseases can be quickly identified. For example, the pathogenic gene DHODH for Miller syndrome was discovered in four patients using WES.
  • Suitable for small sample studies: Traditional linkage analysis requires a large number of family samples, while WES only requires a few patients to discover new pathogenic genes, significantly lowering the research threshold.
  • Support for multi-dimensional analysis: Combining transcriptome and epigenomic data can reveal complex regulatory mechanisms (such as the impact of non-coding region variations on splicing).
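
The filtering strategy described in the first point (exclude common variants, keep functionally damaging ones, then look for genes hit in every patient) can be sketched as follows. This is a minimal illustration, not the pipeline used in any cited study: the `Variant` fields, the `MAX_AF` cutoff, the `DAMAGING` categories, and the toy gene names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    population_af: float  # allele frequency in a reference population (hypothetical field)
    consequence: str      # e.g. "missense", "nonsense", "synonymous"


# Assumed cutoff and categories, chosen only for illustration.
MAX_AF = 0.001
DAMAGING = {"missense", "nonsense", "frameshift", "splice_site"}


def candidate_genes(variants):
    """Genes carrying at least one rare, potentially damaging variant in one patient."""
    return {
        v.gene
        for v in variants
        if v.population_af <= MAX_AF and v.consequence in DAMAGING
    }


def shared_candidates(patients):
    """Intersect the per-patient candidate gene sets across all patients."""
    gene_sets = [candidate_genes(variants) for variants in patients.values()]
    return set.intersection(*gene_sets) if gene_sets else set()


# Toy data: made-up variants, not taken from any study.
patients = {
    "patient_a": [
        Variant("GENE1", 0.0, "missense"),
        Variant("GENE2", 0.25, "missense"),      # common in the population, excluded
        Variant("GENE3", 0.0001, "synonymous"),  # not predicted damaging, excluded
    ],
    "patient_b": [
        Variant("GENE1", 0.0002, "nonsense"),
        Variant("GENE4", 0.0, "frameshift"),
    ],
}

shared = shared_candidates(patients)
print(shared)
```

Intersecting across patients is what lets a handful of unrelated cases narrow the search: each additional patient removes genes that are rare-variant carriers only by chance.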

II. Clinical Application Cases

Case 1: WES Analysis of 58 Critically Ill Children

1. Core Technology: Combined Detection of CNVs and SNVs

WES not only detects single-nucleotide variants (SNVs) but also identifies copy number variants (CNVs) using bioinformatics tools, achieving comprehensive capture of pathogenic variants in rare diseases:

  • CNV Detection: WES identified two novel pathogenic CNVs (7q36.3 deletion + 18q12.1q23 duplication in patient P9 and 20q13.13 deletion in patient P10). Chromosomal microarray analysis (CMA) validation confirmed the involvement of ClinGen haploinsufficiency genes (such as SHH and ADNP), explaining phenotypes such as micrognathia, developmental delay, and hypertelorism, none of which had been previously reported.
  • SNV Detection: Pathogenic/likely pathogenic (P/LP) SNVs were found in 25 of 58 patients (43.1%), involving 24 genes and including 23 novel SNVs. The genotypes included compound heterozygotes (11 cases), homozygotes (2 cases), and X-linked recessive variants (1 case). For example:
    • ACAT1 compound heterozygous variants (P14/P15) cause severe metabolic acidosis;
    • COG4 compound heterozygous variants (P21) cause congenital disorder of glycosylation (type IIj);
    • TOR1A pathogenic variants (P8) are associated with brain tumors and respiratory failure.

2. Diagnostic Output: Improving the Diagnostic Rate and Phenotypic Expansion of Rare Diseases

  • Overall Diagnostic Rate: Of 58 patients, 33 (56.9%) had genetic findings, and 25 (43.1%) received a definitive genetic diagnosis, with a cumulative diagnostic rate significantly higher than that of traditional methods.
  • Phenotypic Explanatory Power: In 72.7% of patients with genetic findings (24/33), the genotype could explain the main phenotype (e.g., the COG4 variant in P21 explains jaundice and epilepsy; the ACAT1 variant in P15 explains metabolic acidosis), while only 6 cases (18.2%) had phenotypes partially explained by the genotype (mostly superimposed with severe infections).
  • Disease Distribution: Involves rare diseases across multiple systems, including metabolic disorders (ACAT1, PNPO), neurodevelopmental disorders (COG4, WDR73), immunodeficiency (SH3KBP1), and skeletal abnormalities (IQCE). Severe infections (pneumonia, sepsis, brain infection) are common accompanying symptoms (70.4% of patients with genetic diseases have concurrent infections, with a mortality rate of 68.4%).
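
The percentages in this subsection can be checked with simple arithmetic. In the short sketch below, the numerators 25, 24, and 6 and the denominators 58 and 33 come from the text; the count behind the 56.9% figure is taken to be 33, the same denominator used in the 24/33 figure. The `percent` helper and variable names are illustrative only.

```python
def percent(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


total_patients = 58
with_genetic_findings = 33  # assumed count behind the 56.9% figure

rates = {
    "genetic findings": percent(with_genetic_findings, total_patients),
    "definitive diagnosis": percent(25, total_patients),
    "phenotype fully explained": percent(24, with_genetic_findings),
    "phenotype partially explained": percent(6, with_genetic_findings),
}

for label, value in rates.items():
    print(f"{label}: {value}%")
```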

3. Core Value: Complex Genotype Analysis and New Gene Discovery

  • Complex Genotype Identification: Combining CNV and SNV calls from WES corrects misjudgments that SNV-only analysis can make. For example, in patient P8, a large X-chromosome deletion (spanning almost the entire p and q arms) together with SNV analysis led to a presumed diagnosis of Turner syndrome (45,X), and a pathogenic TOR1A variant was also discovered. Eleven patients with variants in autosomal recessive (AR) genes were identified as pseudohomozygous through combined CNV+SNV compound heterozygosity analysis (e.g., a PNPO SNV plus a duplication causing epilepsy).
  • Novel variants and phenotypic expansion: 23 novel SNVs (15 genes) and 2 novel CNVs were discovered, expanding gene-phenotype associations for 19 rare diseases. For example:
    • SH3KBP1 gene duplication (P9) suggests an association between X-linked immunodeficiency-61 and neurodevelopmental abnormalities;
    • Chromosome 4 deletion (P11) supports the role of the 4p14-p13 region in neurodevelopment (Liu J et al., 2021).
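
The idea of reading SNV and CNV calls together for a recessive gene can be sketched as a small decision function. This is a simplified illustration under stated assumptions, not the method of Liu J et al.: the string encodings, the category labels, and the rule that an apparently homozygous SNV over a heterozygous deletion is pseudohomozygous are all assumptions for the example, and phase is not confirmed here (in practice that needs parental or long-read data).

```python
from typing import Optional, Sequence


def classify_recessive_genotype(
    snv_zygosities: Sequence[str],
    overlapping_cnv: Optional[str] = None,
) -> str:
    """Label the genotype at one autosomal recessive gene.

    snv_zygosities: "het" or "hom" for each rare damaging SNV called in the gene.
    overlapping_cnv: "deletion", "duplication", or None for a heterozygous CNV
    overlapping the gene (hypothetical encoding).
    """
    het_count = sum(1 for z in snv_zygosities if z == "het")
    has_hom = any(z == "hom" for z in snv_zygosities)

    if has_hom and overlapping_cnv == "deletion":
        # Only one copy of the gene is present, so the SNV merely looks homozygous.
        return "pseudohomozygous (SNV + deletion)"
    if has_hom:
        return "homozygous"
    if het_count >= 2:
        return "possible compound heterozygous (SNV + SNV)"
    if het_count == 1 and overlapping_cnv is not None:
        return "possible compound heterozygous (SNV + CNV)"
    if het_count == 1:
        return "heterozygous carrier"
    return "no biallelic evidence"


print(classify_recessive_genotype(["hom"], "deletion"))
print(classify_recessive_genotype(["het"], "duplication"))
print(classify_recessive_genotype(["het", "het"]))
```

An SNV-only workflow would report the first case as an ordinary homozygote and the second as a carrier, which is the kind of misjudgment that combined analysis is meant to catch.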

Figure 1. Flow diagram of patient selection and genetic findings in the study (Liu J et al., 2021)

Case 2: Application of WES in Hypophosphatasia

(1) Diagnosis and Classification of HPP

Direct Detection of ALPL Gene Variants: In three patients (Cases 1, 2, and 3), WES verified heterozygous ALPL variants (e.g., c.1447G>A/p.Val483Met, c.205G>A) discovered by Sanger sequencing. Combined with phenotypes such as low TNSALP activity (TSALP) and skeletal malformations, HPP (infantile or mild) was diagnosed.

(2) Identification of Complex Genetic Diseases

  • Discovery of Co-occurring Variants: In Case 1, in addition to the ALPL variants, WES discovered a heterozygous SMC1A variant (c.3470del), explaining the patient's overlapping phenotypes such as epilepsy and developmental delay (HPP co-occurring with developmental and epileptic encephalopathy 85); in Case 2, WES discovered a homozygous WNT10A variant (c.682T>A), suggesting a potential association between ectodermal developmental abnormalities and the HPP phenotype.
  • Correcting misdiagnosis of a single gene: In Case 6, the patient was initially suspected of having HPP (low TSALP). WES did not find any ALPL variants, but detected a homozygous variant of the SLC5A1 gene (c.1946G>A), confirming glucose-galactose malabsorption (GGM). Symptoms improved through dietary intervention (removing lactose/glucose/sucrose), avoiding ineffective treatment for HPP.

(3) Differential diagnosis in the absence of ALPL variants

  • Turning to other pathogenic genes: In Case 4 (no ALPL variant detected), WES found heterozygous variants (VUS) of the TRIO (c.5651A>C) and TRPV4 (c.880T>G) genes, suggesting a potential link between neurodevelopment and abnormal bone metabolism; in Case 5, TTN gene variants (c.32078-1G>T, c.47720_47721del) were detected, pointing to Salih's myopathy (congenital myopathy type 5).
  • Clinical value of negative results: In Case 7, WES did not detect variants in ALPL or other genes. Combined with in-depth phenotypic analysis (facial features, intrauterine history), the diagnosis of fetal alcohol syndrome was confirmed, clarifying the non-hereditary cause and avoiding over-testing (Glotov OS et al., 2024).

Figure 2. Time-dependent dynamics of amino acid residue fluctuations for wild-type and Phe228Ile-mutated WNT10A proteins (Glotov OS et al., 2024)

Case 3: Application of WES in Extremely Rare Hereditary Kidney Diseases

1. Overcoming the limitations of gene panel analysis to detect extremely rare diseases

Targeted gene panels cover only 20-100 known hereditary kidney disease genes, while WES covers the entire exome, enabling the detection of extremely rare diseases not included in panels (such as renal syndrome in families 5-7 and 9, infantile hypercalcemia type 1, etc.) and avoiding the prolonged diagnostic time and added costs caused by panel omissions.

2. Identifying unexpected pathogenic variants and validating the "genotype-phenotype" association

WES has diagnosed unexpected diseases (such as Ayme-Gripp syndrome caused by a MAF mutation in family 3, and short-rib thoracic dysplasia type 9 caused by an IFT140 mutation in family 4). Using the "clinical information + genome sequencing" approach to genotype-phenotype analysis, it verifies that the variant fully accounts for the phenotype (e.g., the MAF mutation explains facial deformities, hearing loss, and kidney involvement; the IFT140 mutation explains skeletal dysplasia and early-onset ESRD).

3. Guiding Precision Intervention and Prognostic Prediction

  • Pathophysiological Interpretation: In family 2, APRT deficiency (homozygous nonsense variant) points to excessive 2,8-dihydroxyadenine production as the cause of kidney stones, requiring xanthine oxidase inhibitors and a purine-restricted diet;
  • Treatment Adjustment: In family 8, TRPM6 compound heterozygous mutations cause hypomagnesemia, requiring adjustment of calcium and magnesium supplementation (discontinuing calcium and increasing magnesium);
  • Transplantation Decision: In family 1, with ADTKD-UMOD (UMOD missense variant), an unaffected offspring was selected as the kidney transplant donor;
  • Prognostic Monitoring: In family 4, SRTD 9 (IFT140 variant) requires regular monitoring of liver function, bone progression, and retinal detachment.

Figure 3. Pedigrees of the nine families, based on whole-exome sequencing and pathogenicity (Jung J et al., 2021)

III. Technical Challenges and Solutions

Data Analysis Complexity

  • Challenge: A single exome typically yields tens of thousands of variants, making it difficult to distinguish between pathogenic and benign variants.
  • Solution:
    • Machine Learning Assistance: Tools such as FABRIC GEM integrate multi-omics data and narrow the candidate gene list to an average of 2 genes per case.
    • Functional Validation: Combining animal models (e.g., CRISPR editing) and organoid experiments to validate the impact of mutations on protein function.

Clinical Translation Bottlenecks

  • Challenge: The pathogenicity of some variants is still unclear, and phenotypic heterogeneity must be considered.
  • Solution:
    • Standardized Annotation: Referring to ACMG guidelines, combining population databases (e.g., gnomAD) and functional prediction tools (e.g., PolyPhen-2) for hierarchical assessment.
    • Multi-center Collaboration: Establishing rare disease databases (e.g., the "Huabiao Project") to share variant frequency and clinical phenotypic data.
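
The hierarchical assessment mentioned under Standardized Annotation can be pictured as a simple triage step that combines a population frequency with a functional prediction score. The sketch below is an assumption-laden illustration only: it does not implement the ACMG criteria, the `triage_tier` function and its tier numbers are hypothetical, and the cutoffs (`af_cutoff`, `score_cutoff`) are placeholder values rather than recommended thresholds.

```python
from typing import Optional


def triage_tier(
    gnomad_af: Optional[float],
    polyphen2_score: Optional[float],
    af_cutoff: float = 0.001,   # placeholder cutoff, not a guideline value
    score_cutoff: float = 0.9,  # placeholder cutoff, not a guideline value
) -> int:
    """Return a review-priority tier (1 = review first, 3 = review last).

    gnomad_af: population allele frequency, or None if the variant is absent.
    polyphen2_score: prediction score between 0 and 1, or None if unavailable.
    """
    rare = gnomad_af is None or gnomad_af <= af_cutoff
    predicted_damaging = (
        polyphen2_score is not None and polyphen2_score >= score_cutoff
    )
    if rare and predicted_damaging:
        return 1
    if rare:
        return 2
    return 3


# Toy inputs, made up for illustration.
print(triage_tier(None, 0.99))   # absent from the population, predicted damaging
print(triage_tier(0.0005, 0.2))  # rare, but not predicted damaging
print(triage_tier(0.05, 0.99))   # common, so deprioritized despite the score
```

A tier here only orders variants for expert review; final classification still depends on the full evidence framework, including segregation, phenotype match, and functional data.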

Ethical and Privacy Issues

  • Challenge: How to provide feedback to patients on secondary findings (e.g., carrier status in adults).
  • Solution: Develop ethical guidelines, clarify informed consent before testing, and provide genetic counseling support.

IV. Future Development Directions

  • Technology Integration: Combining long-read sequencing (such as PacBio) to analyze complex structural variations and improve diagnostic accuracy.
  • Dynamic Monitoring: Tracking disease progression through liquid biopsies to guide personalized treatment.
  • Drug Development: Designing small molecule inhibitors or gene therapies based on pathogenic gene targets, such as metabolic interventions targeting DHODH.

V. Conclusion

Whole-exome sequencing has become a "core tool" in rare disease research, playing a crucial role in diagnosis, treatment, and basic research due to its high efficiency and accuracy. With decreasing technology costs and multi-omics integration, it is expected to achieve "one test for multiple uses" in the future, propelling rare diseases from "incurable" to "precision intervention."

References

  1. Liu J, Zheng Y, Huang J, Zhu D, Zang P, Luo Z, Yang Y, Peng Y, Xiao Z, Zhu Y, Lu X. Expanding the genotypes and phenotypes for 19 rare diseases by exome sequencing performed in pediatric intensive care unit. Hum Mutat. 2021 Nov;42(11):1443-1460.
  2. Glotov OS, Zhuchenko NA, Balashova MS, Raspopova AN, Tsai VV, Chernov AN, Chuiko IV, Danilov LG, Morozova LD, Glotov AS. The Benefits of Whole-Exome Sequencing in the Differential Diagnosis of Hypophosphatasia. Int J Mol Sci. 2024 Oct 31;25(21):11728.
  3. Jung J, Lee JH, Park YS, Seo GH, Keum C, Kang HG, Lee H, Lee SK, Lee ST, Cho H, Lee BH. Ultra-rare renal diseases diagnosed with whole-exome sequencing: Utility in diagnosis and management. BMC Med Genomics. 2021 Jul 3;14(1):177.
For research purposes only, not intended for clinical diagnosis, treatment, or individual health assessments.