Targeted sequencing (tNGS) enriches specific regions or loci of the genome and then sequences them on a next-generation sequencing (NGS) platform. It includes whole exome sequencing, which covers the protein-coding regions of the genome, as well as custom sequencing panels designed around particular genes of interest.
Recommended reading: WGS vs. WES vs. Targeted Sequencing Panels.
CD Genomics short-read sequencing and long-read sequencing platforms facilitate the robust analysis of exomes and genomes. This advanced targeted sequencing approach allows for comprehensive and efficient examination of genetic material, providing valuable insights into the molecular landscape and potential biomarkers associated with various conditions.
The inception of gene sequencing in life sciences research traces back to pivotal moments in scientific history:
In 1977, Frederick Sanger and Walter Gilbert independently published the first practical DNA sequencing methods: Sanger's chain-termination method and the Maxam-Gilbert chemical cleavage method. That same year, Sanger's group reported the genome sequence of phage φX174, spanning 5,375 bases. These breakthroughs marked the formal beginning of gene sequencing in scientific research.
Building on this foundation, in 1988, Chamberlain et al. introduced multiplex PCR technology, laying the groundwork for later multiplex PCR amplicon sequencing methods.
A significant milestone in the evolution of targeted sequencing occurred in 2005, when Nature Methods published an article titled "Direct Genomic Selection." This approach hybridized 150 kb biotin-labeled BAC DNA with human genomic DNA and then captured the bound DNA fragments on streptavidin affinity beads. The captured fragments were PCR-amplified and sequenced, and approximately 50% of the sequenced fragments originated from the target region.
Today, targeted sequencing predominantly uses two methods: hybridization capture and multiplex PCR (the latter also known as amplicon sequencing).
Whole-genome sequencing (WGS) entails sequencing all bases of the entire genome, providing a comprehensive profile of the genome's sequence. Its primary applications include genome assembly and the identification of various genomic variants, including structural variants.
Targeted Sequencing (also known as gene panel sequencing) selectively sequences specific genes, typically ranging from a few dozen to a thousand genes. Therefore, in terms of genome coverage, whole genome sequencing > whole exome sequencing > targeted sequencing.
Whole-Exome Sequencing can be viewed as a subset of targeted sequencing, focusing solely on sequencing all exons within the genome.
WGS offers the most extensive detection scope, encompassing coding and non-coding regions, including regulatory regions, and it can detect structural variants. Compared with targeted sequencing, however, WGS has limitations: at a given budget it yields shallower depth per base and generates far more data to store and analyze.
In comparison, targeted sequencing allows for deep sequencing because its detection region is much smaller (e.g., exons make up only about 1% of the human genome). This enables the detection of low-frequency and rare variants while reducing costs and storage requirements. Targeted sequencing is therefore a more cost-effective option, especially for research with limited funding and for studies of diseases caused by variants in protein-coding regions.
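The link between depth and low-frequency variant detection can be illustrated with a simple binomial model. The sketch below is illustrative only: the function name, the 1% variant allele fraction, and the threshold of 5 supporting reads are assumptions chosen for the example, and sequencing errors are ignored.

```python
from math import comb

def detection_probability(depth: int, vaf: float, min_alt_reads: int = 5) -> float:
    """Probability of seeing at least `min_alt_reads` variant-supporting reads.

    Assumes each read at the site independently carries the variant with
    probability `vaf` (binomial sampling, no sequencing error).
    """
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer

# Compare a WGS-like depth with depths typical of targeted panels.
for depth in (30, 100, 1000):
    print(depth, round(detection_probability(depth, vaf=0.01), 4))
```

Under these assumptions, the chance of observing a 1% variant is negligible at 30x and becomes high only at depths in the hundreds to thousands, which is the range targeted panels are designed for.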
Methods of DNA-seq. (Bewicke-Copley et al., 2019)
Hybridization capture is a targeted sequencing method that combines molecular hybridization with next-generation sequencing. It depends on designing and synthesizing probes complementary to the target genomic region. These probes selectively bind the fragments that come from the target region, and the remaining fragments are removed.
The captured fragments are then sequenced, so the resulting reads are concentrated on the target genomic regions.
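As a rough illustration of the probe design step, the sketch below tiles fixed-length probes across a target interval. The function name, the 120-base probe length, and the 60-base step (2x tiling) are assumptions made for the example, not parameters given in this article; real probe design also has to account for factors such as GC content and repetitive sequence.

```python
def tile_probes(start: int, end: int, probe_len: int = 120, step: int = 60):
    """Return half-open (probe_start, probe_end) intervals covering [start, end).

    Hypothetical helper: coordinates are 0-based positions on one reference
    sequence, and `step` must not exceed `probe_len` or gaps would appear.
    """
    if not 0 < step <= probe_len:
        raise ValueError("step must be between 1 and probe_len")
    if end <= start:
        raise ValueError("end must be greater than start")

    if end - start <= probe_len:
        # Short target: one probe centred on it (clamped at position 0).
        mid = (start + end) // 2
        s = max(0, mid - probe_len // 2)
        return [(s, s + probe_len)]

    probes = []
    pos = start
    while pos + probe_len < end:
        probes.append((pos, pos + probe_len))
        pos += step
    # Final probe sits flush with the end of the target.
    probes.append((end - probe_len, end))
    return probes

# A 300-base target tiled with overlapping 120-base probes.
print(tile_probes(1000, 1300))
```

Overlapping probes mean most target bases are covered by more than one probe, which is one common way to make capture less sensitive to a single poorly performing probe.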
Multiplex PCR targeted sequencing, also referred to as targeted amplicon sequencing, combines multiplex PCR with next-generation sequencing. Multiple target regions are amplified simultaneously in one reaction, yielding amplicon products. The adapter sequences required for next-generation sequencing are then added to both ends of the amplicons, either by PCR amplification or by enzymatic ligation. This step turns the amplicons into libraries ready for sequencing.
Following library preparation, next-generation sequencing is conducted, followed by an analysis of raw data. This comprehensive process yields sequence information specific to the target regions, fulfilling the primary objective of targeted sequencing.
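One early step in analyzing amplicon data is working out which amplicon each read came from. The sketch below does this by comparing the start of each read with the forward primer of every amplicon. The primer sequences, amplicon names, and the one-mismatch tolerance are hypothetical values for illustration; production pipelines typically rely on alignment and dedicated primer-trimming tools.

```python
from collections import Counter

# Hypothetical forward primers for two amplicons (illustrative sequences only).
PRIMERS = {
    "amplicon_A": "ACGTTGCA",
    "amplicon_B": "TTGACCGT",
}

def assign_amplicon(read, primers=PRIMERS, max_mismatches=1):
    """Return the amplicon whose forward primer matches the read start, else None."""
    for name, primer in primers.items():
        prefix = read[: len(primer)]
        if len(prefix) < len(primer):
            continue  # read is shorter than the primer, cannot match
        mismatches = sum(a != b for a, b in zip(prefix, primer))
        if mismatches <= max_mismatches:
            return name
    return None

def count_reads_per_amplicon(reads, primers=PRIMERS):
    """Tally reads per amplicon; unassigned reads are counted under None."""
    return Counter(assign_amplicon(r, primers) for r in reads)

reads = ["ACGTTGCAGGG", "ACGTTGCTGGG", "TTGACCGTAAA", "GGGGGGGGGGG"]
print(count_reads_per_amplicon(reads))
```

Per-amplicon read counts like these are also a quick check on whether amplification was balanced across the targets in the multiplex reaction.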
Multiplex PCR targeted sequencing is widely applied. In pathogen detection, for example, it is used to analyze the community composition and distribution of pathogenic microorganisms. Such analyses support the diagnosis of clinical infections and inform disease management and treatment strategies.
Multiplex PCR targeted sequencing is therefore a versatile tool for the precise detection and characterization of target genomic regions across diverse applications.
Table 1 Differences between WGS, hybridization capture NGS panels, and multiplex PCR NGS panels
Feature | Whole Genome Sequencing | Hybridization Capture (Exome Sequencing) | Multiplex PCR (Panel) |
Target Region Size | 3 Gb (human) | 50 Mb | 10 kb - 5 Mb (variable) |
Genome Coverage | 100% of the genome | 1.30% (variable) | <0.1% (variable) |
Library Construction Cost | $ | $$ | $$ (variable) |
Typical Sequencing Depth | 30x | 100x | 500-10,000x |
Sequencing Data Volume | 90 Gb | 5 Gb | 1 Gb (variable) |
Sequencing Cost | $$$ | $$ | $ (variable) |
Data Storage Cost | $$$ | $$$ | $ |
Difficulty in Analyzing Raw Data | High Complexity | Medium Complexity | Low Complexity |
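The data volumes in Table 1 follow from multiplying target region size by sequencing depth. The short sketch below reproduces that arithmetic; the function name is hypothetical, and the calculation is a simplification that ignores off-target and duplicate reads.

```python
def data_volume_gb(target_size_bp: float, mean_depth: float) -> float:
    """Approximate sequenced bases (in Gb) needed to reach a mean depth."""
    return target_size_bp * mean_depth / 1e9

print(data_volume_gb(3e9, 30))     # WGS: 3 Gb genome at 30x -> 90.0
print(data_volume_gb(50e6, 100))   # Hybridization capture: 50 Mb at 100x -> 5.0
# Multiplex PCR: an assumed 1 Mb panel at 1,000x, within the table's ranges.
print(data_volume_gb(1e6, 1000))   # -> 1.0
```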
The quality of data obtained from target gene region capture is primarily assessed through the following key indicators: target region coverage, capture efficiency, and homogeneity of target region coverage.
Evaluating these parameters ensures the reliability and effectiveness of targeted sequencing data, providing valuable insights into the intended genomic regions.
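The three indicators can be computed from per-base depths over the target region plus read counts. The sketch below is a minimal example under assumed definitions that are common conventions rather than ones stated in this article: coverage is the fraction of target bases with depth of at least 1x, capture efficiency is the fraction of reads that fall on target, and homogeneity is the fraction of target bases with depth of at least 0.2 times the mean depth. The function and parameter names are hypothetical.

```python
def capture_qc(depths, on_target_reads, total_reads,
               min_depth=1, uniformity_frac=0.2):
    """Summarize target coverage, capture efficiency, and coverage homogeneity.

    depths: per-base sequencing depth across the target region.
    on_target_reads / total_reads: mapped read counts.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")

    n = len(depths)
    mean_depth = sum(depths) / n
    threshold = uniformity_frac * mean_depth
    return {
        "mean_depth": mean_depth,
        # Fraction of target bases covered at or above min_depth.
        "target_coverage": sum(d >= min_depth for d in depths) / n,
        # Fraction of all reads that landed in the target region.
        "capture_efficiency": on_target_reads / total_reads,
        # Fraction of target bases at or above uniformity_frac * mean depth.
        "homogeneity": sum(d >= threshold for d in depths) / n,
    }

print(capture_qc([0, 10, 20, 30, 40], on_target_reads=700, total_reads=1000))
```

The thresholds are parameters because laboratories differ in the cut-offs they report, for example coverage at 20x or 30x rather than 1x.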
Targeted next generation sequencing sample and data processing workflow. (Gulilat et al., 2019)
Targeted sequencing serves as a valuable complement to whole genome sequencing, streamlining both experimental procedures and analytical objectives. Its rapid and effective nature has carved out a unique niche in next-generation high-throughput sequencing, with a growing array of application areas.
References: