WGS vs. WES vs. Targeted Sequencing Panels

What are Exons?

Exons are the segments of a gene that are retained in the mature RNA after splicing. Together, all the exons in a genome make up the exome. Note that 'whole-exome sequencing' primarily targets the exons of protein-coding genes and covers few non-coding genes.

The term 'gene' encompasses a sequence of nucleotides in DNA carrying specific genetic information. Genes are fragments of the DNA molecule with hereditary effects, serving as the fundamental units of inheritance that regulate biological traits. Human gene intervals vary in size, ranging from a few hundred base pairs (bp) to over 2 million bp. The Human Genome Project estimates that humans possess 20,000-25,000 protein-coding genes.

Recommended: What is the Human Genome Project (HGP)?

The 'genome' encompasses all the genetic information within an organism's DNA. It consists of gene regions and non-coding regions. The human genome is approximately 3 billion base pairs (3 Gb) in size; non-coding regions make up the majority, while protein-coding regions account for only about 2%.

The exome is the entirety of exons in the genome. The human genome contains around 180,000 exons, which together represent about 1% of the genome, or approximately 30 million base pairs (30 Mb).
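The proportion quoted above follows directly from the two sizes. A quick check, using only the rounded figures from the text (the implied average exon length is a derived estimate, not a measured value):

```python
GENOME_BP = 3_000_000_000   # ~3 Gb human genome (rounded figure from the text)
EXOME_BP = 30_000_000       # ~30 Mb exome (rounded figure from the text)
EXON_COUNT = 180_000        # approximate number of human exons

exome_fraction = EXOME_BP / GENOME_BP   # share of the genome that is exonic
mean_exon_bp = EXOME_BP / EXON_COUNT    # average exon length implied by these figures

print(f"Exome share of genome: {exome_fraction:.1%}")
print(f"Implied mean exon length: {mean_exon_bp:.0f} bp")
```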

Please read our article: Exome Sequencing Q&A for more information.

An important point about exons concerns the untranslated regions (UTRs): the 5' UTR (leader sequence) and the 3' UTR (trailer sequence), which flank the coding sequence of the mRNA. Their primary function is to regulate the initiation and termination of translation, respectively. Although UTRs are exonic sequences, they are not translated into amino acids, so not every exonic sequence encodes protein.

Please read our article The Methods of Whole Genome Sequencing for more details about WGS.

CD Genomics high-throughput sequencing and long-read sequencing platforms facilitate the robust analysis of exomes and genomes. This advanced sequencing approach allows for comprehensive and efficient examination of genetic material, providing valuable insights into the molecular landscape and potential biomarkers associated with various conditions.

What is Whole Exome Sequencing?

Whole-exome sequencing (WES), also called exome sequencing, is a sequencing method designed to analyze the exome, that is, all exons within the genome.

In contrast, whole-genome sequencing (WGS) sequences the entire genome, providing a comprehensive view of an organism's genetic makeup. Targeted sequencing, also known as panel sequencing, sequences a selected set of genes, typically from a few dozen to a thousand. Panel sequencing relies on one of two technical principles: hybridization capture or multiplex amplicon sequencing. Whole-exome sequencing, which must cover all exons, relies on hybridization capture.

Recommended article: Whole Exome Sequencing Based on Hybridization Capture Protocol

Therefore, in terms of genome coverage, the hierarchy is as follows: whole genome sequencing > whole exome sequencing > targeted sequencing.

It's important to recognize that whole-exome sequencing can be considered a specialized form of targeted sequencing, as its focus is directed towards the entirety of exonic regions in the genome.

Table 1 Differences between WGS, WES and targeted NGS panels

| | WGS | WES | Panel |
|---|---|---|---|
| Sequencing Region | Whole genome | Whole exome | Selected regions |
| Region Size | 3 Gb | > 30 Mb | Tens to thousands of genes |
| Sequencing Depth | > 30X | 50-150X | > 500X |
| Data Volume | > 90 Gb | 5-10 Gb | - |
| Detectable Variant Types | SNPs, InDels, CNV, Fusion, SV | SNPs, InDels, CNV, Fusion | SNPs, InDels, CNV, Fusion |
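The data volumes in Table 1 can be roughly reproduced from region size and depth. The sketch below assumes that required data is approximately target size times mean depth, divided by the on-target rate for capture-based methods. The function name and the 60% on-target rate are illustrative assumptions, not values from the table.

```python
def required_data_gb(target_bp, mean_depth, on_target_rate=1.0):
    """Raw bases (in Gb) needed to reach mean_depth over a target of target_bp."""
    if not 0 < on_target_rate <= 1:
        raise ValueError("on_target_rate must be in (0, 1]")
    return target_bp * mean_depth / on_target_rate / 1e9

# WGS: 3 Gb genome at 30X, no capture step, so no off-target loss is modeled.
wgs_gb = required_data_gb(3_000_000_000, 30)

# WES: 30 Mb exome at 100X, assuming (hypothetically) 60% of reads are on target.
wes_gb = required_data_gb(30_000_000, 100, on_target_rate=0.6)

print(f"WGS 30X: ~{wgs_gb:.0f} Gb")
print(f"WES 100X at 60% on-target: ~{wes_gb:.0f} Gb")
```

Under these assumptions the estimates come out near 90 Gb for WGS and 5 Gb for WES, in line with the table. Real runs also lose data to duplicates and low-quality reads, which this estimate ignores.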

Recommended article: How to Decide Between 100X Whole Exome Sequencing (WES) and 30X Whole Genome Sequencing (WGS)?

Workflow of Whole Exome Sequencing

The Whole Exome Sequencing (WES) workflow can be broadly categorized into three main stages: library preparation, sequencing, and bioinformatics analysis.

Library Preparation

  • Sample Processing: Initial handling of samples ahead of DNA extraction.
  • DNA Extraction: Isolation of DNA from the processed samples.
  • Quantification: Measurement of DNA concentration to ensure an adequate starting amount.
  • Library Construction: Preparation of DNA libraries for sequencing.
  • Hybridization Capture: Enrichment of target exonic regions through hybridization.
  • Amplification: Replication of DNA fragments for increased sequencing sensitivity.
  • Quality Control: Assessment of library quality to ensure optimal sequencing conditions.

Sequencing

Libraries are sequenced on a high-throughput sequencing platform, such as those from Illumina or UWI.

Bioinformatics Analysis

  • Quality Control: Evaluation of the sequencing data for reliability.
  • Alignment: Mapping of reads to the reference genome.
  • De-duplication and Sorting: Removal of duplicate reads and sorting of the alignments.
  • Mutation Detection: Identification of genetic variations and mutations.
  • Noise Reduction and Filtering: Application of filters to minimize background noise.
  • Annotation: Addition of functional information to identified variants.
  • Commonly Used Software: FastQC, BWA, GATK, ANNOVAR, among others.
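
As a rough illustration of how the first steps above chain together, the sketch below only builds the command lines for FastQC, BWA and GATK; it does not run them. The function name, file names and read-group string are hypothetical. It assumes the reference has already been indexed for BWA and GATK, and it omits variant filtering and ANNOVAR annotation.

```python
import shlex
from pathlib import Path

def wes_commands(sample, r1, r2, ref, targets_bed, outdir="out"):
    """Return command lines for a minimal WES run, in execution order."""
    out = Path(outdir)
    sam = str(out / f"{sample}.sam")
    sorted_bam = str(out / f"{sample}.sorted.bam")
    dedup_bam = str(out / f"{sample}.dedup.bam")
    metrics = str(out / f"{sample}.dup_metrics.txt")
    vcf = str(out / f"{sample}.vcf.gz")
    # GATK needs a read group; this one is a placeholder for illustration.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    return [
        # 1. Quality control of the raw reads
        ["fastqc", "-o", str(out), r1, r2],
        # 2. Alignment to the reference genome
        ["bwa", "mem", "-R", read_group, "-o", sam, ref, r1, r2],
        # 3. Coordinate sorting
        ["gatk", "SortSam", "-I", sam, "-O", sorted_bam, "-SO", "coordinate"],
        # 4. Duplicate marking (also writes a BAM index for the next step)
        ["gatk", "MarkDuplicates", "-I", sorted_bam, "-O", dedup_bam,
         "-M", metrics, "--CREATE_INDEX", "true"],
        # 5. Variant calling restricted to the capture targets
        ["gatk", "HaplotypeCaller", "-R", ref, "-I", dedup_bam,
         "-L", targets_bed, "-O", vcf],
    ]

commands = wes_commands("sample1", "s1_R1.fastq.gz", "s1_R2.fastq.gz",
                        "ref.fa", "targets.bed")
for cmd in commands:
    print(shlex.join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` once the tools and input files are in place. Production pipelines usually add base quality recalibration and variant filtering, which are left out here.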

Please read our article: Bioinformatics Workflow of Whole Exome Sequencing.

How to Evaluate Targeted Exome Sequencing Panel Probes?

When assessing the performance of an Exome Panel Capture Probe, several criteria play a crucial role. Whether opting for off-the-shelf probes or considering customization, careful consideration is essential. Here are key aspects to evaluate:

Probe Evaluation Criteria

  • Specificity: Precision in capturing the intended genomic regions without off-target effects.
  • Sensitivity: Ability to detect and capture the target regions effectively.
  • Uniformity: Consistent coverage across targeted regions without significant bias.
  • Reproducibility: Reliable performance across multiple experiments and replicates.

Additional Considerations for Off-The-Shelf Probes

  • Probe Size, Length, and Design: Compatibility with sample and research requirements.
  • Alignment with Research Goals: Ensuring the probe set aligns with the objectives of the study.

Customization Options

  • Tailoring to Specific Needs: Adding areas of interest or designing a completely custom panel.
  • Reference Databases: Utilizing databases such as RefSeq, CCDS, Ensembl, GENCODE, and ClinVar for probe design.
  • Probe Density: Requesting additional probe densities for specific regions to enhance capture efficiency.

Special Features in External Probes

  • SNP Sites for Sample Identification: Some probes incorporate SNP sites for sample tracking, minimizing contamination risks in NGS experiments.
  • Automation for Sample Tracking: Using SNP IDs to automate sample identification, which reduces human error compared with manually added markers or index sequence labels.
  • Sequencing Depth: Ensuring sufficient depth for accurate SNP-based sample tracking.
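
One way to picture SNP-based sample tracking is as a genotype comparison at the tracking sites. The following is a minimal sketch: the SNP identifiers, genotypes and the function name are all made up for illustration, and a real workflow would call genotypes from the sequencing data and choose its own concordance threshold.

```python
def genotype_concordance(a, b):
    """Fraction of shared tracking SNPs at which two genotype sets agree."""
    shared = [snp for snp in a if snp in b]
    if not shared:
        return 0.0
    # Sort alleles so that "AG" and "GA" count as the same genotype.
    matches = sum(sorted(a[s]) == sorted(b[s]) for s in shared)
    return matches / len(shared)

# Hypothetical genotypes at four tracking SNPs.
library = {"snp_01": "AA", "snp_02": "AG", "snp_03": "CC", "snp_04": "CT"}
same_sample = {"snp_01": "AA", "snp_02": "GA", "snp_03": "CC", "snp_04": "CT"}
other_sample = {"snp_01": "AG", "snp_02": "GG", "snp_03": "CC", "snp_04": "TT"}

same_score = genotype_concordance(library, same_sample)
other_score = genotype_concordance(library, other_sample)
print(f"Same sample: {same_score:.0%}, different sample: {other_score:.0%}")
```

With too little depth at the tracking sites, heterozygous genotypes are easily miscalled, which is why sufficient sequencing depth matters for this kind of check.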

Commonly Used Metrics for WES Probes Evaluation

5 Tips for Assessing Hybridization Capture Probes

WES probes are commonly evaluated with the following metrics. Because WES is a specific type of targeted sequencing, the same metrics also apply to hybridization capture probes used in other targeted sequencing.

On-Target Rate

The on-target rate is the percentage of sequencing data that aligns to the target region. Although exons are the intended targets, many other genomic regions, such as introns and intergenic regions, share homology with exons and can be captured as well. In practice, reads from these non-target (non-exonic) regions are considered off-target. Off-target data cannot be used in downstream analysis and therefore wastes sequencing capacity. The higher the on-target rate, the less sequencing is wasted and the more efficient the probe.
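
A minimal sketch of the calculation, assuming a read-based definition in which a read counts as on-target if it overlaps any target interval. Base-based definitions are also in use and give somewhat different numbers. The intervals and function names are illustrative.

```python
def overlaps(read, target):
    """True if two half-open (chrom, start, end) intervals share a base."""
    return read[0] == target[0] and read[1] < target[2] and target[1] < read[2]

def on_target_rate(reads, targets):
    """Fraction of aligned reads that overlap at least one target interval."""
    if not reads:
        return 0.0
    hits = sum(any(overlaps(r, t) for t in targets) for r in reads)
    return hits / len(reads)

targets = [("chr1", 100, 200), ("chr1", 500, 600)]
reads = [
    ("chr1", 90, 190),   # overlaps the first target
    ("chr1", 150, 250),  # overlaps the first target
    ("chr1", 300, 400),  # between targets: off-target
    ("chr1", 590, 690),  # overlaps the second target
    ("chr2", 100, 200),  # different chromosome: off-target
]
rate = on_target_rate(reads, targets)
print(f"On-target rate: {rate:.0%}")
```

The nested loop is fine for a toy example; real data sets would call for an interval index or an existing tool.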

Coverage

Coverage is usually reported together with a depth threshold (e.g., "10X coverage" or "30X coverage") and denotes the proportion of the target region covered by sequencing reads at that depth. For instance, "10X coverage of 90%" means that 90% of the bases in the target region are covered by at least 10 reads. If no depth is specified, coverage is interpreted as "1X coverage", meaning the proportion of the region covered by at least one read. The higher the coverage and the smaller the fraction of missed targets, the more effective the probe.
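
Given per-base depths over the target region, coverage at a depth threshold is a simple proportion. The depth values below are invented for illustration.

```python
def coverage_at_depth(depths, min_depth=1):
    """Fraction of target bases covered by at least min_depth reads."""
    if not depths:
        return 0.0
    return sum(d >= min_depth for d in depths) / len(depths)

# Hypothetical per-base depths for a ten-base target region.
depths = [0, 4, 12, 15, 18, 22, 25, 30, 33, 41]

cov_1x = coverage_at_depth(depths)        # default threshold: 1X
cov_10x = coverage_at_depth(depths, 10)
cov_30x = coverage_at_depth(depths, 30)
print(f"1X: {cov_1x:.0%}, 10X: {cov_10x:.0%}, 30X: {cov_30x:.0%}")
```

The same region therefore has a different coverage figure at each threshold, which is why a coverage value is only meaningful alongside its depth.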

Homogeneity

Homogeneity (also called uniformity) assesses the evenness of coverage across different sites within the target region. With ideal uniformity, the depth at each site is close to the average depth. Fold-80, a common homogeneity metric, represents the fold of additional sequencing required to bring 80% of target bases up to the average depth. With perfectly uniform coverage, Fold-80 equals 1; a lower Fold-80 signifies more even capture and less wasteful sequencing. Probes with excellent homogeneity contribute to cost-efficient and effective sequencing.
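
Fold-80 is commonly computed as the mean depth divided by the 20th-percentile depth, since 80% of bases sit at or above that percentile. The sketch below assumes this definition with a nearest-rank percentile. Tools differ in percentile method and in how they treat zero-coverage targets, so exact values may not match.

```python
def fold_80(depths):
    """Mean depth divided by the 20th-percentile depth (nearest-rank)."""
    if not depths:
        raise ValueError("depths must not be empty")
    ordered = sorted(depths)
    n = len(ordered)
    rank = max(1, (20 * n + 99) // 100)  # ceil(0.2 * n) in integer arithmetic
    p20 = ordered[rank - 1]
    mean = sum(ordered) / n
    # If the 20th percentile has no coverage, no finite scaling can fix it.
    return float("inf") if p20 == 0 else mean / p20

even = [30] * 10                                     # perfectly uniform depths
uneven = [10, 20, 30, 40, 50, 50, 60, 70, 80, 90]    # same idea, skewed depths

fold_even = fold_80(even)
fold_uneven = fold_80(uneven)
print(f"Uniform: {fold_even:.2f}, uneven: {fold_uneven:.2f}")
```

In the uneven example the mean depth is 50 and the 20th-percentile depth is 20, so roughly 2.5 times more sequencing would be needed to lift 80% of bases to the current mean.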

Duplication Rate

Duplication rate is the percentage of duplicate reads among all sequenced reads. Duplicate reads carry no additional information and are removed in downstream analysis to improve the accuracy of mutation detection. A higher duplication rate therefore lowers data utilization and wastes sequencing cost. Under otherwise identical conditions, a lower duplication rate saves cost and indicates a more efficient probe.
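
A simplified sketch of the calculation, assuming reads count as duplicates when they share chromosome, start position and strand. Dedicated duplicate-marking tools use richer keys, such as both ends of a read pair and, where available, molecular barcodes. The read records here are invented.

```python
from collections import Counter

def duplication_rate(reads):
    """Fraction of reads that repeat an already-seen (chrom, start, strand) key."""
    if not reads:
        return 0.0
    counts = Counter(reads)
    # For each key, every read beyond the first is a duplicate.
    duplicates = sum(n - 1 for n in counts.values())
    return duplicates / len(reads)

reads = [
    ("chr1", 100, "+"), ("chr1", 100, "+"), ("chr1", 100, "+"),
    ("chr1", 100, "-"),                      # same position, other strand
    ("chr1", 250, "+"), ("chr1", 250, "+"),
    ("chr2", 75, "-"), ("chr2", 980, "+"),
]
dup_rate = duplication_rate(reads)
print(f"Duplication rate: {dup_rate:.1%}")
```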

For Research Use Only. Not for use in diagnostic procedures.