CD Genomics Blog

Emerging infectious diseases can impose enormous mortality, morbidity, and economic burdens on humans. Tracking the spread of infectious diseases to help control them has traditionally relied on the analysis of case data collected during the course of an epidemic or pandemic. Over the last few decades, there have been several viral disease outbreaks, and the intensity of each varied with the ability of the virus to transmit among humans. These include recurrent outbreaks, such as swine- and avian-origin influenza, Ebola, and Zika, as well as novel-virus outbreaks, such as SARS, MERS, and COVID-19. With advances in sequencing technologies and phylogenetics, scientists can now identify and characterize causative viruses, track transmission chains, and map outbreaks much more rapidly and accurately.

Sanger Sequencing

Viral genome sequencing is an important method for medical diagnosis, outbreak investigation, and research on host-pathogen interactions. As a classic and highly accurate (99.99%) sequencing technology, Sanger sequencing is considered the "gold standard" for validating the sequence of specific genes. It is also the standard for clinical molecular diagnosis, where it supports work on infectious diseases, genetic diseases, and oncology. Over the years, both Sanger sequencing technology and its data analysis methods have been greatly optimized. Sanger sequencing therefore remains a useful tool for sequencing single genes or amplicons, even with the rise of next-generation sequencing (NGS).
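To put the 99.99% accuracy figure in context, it can be converted to a Phred quality score, the logarithmic scale commonly used to report per-base sequencing quality (Q = -10 · log10 of the error probability). The short Python sketch below is only an illustration: the function names and the 800-base read length are assumptions chosen for the example, not values from this article.

```python
import math


def accuracy_to_phred(accuracy: float) -> float:
    """Convert a per-base accuracy (between 0 and 1) to a Phred quality score."""
    if not 0.0 < accuracy < 1.0:
        raise ValueError("accuracy must be strictly between 0 and 1")
    error_probability = 1.0 - accuracy
    return -10.0 * math.log10(error_probability)


def expected_errors(accuracy: float, read_length: int) -> float:
    """Expected number of miscalled bases in a read of the given length."""
    return (1.0 - accuracy) * read_length


if __name__ == "__main__":
    sanger_accuracy = 0.9999          # the 99.99% figure quoted above
    assumed_read_length = 800         # hypothetical read length, for illustration only
    print(f"Phred score: {accuracy_to_phred(sanger_accuracy):.1f}")
    print(f"Expected errors per read: {expected_errors(sanger_accuracy, assumed_read_length):.2f}")
```

Under these assumptions, 99.99% accuracy corresponds to roughly Q40, or about one miscalled base in every 10,000.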

High-Throughput Sequencing

Whole-genome sequencing and the corresponding bioinformatics analyses are efficient and reliable tools for studying virus evolution, virulence factor changes, and genetic variation, and for developing new therapies. NGS has become the most popular approach for sequencing viral genomes, especially when many genes must be sequenced simultaneously, when searching for novel gene variants, and when working with low-abundance samples.

Comparison of high-throughput sequencing methods in virology

PCR amplicon sequencing
Advantages:
  • Highly specific, which correspondingly decreases sequencing costs
  • Highly sensitive, with good coverage even at low pathogen load
  • Relatively straightforward design and application of new primers for novel sequences
Disadvantages:
  • Labor-intensive and difficult to scale for large genomes
  • High sample volume required for iterating standard PCRs across large genomes
  • PCR reactions are subject to primer mismatch, especially in poorly characterized or highly diverse pathogens, or pathogens with novel variants
  • Limited ability to sequence novel pathogens
  • High number of PCR cycles may introduce amplification mutations
  • Uneven amplification of different PCR amplicons

Hybrid capture-based sequencing
Advantages:
  • Suitable for high-throughput automation and large genome sequencing
  • Higher specificity than metagenomics, which correspondingly decreases sequencing costs
  • Overlapping probes increase the tolerance for individual primer mismatches
  • Fewer PCR cycles than PCR amplification, which reduces the introduction of amplification mutations
  • Preservation of minor variant frequencies
Disadvantages:
  • High cost and technical expertise required for sample preparation
  • Unable to sequence novel pathogens; well-characterized reference genomes are required for probe design
  • Sensitivity is comparable to PCR, but coverage is proportional to pathogen load; low pathogen load yields low or incomplete coverage
  • Cost and time to generate new probe sets limit a rapid response to emerging and novel viruses

Metagenomic sequencing
Advantages:
  • Simple, cost-effective sample preparation
  • Can sequence novel or poorly characterized genomes
  • Effective in identifying a potential underlying pathogen
  • Fewer PCR cycles required, so few amplification mutations are introduced
  • Preserves minor variant frequencies, reflecting in vivo variation
  • No primer or probe design required, which enables a rapid response to novel pathogens or sequence variants
Disadvantages:
  • Relatively high sequencing cost to obtain sufficient data
  • Relatively low sensitivity to target pathogen
  • Coverage is proportional to viral load
  • High proportion of non-pathogen reads increases computational challenges
  • Incidental sequencing of human and off-target pathogens raises ethical and diagnostic issues
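
The point that coverage is proportional to viral load can be made concrete with a back-of-the-envelope calculation: mean depth is approximately the number of viral bases sequenced divided by the genome length. The Python sketch below is a simplified illustration only. The function names, the 10 million read run, the 150-base reads, the 30 kb genome, and the viral read fractions are all hypothetical values chosen for the example, and the model assumes reads are spread evenly across the genome, which real data rarely are.

```python
def expected_depth(total_reads: int, viral_fraction: float,
                   read_length: int, genome_length: int) -> float:
    """Approximate mean depth of coverage over a viral genome.

    viral_fraction is the share of all reads that come from the virus,
    used here as a simple stand-in for viral load in the sample.
    """
    if not 0.0 <= viral_fraction <= 1.0:
        raise ValueError("viral_fraction must be between 0 and 1")
    if genome_length <= 0 or read_length <= 0:
        raise ValueError("genome_length and read_length must be positive")
    viral_bases = total_reads * viral_fraction * read_length
    return viral_bases / genome_length


def reads_for_depth(target_depth: float, viral_fraction: float,
                    read_length: int, genome_length: int) -> float:
    """Approximate total reads needed to reach a target mean depth."""
    if not 0.0 < viral_fraction <= 1.0:
        raise ValueError("viral_fraction must be greater than 0 and at most 1")
    return target_depth * genome_length / (viral_fraction * read_length)


if __name__ == "__main__":
    # Hypothetical run: 10 million reads of 150 bases against a 30 kb genome.
    for fraction in (1e-3, 1e-5):
        depth = expected_depth(10_000_000, fraction, 150, 30_000)
        print(f"viral fraction {fraction:g}: about {depth:.1f}x mean depth")
```

With these assumed numbers, a sample in which 0.1% of reads are viral reaches about 50x mean depth, while one in which only 0.001% are viral reaches about 0.5x. This is why low pathogen load yields low or incomplete coverage unless the target is enriched or far more reads are sequenced.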


  1. Houldcroft CJ; et al. Clinical and biological insights from viral genome sequencing. Nature Reviews Microbiology. 2017, 15(3):183.
