What Is DNA Sequencing?
DNA sequencing is the process of determining the order of the four nucleotide bases (adenine, thymine, cytosine, and guanine) that make up a DNA molecule and encode its genetic information. In the DNA double helix, each base bonds with a specific partner to form a unit called a base pair (bp): adenine (A) pairs with thymine (T), and cytosine (C) pairs with guanine (G). The human genome contains around 3 billion base pairs that provide the instructions for the creation and maintenance of a human being. The base-paired structure makes DNA well suited to storing a vast amount of genetic information. Complementary base pairing is the basis of the mechanisms by which DNA molecules are copied, transcribed, and translated, and it also underlies most DNA sequencing methods. Thanks to tremendous improvements in DNA sequencing technologies and methods, whole-genome sequencing has become both possible and affordable.
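The pairing rules described above can be written down in a few lines of code. The following is a minimal Python sketch, not part of any sequencing toolkit; the function names complement and reverse_complement are chosen here for illustration only. It shows how one strand fully determines its partner.

```python
# Watson-Crick pairing rules from the text: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq):
    """Return the base-by-base complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq):
    """Return the partner strand, written in its own 5'->3' direction."""
    return complement(seq)[::-1]


if __name__ == "__main__":
    strand = "ATGCGT"
    print(complement(strand))          # TACGCA
    print(reverse_complement(strand))  # ACGCAT
```

Because the two strands run in opposite directions, the partner strand is usually reported as the reverse complement rather than the plain complement.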
DNA Sequencing Methods
Sanger sequencing was developed by the English biochemist Frederick Sanger in the 1970s. The Sanger method is a classical DNA sequencing method that uses chain-terminating ddNTPs (dideoxynucleotides, N = A, T, G, or C), which are fluorescently labeled in modern implementations. Because a ddNTP lacks the 3′-hydroxyl group needed to form the next bond, its incorporation prevents the addition of another nucleotide. You can view our article ‘Sanger Sequencing: Introduction, Principle, and Protocol’ to learn more about this method.
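The chain-termination idea can be illustrated with a deliberately idealized Python sketch. It assumes exactly one terminated fragment per position, perfect size separation, and a template written 3'->5' so that the new strand reads left to right; real reactions are stochastic and noisy. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def terminated_fragments(template):
    """Idealized reaction: one terminated fragment per position.

    Each fragment is recorded as (length, labeled terminal base), where
    the terminal base is the ddNTP that stopped extension.
    """
    new_strand = [COMPLEMENT[base] for base in template.upper()]
    return [(i + 1, base) for i, base in enumerate(new_strand)]


def read_sequence(fragments):
    """Order fragments by length, as electrophoresis does, and read the labels."""
    return "".join(base for _, base in sorted(fragments))


if __name__ == "__main__":
    fragments = terminated_fragments("TACGGA")
    fragments.reverse()              # fragments start out unordered in the tube
    print(read_sequence(fragments))  # ATGCCT
```

Sorting by fragment length stands in for the size-separation step: the label on the shortest fragment gives the first base of the new strand, the next shortest gives the second, and so on.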
Next-generation sequencing (NGS, also known as massively parallel sequencing) technologies have largely supplanted Sanger sequencing thanks to advantages such as high throughput, cost efficiency, and speed. NGS can determine the sequence of millions of fragments simultaneously. NGS is a short-read approach: it requires the construction of a small-fragment library, followed by deep sequencing, raw data preprocessing, DNA sequence alignment, assembly, annotation, and downstream analysis.
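To make the alignment and "deep sequencing" steps concrete, here is a toy Python sketch that places short reads on a reference by exact string matching and counts how many reads cover each base. This is an assumption-laden illustration: production aligners tolerate mismatches and use indexed data structures, and the function names and example sequences are invented here.

```python
def align_exact(read, reference):
    """Return every 0-based position where the read matches the reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


def depth_per_base(reads, reference):
    """Count how many aligned reads cover each reference position.

    A read that matches in several places is counted at all of them.
    """
    depth = [0] * len(reference)
    for read in reads:
        for pos in align_exact(read, reference):
            for i in range(pos, pos + len(read)):
                depth[i] += 1
    return depth


if __name__ == "__main__":
    reference = "ACGTTGCA"
    reads = ["ACGT", "GTTG", "TGCA"]
    depth = depth_per_base(reads, reference)
    print(depth)                    # [1, 1, 2, 2, 2, 2, 1, 1]
    print(sum(depth) / len(depth))  # mean depth: 1.5
```

The mean of the per-base counts is the average sequencing depth; higher depth means each base is supported by more independent reads.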
Emerging third-generation sequencing, also known as long-read sequencing, includes PacBio SMRT sequencing and Oxford Nanopore sequencing. These technologies can examine billions of DNA and RNA templates and simultaneously detect variable methylation without bias. Long-read methods can detect more variation, some of which cannot be observed with short-read sequencing alone.
Figure 1. The history of DNA sequencing technologies.
Applications of DNA Sequencing Technologies
DNA sequencing reveals the genetic information that is carried in a particular DNA segment, a whole genome or a complex microbiome. Scientists can use sequence information to determine which genes and regulatory instructions are contained in the DNA molecule. The DNA sequence can be screened for characteristic features of genes, such as open reading frames (ORFs) and CpG islands. Homologous DNA sequences from different organisms can be compared for evolutionary analysis between species or populations. Notably, DNA sequencing can reveal changes in a gene that may cause a disease.
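Screening a sequence for ORFs and CpG-rich regions can be sketched in Python as follows. This is a simplified illustration under stated assumptions: only the forward strand is scanned, an ORF is taken to be an ATG followed by an in-frame stop codon, and the CpG measure is the commonly used observed/expected ratio. The function names and the min_codons parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end) spans of forward-strand ORFs.

    Each ORF runs from an ATG to the first in-frame stop codon; end is
    exclusive and includes the stop. min_codons counts codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                break
    return orfs


def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def cpg_observed_expected(seq):
    """Observed CpG count divided by the count expected from C and G frequencies."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


if __name__ == "__main__":
    print(find_orfs("CCATGAAATGGTAACC"))    # [(2, 14)]
    print(gc_content("CGCGAT"))
    print(cpg_observed_expected("CGCGCG"))  # 2.0
```

In practice both scans are run in a sliding window over both strands, and a region is flagged as a candidate CpG island only when its GC content and observed/expected ratio stay above chosen thresholds over a minimum length.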
DNA sequencing is used in medicine for the diagnosis and treatment of diseases and in epidemiological studies. Sequencing also has the power to revolutionize food safety and sustainable agriculture across animal, plant, and public health, improving agriculture through more effective plant and animal breeding and reducing the risks from disease outbreaks. Additionally, DNA sequencing can be used to protect and improve the natural environment for both humans and wildlife.