CD Genomics-the genomics service company

Overview of PacBio SMRT sequencing: principles, workflow, and applications

PacBio’s SMRT (single-molecule, real-time) sequencing is one of the most commonly used third-generation sequencing technologies. Compared with the previous two generations, PacBio long-read sequencing requires no PCR amplification and produces reads that can be about 100 times longer than those of next-generation sequencing (NGS).

PacBio SMRT sequencing applications

PacBio SMRT sequencing can be used for de novo genome sequencing to obtain high-quality genome sequences, for capturing full-length transcriptome information and detecting alternatively spliced isoforms, for detecting diverse mutations in target regions, and for identifying epigenetic modifications, among other applications.

The principle of PacBio SMRT sequencing

Zero-mode waveguides (ZMWs) are subwavelength optical nanostructures fabricated in a thin metallic film. They confine the excitation volume to the attoliter range, which allows individual molecules to be isolated for optical analysis at physiologically relevant concentrations of fluorescently labeled biomolecules. Arrays of such nanostructures can also be engineered into systems for real-time analysis of large numbers of single-molecule reactions or binding events, which is the principle behind PacBio SMRT sequencing.

Figure 2. A single SMRT Cell. Each SMRT Cell contains 150,000 ZMWs. Approximately 35,000-75,000 of these wells produce a read in a run lasting 0.5-4 h, resulting in 0.5-1 Gb of sequence.
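
The figures in this caption imply a loading efficiency and an average yield per productive well. The following is a minimal sketch of that arithmetic, using only the numbers from the caption. The variable and function names are illustrative, and pairing the low read count with the low yield (and high with high) is an assumption made for the example.

```python
# Numbers taken from the Figure 2 caption; names are illustrative only.
TOTAL_ZMWS = 150_000
PRODUCTIVE_ZMWS = (35_000, 75_000)   # wells that produce a read (low, high)
YIELD_BASES = (0.5e9, 1.0e9)         # sequence per SMRT Cell in bases (low, high)


def productive_fraction(productive, total=TOTAL_ZMWS):
    """Fraction of ZMWs on the SMRT Cell that yield a read."""
    return productive / total


def bases_per_read(yield_bases, productive):
    """Average number of bases produced per productive ZMW."""
    return yield_bases / productive


# Assumption: the low yield goes with the low read count, and high with high.
for productive, yield_bases in zip(PRODUCTIVE_ZMWS, YIELD_BASES):
    frac = productive_fraction(productive)
    per_read = bases_per_read(yield_bases, productive)
    print(f"{productive} reads: {frac:.0%} of ZMWs, ~{per_read / 1000:.1f} kb per read")
```

Under these assumptions, roughly a quarter to a half of the wells are productive, and each productive well contributes on the order of 13-14 kb.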

PacBio SMRT sequencing uses ZMWs to distinguish the desired fluorescent signal from the strong fluorescent background caused by unincorporated, free-floating nucleotides. A DNA polymerase bound to the template DNA strand is anchored to the glass surface at the bottom of a ZMW. Laser light enters through the bottom of the ZMW but does not penetrate it completely, because the ZMW dimensions are smaller than the wavelength of the light. As a result, only a small volume near the bottom is illuminated, which allows selective excitation and detection of the light emitted by nucleotides as they are incorporated during strand elongation.

Library construction

The workflow for library construction involves the following steps:

  • Determine the quality of genomic DNA (gDNA)
  • Shear gDNA using a g-TUBE (Covaris)
  • Select size and adjust concentration
  • Repair DNA damage and ends of fragmented DNA
  • Conduct DNA purification
  • Ligate blunt adapters to the blunt-ended fragments
  • Purify template for submission to a sequencer

The template, called a SMRTbell, is a closed single-stranded circular DNA, which is created by ligating hairpin adapters to both ends of target double-stranded DNA (dsDNA) molecules.

Figure 1. Template Preparation Workflow for PacBio RS II system.
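
Because a SMRTbell is circular, a polymerase that keeps travelling around it passes over both strands of the insert again and again, separated by the adapters. The toy model below sketches that idea. The insert and adapter sequences are made-up placeholders (not real PacBio adapter sequences), the function names are hypothetical, and the model ignores the fact that the polymerase actually synthesizes the complement of the template.

```python
# Toy model of a SMRTbell template. All sequences are invented placeholders.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def smrtbell_circle(insert, adapter):
    """Linearized view of the circle: adapter, one strand, adapter, other strand."""
    return adapter + insert + adapter + reverse_complement(insert)


def polymerase_read(circle, n_bases):
    """Sequence encountered when travelling n_bases around the circle from position 0."""
    return "".join(circle[i % len(circle)] for i in range(n_bases))


def subreads(read, adapter):
    """Split a polymerase read at adapter sequences, dropping empty pieces."""
    return [piece for piece in read.split(adapter) if piece]


insert = "AAGGCTTACT"   # placeholder insert
adapter = "GGGGGG"      # placeholder adapter, chosen so it cannot occur in the insert
circle = smrtbell_circle(insert, adapter)

# Two full laps around the circle give four passes over the insert.
for piece in subreads(polymerase_read(circle, 2 * len(circle)), adapter):
    print(piece)
```

In this sketch the passes alternate between the insert and its reverse complement, which is why repeated passes over one molecule can later be combined into a consensus.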

Sequencing

As shown in Figure 3, a SMRTbell (grey) diffuses into a ZMW, and the adapter binds to a polymerase immobilized at the bottom. Each of the four nucleotides is labeled with a different fluorescent dye (indicated in red, yellow, green, and blue for G, C, T, and A, respectively), so that they have distinct emission spectra. While a nucleotide is held in the detection volume by the polymerase, a light pulse that identifies the base is produced. (1) A fluorescently labeled nucleotide binds to the template in the active site of the polymerase. (2) The fluorescence output of the color corresponding to the incorporated base (yellow for base C in this example) is elevated. (3) The dye-linker-pyrophosphate product is cleaved from the nucleotide and diffuses out of the ZMW, ending the fluorescence pulse. (4) The polymerase translocates to the next position. (5) The next nucleotide binds to the template in the active site of the polymerase and initiates the next fluorescence pulse, which corresponds to base A here.

Figure 3. Sequencing via light pulses.
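
The color-to-base relationship described for Figure 3 can be written as a simple lookup. This is only an illustration of the mapping. Actual base calling works on continuous intensity traces and is far more involved, and the function name here is hypothetical.

```python
# Dye colors as described for Figure 3: red, yellow, green, blue for G, C, T, A.
COLOR_TO_BASE = {"red": "G", "yellow": "C", "green": "T", "blue": "A"}


def call_bases(pulses):
    """Translate an ordered list of pulse colors into a base sequence."""
    return "".join(COLOR_TO_BASE[color] for color in pulses)


# The two pulses walked through in the text: C (yellow) followed by A (blue).
print(call_bases(["yellow", "blue"]))  # prints "CA"
```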

Bioinformatics Analysis

Available bioinformatics analyses include de novo assembly, reference genome mapping, genome annotation (prediction of pathogenic and susceptibility genes, non-coding RNAs, and CRISPRs), gene function annotation (COG/GO/KEGG), SNP/InDel identification, comparative genomics analysis, evolutionary analysis, and estimation of divergence time.

A comparison of the RS II and Sequel sequencing platforms

Third-generation sequencing has been widely used in genome research since the successful launch of the commercial PacBio RS II sequencing instrument in 2013. After continued improvement, PacBio launched its upgraded third-generation sequencer, the PacBio Sequel system, in October 2015. The two platforms are compared below.

Table 1. Comparison of the RS II and Sequel sequencing platforms

                         RS II            Sequel
Average read length      10-15 kb         8-12 kb
ZMWs per SMRT Cell       150,000          1,000,000
Data per SMRT Cell       500 Mb-1 Gb      5-10 Gb
SMRT Cells per run       1-16             1-16
Run time per SMRT Cell   0.5-6 hours      0.5-6 hours
Multiplexed amplicons    384              1,536

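The Sequel platform has clear advantages over the RS II platform: with roughly seven times as many ZMWs and about ten times as much data per SMRT Cell, it delivers higher throughput in the same run time, which shortens project timelines and lowers cost.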
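
To make the throughput difference concrete, the sketch below multiplies the per-cell yield from Table 1 by the maximum number of SMRT Cells per run. The dictionary layout and function name are illustrative, and the result is simple arithmetic on the table values rather than a measured specification.

```python
# Values copied from Table 1; the data structure and names are illustrative.
PLATFORMS = {
    "RS II": {"zmws": 150_000, "gb_per_cell": (0.5, 1.0), "max_cells": 16},
    "Sequel": {"zmws": 1_000_000, "gb_per_cell": (5.0, 10.0), "max_cells": 16},
}


def max_run_yield_gb(platform):
    """Yield range in Gb for a run that uses the maximum number of SMRT Cells."""
    spec = PLATFORMS[platform]
    low, high = spec["gb_per_cell"]
    return low * spec["max_cells"], high * spec["max_cells"]


for name in PLATFORMS:
    low, high = max_run_yield_gb(name)
    print(f"{name}: {low:g}-{high:g} Gb per 16-cell run")

zmw_ratio = PLATFORMS["Sequel"]["zmws"] / PLATFORMS["RS II"]["zmws"]
print(f"Sequel has about {zmw_ratio:.1f}x as many ZMWs per SMRT Cell")
```

On these numbers, a full 16-cell run yields 8-16 Gb on the RS II and 80-160 Gb on the Sequel.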

Features of PacBio SMRT sequencing

  • Single-molecule resolution

Because PacBio SMRT sequencing requires no PCR amplification, it can readily cover high-GC and highly repetitive regions and quantifies low-frequency mutations more accurately.

  • Long reads

PacBio SMRT sequencing provides very long reads. The average read length is 8-15 kb, and the longest reads reach 40-70 kb.

  • Speed

PacBio SMRT sequencing is fast, reading at a rate of 10 nt per second.

  • High consensus accuracy

Rapid single-molecule sequencing comes with an obvious drawback: the raw error rate of PacBio SMRT sequencing is relatively high and can reach 10%-15%, a weakness shared by most current single-molecule sequencing technologies. Unlike in next-generation sequencing, however, the errors are random and without bias. They can therefore be corrected effectively by sequencing the same region multiple times, and the consensus accuracy of PacBio SMRT sequencing can exceed 99.999% (Q50).

  • Direct identification of base modification

The base modifications can be directly detected when the genome is sequenced.
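
The link between random errors, repeated passes, and consensus accuracy can be illustrated with a simple binomial model. The sketch below is not PacBio's consensus algorithm. It assumes independent substitution errors only (real reads also contain insertions and deletions) and an odd number of passes, and it bounds the chance that a majority of passes are wrong at one position. The function names are hypothetical.

```python
import math


def phred(error_prob):
    """Phred quality score for a given per-base error probability."""
    return -10 * math.log10(error_prob)


def majority_error_bound(per_pass_error, n_passes):
    """Probability that more than half of n independent passes are wrong at a position.

    With an odd n_passes this is an upper bound on the majority-vote error,
    because it counts every such case as a wrong call even when the wrong
    passes disagree with each other.
    """
    k_min = n_passes // 2 + 1
    return sum(
        math.comb(n_passes, k)
        * per_pass_error ** k
        * (1 - per_pass_error) ** (n_passes - k)
        for k in range(k_min, n_passes + 1)
    )


# 99.999% accuracy corresponds to an error probability of 1e-5, i.e. about Q50.
print(f"Q score for 1e-5 error: {phred(1e-5):.1f}")

# Assumed per-pass error of 15%, the upper end of the range quoted above.
for n in (1, 5, 15, 31):
    bound = majority_error_bound(0.15, n)
    print(f"{n:>2} passes: error bound {bound:.2e} (about Q{phred(bound):.0f})")
```

The bound falls quickly as passes are added, which is the intuition behind correcting random errors through repeated sequencing.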

CD Genomics provides integrated PacBio SMRT sequencing services, including long-read metagenomic sequencing, bacterial whole-genome de novo sequencing, fungal whole-genome de novo sequencing, full-length transcript sequencing (Iso-Seq), human whole-genome PacBio SMRT sequencing, and full-length 16S/18S/ITS amplicon sequencing. If you are interested in our services, please feel free to contact us.
