Bacterial Whole Genome de novo Sequencing

CD Genomics provides PacBio Single Molecule, Real-Time (SMRT) sequencing to strengthen your bacterial whole genome sequencing research. A comprehensive view of the bacterial genome, including genes, regulatory regions, IS elements, phage integration sites, and base modifications, is vital to understanding key traits such as antibiotic resistance, virulence, and metabolism.

SMRT Sequencing produces long reads (average >15,000 bp, some reads >100,000 bp) with very high consensus accuracy, which makes it especially helpful for de novo genome assembly. Repetitive stretches of DNA are abundant and are one of the main technical challenges that hinder accurate sequencing and genome assembly. In bacteria, the rRNA gene operon is often the largest region of repetitive sequence and ranges in size from 5 to 7 kb. Microbial whole genome sequencing on Illumina HiSeq platforms uses sequencing-by-synthesis technology, which is limited by its read length, currently 50 to 300 bp. Because it also requires PCR amplification of DNA templates before sequencing, there is potential for base-composition bias, which may skew the G+C content of the resulting sequences.

SMRT long reads can overcome the problems of extreme GC content and highly repetitive sequence in bacterial genomes, so the genome can often be assembled into a single contig. Among all assemblies compared, the PacBio assembly recovered the highest number of core and virulence proteins, as well as housekeeping genes based on whole-genome multilocus sequence typing. This makes it easier to achieve a complete assembly or fine map, and helps you rapidly advance your research into genetic structure and function.

Project Workflow

Our highly experienced team applies quality management at every step of the procedure to ensure comprehensive and accurate results. The general workflow for bacterial whole genome sequencing is outlined below.

  • Sample Amount Recommendation

1. DNA amount: ≥ 5 μg
2. DNA purity: OD260/280 = 1.8-2.0, without degradation or RNA contamination

  • Sequencing Strategy: 2-20 kb Library, ≥ 60X genome coverage depth
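
The sample and sequencing recommendations above can be turned into a quick planning check. The following is a minimal sketch, not part of the CD Genomics workflow: the function names and the 5 Mb example genome size are illustrative assumptions, and the thresholds are simply the values listed above (≥ 5 μg DNA, OD260/280 of 1.8 to 2.0, ≥ 60X depth).

```python
import math


def required_yield_bp(genome_size_bp: int, target_depth: float = 60.0) -> int:
    """Total sequenced bases needed to reach the target coverage depth."""
    return math.ceil(genome_size_bp * target_depth)


def expected_depth(total_read_bases: int, genome_size_bp: int) -> float:
    """Average coverage depth = total sequenced bases / genome size."""
    return total_read_bases / genome_size_bp


def sample_meets_recommendation(dna_ug: float, od260_280: float) -> bool:
    """Check DNA amount and purity against the recommended thresholds.

    Degradation and RNA contamination are not covered here; they need
    a separate check (for example, gel electrophoresis).
    """
    return dna_ug >= 5.0 and 1.8 <= od260_280 <= 2.0


if __name__ == "__main__":
    genome_size = 5_000_000  # hypothetical 5 Mb bacterial genome
    print("Bases needed for 60X:", required_yield_bp(genome_size))
    print("Sample acceptable:", sample_meets_recommendation(6.2, 1.85))
```

Average depth is used here as a rough planning figure; actual coverage varies along the genome.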

Analysis Pipeline

At CD Genomics, we use the long-read PacBio Sequel platform to support researchers all over the world with their bacterial de novo whole genome sequencing needs. Our bioinformatics analysis includes genome assembly and polishing, gene prediction, genome annotation, and comparative genome analysis across species. As an industry leader in sequencing, CD Genomics not only concentrates on developing cutting-edge technology but is also willing to share its state-of-the-art platforms and expertise with clients to support their studies. We are committed to providing customers with the best services and finest results.

1. What indicators can be used to evaluate bacterial genome assembly?

Common indicators of genome assembly quality include scaffold N50, N%, the number of scaffolds, and the total number of base pairs.
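
As an illustration of how these indicators are computed, here is a minimal sketch in Python. It assumes the scaffolds are already loaded as plain strings and interprets N% as the percentage of undetermined (N) bases in the assembly; the function names are hypothetical and not part of any CD Genomics pipeline.

```python
from typing import Dict, Sequence


def n50(lengths: Sequence[int]) -> int:
    """Length L such that scaffolds of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


def assembly_stats(scaffolds: Sequence[str]) -> Dict[str, float]:
    """Summarize an assembly given its scaffold sequences."""
    lengths = [len(s) for s in scaffolds]
    total = sum(lengths)
    n_bases = sum(s.upper().count("N") for s in scaffolds)
    return {
        "scaffold_count": len(scaffolds),
        "total_bp": total,
        "scaffold_n50": n50(lengths),
        # Assumption: N% = share of undetermined bases in the assembly
        "n_percent": 100.0 * n_bases / total if total else 0.0,
    }


if __name__ == "__main__":
    toy = ["A" * 50, "C" * 30, "ACGTN" * 4]  # toy scaffolds, not real data
    print(assembly_stats(toy))
```

A closed bacterial chromosome would ideally show one scaffold per replicon, an N50 equal to the chromosome length, and an N% of zero.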

2. How can a zero-gap assembly be achieved?

Currently, complete sequence maps of more than 90% of bacterial strains can be constructed by combining the Illumina HiSeq and PacBio SMRT systems. The PacBio RS II system can achieve complete genome assembly even in regions of high or low GC content and in repetitive sequences. Complete sequence maps of the remaining 10% of bacterial strains can be achieved with additional Sanger sequencing data. CD Genomics has completed hundreds of gap-free bacterial genome assemblies.

PacBio But Not Illumina Technology Can Achieve Fast, Accurate and Complete Closure of the High GC, Complex Burkholderia pseudomallei Two-Chromosome Genome


Clostridium autoethanogenum strain JA1-1 (DSM 10061) is an acetogen capable of fermenting CO, CO2 and H2 (e.g. from syngas or waste gases) into biofuel ethanol and commodity chemicals such as 2,3-butanediol. A draft genome sequence consisting of 100 contigs has been published.


A closed, high-quality genome sequence for C. autoethanogenum DSM10061 was generated using only PacBio sequencing, achieving a "0 Gap" assembly without the need for manual finishing. In contrast, many gaps remained in the genome assemblies obtained with the Illumina and 454 sequencing platforms. C. autoethanogenum and C. ljungdahlii were indistinguishable at the 16S rRNA gene level and had high similarity scores. Whole-genome sequencing revealed significant differences between the two in the CRISPR system, hydrogenases, and other features, which were difficult to detect with second-generation sequencing.


Table 1. Assembly statistics for strain DSM 10061.

Figure 1. Comparison of DSM10061 genome assemblies.


Brown, S.D.; et al. Comparison of single-molecule sequencing and hybrid approaches for finishing the genome of Clostridium autoethanogenum and analysis of CRISPR systems in industrially relevant Clostridia. Biotechnology for Biofuels. 2014, 7(2): 27-27.

For Research Use Only. Not for use in diagnostic procedures.