Microbial Whole Genome Sequencing

With decades of experience in genome sequencing, CD Genomics is devoted to providing accurate and affordable microbial whole genome sequencing services. We combine Illumina (short-read) and PacBio (long-read) platforms for microbial re-sequencing and complete de novo genome sequencing. Our expertise is complemented by flexible sequencing strategies and professional bioinformatics pipelines.

Introduction to Microbial Whole Genome Sequencing

Microorganisms differ greatly from higher eukaryotes. Besides having smaller genomes, most bacteria have a single circular chromosome (although some have more than one chromosome, linear chromosomes, or a combination of linear and circular chromosomes). Genes in microbial genomes are usually single continuous stretches of DNA, although several types of introns do occur, rarely, in bacterial genomes. The presence of plasmids in bacteria is another major difference. Plasmids are extra-chromosomal DNA molecules, typically circular, that can be passed between cells by horizontal gene transfer, mediating the rapid evolution of microorganisms. A virus is a non-cellular infectious agent consisting of a core of DNA or RNA surrounded by a protein coat.

Microbial whole genome sequencing yields large volumes of data, enabling a comprehensive evaluation of all genetic features of an isolated microorganism. Shotgun sequencing is the primary strategy for microbial whole genome sequencing. It does not require labor-intensive mapping and cloning, which saves considerable time and money. Furthermore, multiplexed high-throughput sequencing allows hundreds of bacteria or viruses to be sequenced at the same time. In whole genome shotgun sequencing, the genome is broken up into small fragments that are sequenced and then assembled computationally based on their overlapping regions, so no reference genome is required. PacBio SMRT technology enables us to provide bacterial and fungal de novo whole genome sequencing that generates more accurate and contiguous sequences.
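The overlap-based assembly idea described above can be illustrated with a toy greedy assembler. This is a minimal sketch, not the pipeline used for the service: the function names, the example genome string and the `min_len` threshold are all assumptions made for illustration, and the reads are taken to be error-free and from a single strand.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the longest such overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap.

    Toy model only: assumes error-free, same-strand reads and ignores
    repeats, contained reads and reverse complements.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: the contigs stay separate
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]
    return contigs


# Hypothetical 15 bp "genome" cut into four overlapping fragments.
genome = "ATGGCGTACGTTAGC"
reads = [genome[0:8], genome[3:11], genome[6:14], genome[8:15]]
print(greedy_assemble(reads))
```

Production assemblers typically rely on overlap-layout-consensus or de Bruijn graph methods and must cope with sequencing errors and repeats, which this sketch deliberately leaves out.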

Microbial whole genome sequencing is crucial for precise microbial identification, the generation of complete reference genomes (de novo sequencing), comparative genomic studies (re-sequencing), and genomic exploitation. Comparative genomic studies can identify individual genetic variations and large-scale structural variations within a population for which a reference genome is available, so that evolutionary characteristics and phylogenetic relationships can be inferred. Microbial whole genome sequencing also makes gene finding and annotation possible. Once multiple genes have been annotated, novel biochemical pathways that may be beneficial for medicine and biotechnology are likely to be identified.

Advantages of Microbial Whole Genome Sequencing

  • A time-effective and cost-efficient approach
  • Broad applications: de novo sequencing, gene annotations, comparative genomic studies, evolutionary studies of microorganisms, etc.
  • Drug discovery and development: assess the contribution of DNA to pathogenesis, and understand the role of mobile elements in drug resistance and transmission

Microbial Whole Genome Sequencing Workflow

CD Genomics offers integrated genome sequencing services for bacteria, yeast, fungi, phages and viruses. Our experienced team applies strict quality management at every step to ensure reliable and unbiased results on the Illumina HiSeq and/or PacBio SMRT systems. The general workflow for microbial whole genome sequencing is outlined below.

Service Specification

Sample requirements and preparation
  • Pure cultured microorganisms or extracted genomic DNA sample
  • Recommended DNA amount for submission: ≥ 2 µg at a concentration of ≥ 20 ng/µL, with OD260/280 = 1.8 to 2.0
  • All DNA samples are validated for purity and quantity and subject to quality control prior to processing. Short-insert or long-insert libraries are then generated using TruSeq or Nextera protocols.
  • Sequencing platforms: Illumina HiSeq (PE150), PacBio SMRT, and MGI DNBSEQ-T7/DNBSEQ-G400
  • Sequencing coverage ≥ 100X
  • PacBio SMRT achieves ~10 kb average read lengths, with some as long as 60 kb.
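As a rough guide to what the ≥ 100X coverage figure means for paired-end 150 bp (PE150) data, average coverage can be estimated as (number of read pairs × 2 × read length) / genome size. The sketch below works this through; the 5 Mb genome size is a hypothetical example and the function names are illustrative, not part of any service tooling.

```python
import math


def read_pairs_needed(genome_size_bp, coverage, read_length_bp=150):
    """Read pairs required to reach `coverage` on average (paired-end reads)."""
    return math.ceil(coverage * genome_size_bp / (2 * read_length_bp))


def mean_coverage(read_pairs, genome_size_bp, read_length_bp=150):
    """Average depth obtained from `read_pairs` paired-end reads."""
    return read_pairs * 2 * read_length_bp / genome_size_bp


# Hypothetical 5 Mb bacterial genome sequenced to 100X with PE150 reads.
pairs = read_pairs_needed(5_000_000, 100)
print(pairs, round(mean_coverage(pairs, 5_000_000), 2))
```

This is an average only: actual depth varies along the genome, and reads lost to quality filtering reduce the effective coverage.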
Bioinformatics Analysis

We provide customized bioinformatics analysis:
  • Raw data quality control
  • De novo assembly, reference genome mapping
  • Genome annotation (prediction of pathogenic and susceptibility genes, non-coding RNAs, and CRISPRs)
  • Gene function annotation (COG/GO/KEGG)
  • SNP/InDel identification and comparative genomics analysis
  • Evolutionary analysis and divergence time estimation
  • More data mining upon your request
Analysis pipeline

CD Genomics provides a full microbial whole genome sequencing service package, including sample standardization, library construction, deep sequencing, raw data quality control, genome assembly, and downstream bioinformatics analysis. We can tailor this pipeline to your research interests. If you have additional requirements or questions, please feel free to contact us.


Fraser C. M., et al. Microbial genome sequencing. Nature, 2000, 406(6797): 799.

1. How to prepare samples?

You can submit pure cultured microorganisms or extracted genomic DNA samples for microbial whole genome sequencing. For cultured microorganisms, the purified cells should be pelleted in a 1.5 mL tube with the culture medium removed. We need a minimum of 1 million cells, and the more cells you can provide, the better. For genomic DNA samples, the recommended amount for submission is 2 µg or more at a concentration of more than 20 ng/µL, and the OD260/OD280 ratio should be between 1.8 and 2.0. Samples should be shipped frozen on dry ice.

Faeces samples: store faeces samples at -80 °C or below. In our experience, fresh faeces yield better-quality DNA.
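The genomic DNA guidelines above can be turned into a simple pre-submission check. This is only an illustrative sketch: the function name is hypothetical, and the 20 ng/µL threshold is treated as inclusive, which is an assumption.

```python
def check_dna_sample(total_ug, conc_ng_per_ul, od_260_280):
    """Return a list of problems; an empty list means the guidelines are met."""
    problems = []
    if total_ug < 2.0:
        problems.append("total DNA below 2 ug")
    if conc_ng_per_ul < 20.0:  # assumption: exactly 20 ng/uL is accepted
        problems.append("concentration below 20 ng/uL")
    if not 1.8 <= od_260_280 <= 2.0:
        problems.append("OD260/OD280 outside 1.8 to 2.0")
    return problems


print(check_dna_sample(3.0, 50.0, 1.9))  # sample within all guidelines
print(check_dna_sample(1.0, 10.0, 2.3))  # sample failing all three checks
```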

2. How to increase the accuracy of genome assembly when utilizing PacBio SMRT system?

The raw read error rate of the PacBio SMRT system, at around 14%, is substantially higher than the 0.1 to 1% error rate of other leading sequencing systems. However, the errors are randomly distributed, so with sufficient coverage a very high-quality consensus sequence can be obtained across all bases. Additionally, the SMRT sequencing system is capable of sequencing regions of high GC content, leading to much more uniform coverage of the genome.
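The effect of random errors on consensus quality can be illustrated with a simple binomial model. The sketch below is a simplification under stated assumptions, not a description of any actual consensus algorithm: it treats errors as independent between reads and, pessimistically, assumes every erroneous read reports the same wrong base, so the result is an upper bound on the per-position error of a majority vote. The function name and the chosen depths are illustrative.

```python
from math import comb


def consensus_error(per_read_error, depth):
    """Pessimistic per-position error of a majority vote over `depth` reads.

    Assumes independent errors that all agree on one wrong base; use an
    odd `depth` so that ties cannot occur.
    """
    need = depth // 2 + 1  # wrong reads required to outvote the true base
    return sum(
        comb(depth, k) * per_read_error**k * (1 - per_read_error)**(depth - k)
        for k in range(need, depth + 1)
    )


# Per-read error of 14% (the raw error rate quoted above) at several depths.
for depth in (1, 5, 15, 31):
    print(depth, f"{consensus_error(0.14, depth):.2e}")
```

Under this model the consensus error falls rapidly as depth increases, which is why randomly distributed errors are far less damaging than systematic ones.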

There are three ways to ensure the accuracy of genome assembly: (i) before assembly, correct the reads by consensus; (ii) correct the assembled sequence using the sequencing data; (iii) correct the assembled sequence using high-quality next-generation sequencing data. After these three rounds of correction, the accuracy of the final assembly can reach 99.99%.

3. How to achieve zero gap?

Currently, a complete sequence map can be constructed for more than 90% of bacterial strains by combining the Illumina HiSeq and PacBio SMRT systems. For the remaining 10% of bacterial strains, the complete sequence map can be achieved with additional Sanger sequencing data.

4. Can complete genome assembly be achieved even in the regions of high or low GC content, as well as repetitive sequences?

The PacBio RS II system can overcome these challenges. CD Genomics has completed hundreds of bacterial genome assemblies without gaps.

5. What are the advantages of microbial whole genome sequencing in clinical practice?

Whole genome sequencing is becoming a routine tool for clinical microbiology. It can reduce diagnostic time and improve control and treatment, and it improves our understanding of microbial evolution, outbreaks and transmission events. Conventional diagnostic procedures often include multiple cultivation and incubation steps followed by species identification, susceptibility testing and typing, which may take several weeks. A number of other approaches (such as PCR-based methods) are cheap and rapid, but limited in sensitivity. Although whole genome sequencing is still relatively expensive, its price and turnaround time will most likely fall as competition among sequencing platforms increases.

Hasman H., et al. Rapid whole genome sequencing for the detection and characterization of microorganisms directly from clinical samples. Journal of Clinical Microbiology, 2013: JCM.02452-13.

The Genome Sequence of the Highly Acetic Acid-Tolerant Zygosaccharomyces bailii-Derived Interspecies Hybrid Strain ISA1307, Isolated from a Sparkling Wine Plant

Journal: DNA Research
Impact factor: 5.404
Published: June 2014


This work describes the genome sequencing and annotation of the yeast strain ISA1307, isolated from a sparkling wine production plant. This strain, formerly classified as Zygosaccharomyces bailii, is an interspecies hybrid between Z. bailii and a closely related species.

Strains Cultivation

  • The prototrophic yeast isolate ISA1307
  • Z. bailii ATCC58445T
  • S. cerevisiae BY4741 and BY4743


  • Quantification of genomic DNA from ISA1307 and S. cerevisiae
  • SYBR Green I-based staining protocol


  • Pulsed field gel electrophoresis (PFGE)


  • Whole genome shotgun sequencing at CD Genomics
  • Illumina, paired-end.
  • Genome assembly and annotations


1. The ISA1307 Hybrid strain

Comparison of the sequences of housekeeping genes (including RPB1, RPB2, EF1-α, and β-tubulin) revealed that the ISA1307 strain is an interspecies hybrid between Z. bailii and a closely related species.

2. Genome assembly

Quantification by flow cytometry estimated the genome size of ISA1307 at around 22.0 Mb, with S. cerevisiae BY4741 and BY4743 used for calibration (Figure 1A). PFGE profiling of ISA1307 revealed 13 chromosomal bands, with sizes ranging from 733 to 2,120 kb (Figure 1B).

Figure 1. Estimation of genome size and karyotyping of the ISA1307 strain. (A) Representative cell analysis histogram. (B) Karyotype of the reference strain Z. bailii ATCC58445 (lane 2) and of the ISA1307 strain (lane 1).

The final reconstructed genome of the ISA1307 strain is distributed over 154 scaffolds. The total scaffold length is 21,241,152 bp, which corresponds to 96% of the genome size estimated by flow cytometry (Table 1).
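The proportion quoted above follows directly from the two figures in the text. A minimal check, in which the function name is illustrative and 22.0 Mb is itself a rounded estimate:

```python
def assembled_fraction(assembled_bp, estimated_genome_bp):
    """Fraction of the estimated genome size covered by the assembly."""
    return assembled_bp / estimated_genome_bp


# Summed scaffold length vs. the ~22.0 Mb flow cytometry estimate.
fraction = assembled_fraction(21_241_152, 22_000_000)
print(f"{fraction:.1%}")
```

The result is between 96% and 97%, consistent with the reported figure of 96%.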

3. Annotation

A total of 9,925 genes are predicted to be encoded by the ISA1307 strain, including 4,385 duplicated genes (each present in two copies, 8,770 genes in total) and 1,155 single-copy genes. The predicted functions involve “metabolism and generation of energy”, “protein folding, modification and targeting”, and “biogenesis of cellular components”. The authors further studied the ISA1307 genes and proteins involved in these functions.

Figure 2. Functional classes of genes predicted to be encoded by the genome of the ISA1307 strain.

Reference

Mira N. P., et al. The genome sequence of the highly acetic acid-tolerant Zygosaccharomyces bailii-derived interspecies hybrid strain ISA1307, isolated from a sparkling wine plant. DNA Research, 2014, 21(3): 299-313.

For Research Use Only. Not for use in diagnostic procedures.