Cluster Analysis


What Is Cluster Analysis?

Cluster analysis is a machine learning technique that groups data points. Given a set of data points, a clustering algorithm assigns each point to a specific group. In theory, data points in the same group should have similar attributes and characteristics, while data points in different groups should differ markedly. Clustering is a method of unsupervised learning and a common technique in statistical data analysis across many fields. Cluster analysis is different from classification. Classification assigns elements to predefined types according to known rules, whereas in cluster analysis the groups are not known in advance. Different clustering methods may produce different groupings, and the same method applied to different variables may also produce different clusters. Commonly used clustering methods include K-means clustering, hierarchical clustering, and two-step (second-order) clustering.

Cluster Analysis Process

[Figure: Cluster analysis process. Source: CD Genomics]

Applications of Cluster Analysis in Biology

As an important and commonly used data mining technique, cluster analysis is applied in many fields. In the life sciences, it is used to analyze biological data such as sequencing data, experimental results, and statistical data. It can help researchers study the properties and functions of genes and proteins. Typical uses include:

  • Classify sets of gene sequences effectively and identify functional genes.
  • Cluster proteins by the physicochemical properties derived from their sequences to predict protein function.
  • Infer the classification of plants and animals, infer the phylogenetic tree of a species, and gain an understanding of the inherent structure of a population.

What We Offer

CD Genomics provides different types of cluster analysis services to help you cluster gene expression data, cluster protein sequences, and construct phylogenetic trees, so that you can understand the functions of related proteins and genes and interpret the biological significance of gene sequences.

K-means clustering: also known as fast clustering. It selects an initial set of cluster centers according to a chosen method and assigns each case to its nearest center to form an initial classification. It then recomputes the centers and reassigns cases by the nearest-distance principle until the classification no longer changes.
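The K-means procedure can be illustrated with a minimal Python sketch. This is not the CD Genomics pipeline; the function name `kmeans`, the random choice of initial centers, and the six sample expression values are all assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # Initial centers: k data points picked at random (one of several possible schemes).
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each case joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # no case changed cluster, so the grouping is stable
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Hypothetical expression values for six genes under two conditions.
data = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centers = kmeans(data, k=2)
print(labels)   # cluster index assigned to each gene
print(centers)  # final cluster centers
```

The number of clusters `k` must be chosen in advance, and the result can depend on the initial centers, which is why the sketch fixes a random seed.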


Hierarchical clustering: also called systematic clustering. First, each case (or variable) participating in the clustering is regarded as its own category. The categories are then merged step by step according to the distance or similarity between them, until all cases (or variables) are merged into one large category. The results of hierarchical clustering show both the clustering process and the classification of each case. After hierarchical clustering, the mean values of each category should be tabulated to understand the characteristics of each category.
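The merging process described above can be sketched in a few lines of Python. This is an illustrative example only: the function name `agglomerative`, the choice of average linkage with Euclidean distance, and the sample data are assumptions, and other linkage rules or distance measures may be more suitable for a given data set.

```python
import math


def agglomerative(points, n_clusters=1):
    """Average-linkage agglomerative clustering on a list of tuples.

    Returns the final clusters (lists of point indices) and the merge history.
    """
    # Start with every case in its own cluster.
    clusters = [[i] for i in range(len(points))]
    history = []

    def avg_dist(a, b):
        # Average pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        d, x, y = min(
            (avg_dist(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        history.append((clusters[x], clusters[y], d))
        merged = clusters[x] + clusters[y]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (x, y)] + [merged]
    return clusters, history


# Hypothetical expression values for six genes under two conditions.
data = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
clusters, history = agglomerative(data, n_clusters=2)

# Mean profile of each cluster, used to characterize the categories.
means = [
    tuple(sum(col) / len(c) for col in zip(*(data[i] for i in c))) for c in clusters
]
print(clusters)
print(means)
```

The `history` list records which clusters were merged at each step and at what distance, which is the information a dendrogram displays. Calling the function with `n_clusters=1` continues merging until a single category remains.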


For clustering analysis, in addition to heat maps, bar charts, and pie charts, we also provide other up-to-date display methods. Our high-quality cluster analysis result charts allow you to quickly understand the clusters of proteins or genes and meet your needs for publishing articles. CD Genomics provides one-stop, mature, cost-effective clustering analysis services with fast turnaround to speed up your research.

Advantages of CD Genomics

  • A professional data analysis team with rich project experience.
  • Help customers evaluate and filter data, formulate the best analysis plan, and perform the data analysis.
  • Develop a personalized analysis plan, or a personalized chart summarizing the results, according to each customer's analysis needs.
  • Provide a complete interactive data analysis report, including all analysis methods and results.
  • Follow-up after the report: we provide professional interpretation of the analysis report and biological interpretation of the analysis results.
  • Fast turnaround time to speed up your research process.

Our Service Process

  • Data transmission

  • Analysis plan

  • Data analysis

  • Provide analysis report

  • After-sales Q&A

Biomedical-Bioinformatics, a division of CD Genomics, provides a one-stop clustering analysis service according to customers' requirements. If you don't have data for clustering analysis, CD Genomics can also provide different types of sequencing services or download relevant data from existing open databases. If you have any questions about the data analysis content, turnaround time, or price, please feel free to contact us. We have a professional technical support team to provide you with the best services, and we look forward to working with you!

* For research use only. Not for use in clinical diagnosis or treatment of humans or animals.

Online Inquiry

Please submit a detailed description of your project. Our industry-leading scientists will review the information provided as soon as possible. You can also send inquiries directly by email.