Principal Component Analysis


Drawing on years of experience in statistical data analysis, CD Genomics provides a one-stop data analysis service for clinicians and scientific researchers. Our principal component analysis service identifies the main influencing factors among many candidate factors and reveals the underlying structure of the data. For example, principal component analysis can be used to find the main factors associated with the occurrence and development of a disease.


Principal Component Analysis

In medical statistics, especially in clinical studies, the observations recorded for each subject often contain multiple response variables. For example, blood pressure records include systolic blood pressure, diastolic blood pressure, pulse pressure, etc. Such data with multiple variables is called multivariate data. Principal component analysis (PCA) is one of the methods of multivariate analysis; other common methods include multivariate analysis of variance (MANOVA), factor analysis, canonical correlation analysis, and cluster analysis. PCA is a statistical method for identifying the dominant sources of variation in a data set. It extracts the main influencing factors from many correlated variables, reveals the underlying structure of the data, and simplifies complex problems. The purpose of calculating principal components is to project high-dimensional data into a lower-dimensional space while retaining as much of the variance as possible.

The Relationship Between Principal Components and Original Variables

PCA transforms the original variables into a smaller set of new variables (the principal components), each of which is a linear combination of the original variables. This preserves most of the information while achieving simplification and dimensionality reduction. The relationship between the principal components and the original variables is as follows:

  1. The principal component is a linear combination of the original variables.
  2. The number of principal components retained is smaller than the number of original variables.
  3. The principal components retain most of the information of the original variables.
  4. The principal components are uncorrelated with each other.
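These properties can be checked numerically. The following is a minimal sketch using NumPy on synthetic data; the data, the function name `principal_components`, and the variable names are assumptions made for illustration and do not describe any specific CD Genomics workflow.

```python
import numpy as np

def principal_components(X):
    """Covariance-based PCA: return eigenvalues, loadings and scores for the columns of X."""
    Xc = X - X.mean(axis=0)                      # centre each variable
    cov = np.cov(Xc, rowvar=False)               # p x p covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh handles symmetric matrices, ascending order
    order = np.argsort(eigvals)[::-1]            # reorder by descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                        # each score column is a linear combination of the variables
    return eigvals, eigvecs, scores

# Hypothetical data: 200 subjects, 3 correlated measurements sharing one latent factor
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
X = np.hstack([base + 0.3 * rng.normal(size=(200, 1)) for _ in range(3)])

eigvals, loadings, scores = principal_components(X)
score_cov = np.cov(scores, rowvar=False)         # off-diagonal entries are (numerically) zero
explained = eigvals / eigvals.sum()              # share of total variance carried by each component

print("loadings (columns = components):\n", loadings)
print("covariance of the scores:\n", np.round(score_cov, 6))
print("explained variance ratio:", np.round(explained, 3))
```

In this sketch the covariance matrix of the scores is diagonal, which is what "uncorrelated" means here, and the explained variance ratio shows how few components are needed to retain most of the information.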

Advantages of Principal Component Analysis

  • The raw data is not required to be normally distributed. PCA rotates the coordinate axes toward the directions in which the data is most dispersed (has the largest variance), and this property broadens its range of application.
  • By synthesizing and simplifying the original variables, the weight of each indicator can be determined objectively, avoiding the arbitrariness of subjective judgment.
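The "direction of largest dispersion" in the first point can be stated as an optimization problem. As a sketch, assume the variables are centred, and write $x$ for the vector of original variables, $\Sigma$ for their covariance matrix, and $w$ for a vector of weights (these symbols are introduced here for illustration and do not appear in the original text). The first principal component direction is then:

```latex
w_1 = \arg\max_{\lVert w \rVert = 1} \operatorname{Var}\!\left(w^{\top} x\right)
    = \arg\max_{\lVert w \rVert = 1} w^{\top} \Sigma \, w ,
\qquad
\Sigma \, w_1 = \lambda_1 w_1 ,
\qquad
\operatorname{Var}\!\left(w_1^{\top} x\right) = \lambda_1 .
```

Here $\lambda_1$ is the largest eigenvalue of $\Sigma$. The entries of $w_1$ are the weights of the individual indicators, so they are fixed by the data rather than chosen by the analyst, and nothing in this formulation relies on the data being normally distributed.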

Principal Component Analysis Process

Fig 1. Flow chart of principal component analysis.

  1. First perform suitability tests, such as the Kaiser-Meyer-Olkin (KMO) test and Bartlett's test of sphericity, to determine whether the data is suitable for principal component analysis.
  2. Select initial variables, unify the dimensions, and standardize the data.
  3. Choose whether to use the covariance matrix or the correlation matrix to find the principal components according to the characteristics of the initial variables.
  4. Calculate the eigenvalues and eigenvectors of the covariance matrix or correlation matrix.
  5. Determine the number of principal components and extract the principal components.
  6. Interpret the principal components: the meaning of each principal component is determined by the indicators with the largest weights (loadings) in its linear combination.
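The steps above can be sketched in a few lines of NumPy. This is an illustrative outline under stated assumptions, not the procedure CD Genomics uses: the synthetic data, the variable names, the function names, the choice of the correlation matrix, and the 85% cumulative-variance cutoff for choosing the number of components are all hypothetical. The Bartlett statistic is returned without a p-value; in practice it would be compared with a chi-square distribution on the returned degrees of freedom.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Step 1: Bartlett's statistic for H0 'the correlation matrix R is the identity'."""
    p = R.shape[0]
    stat = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    dof = p * (p - 1) // 2
    return stat, dof

def kmo_overall(R):
    """Step 1: overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)              # partial correlations between variable pairs
    off = ~np.eye(R.shape[0], dtype=bool)        # mask selecting off-diagonal entries
    r2 = np.sum(R[off] ** 2)
    a2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + a2)

def pca_correlation(X, min_cumulative=0.85):
    """Steps 2-5: standardize, eigendecompose the correlation matrix, keep k components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # step 2: unify scales
    R = np.cov(Z, rowvar=False)                  # step 3: covariance of standardized data = correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)         # step 4: eigenvalues and eigenvectors
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # step 5: smallest k whose cumulative explained variance reaches the (assumed) cutoff
    k = min(int(np.searchsorted(cumulative, min_cumulative)) + 1, len(eigvals))
    scores = Z @ eigvecs[:, :k]
    return R, eigvals, eigvecs, k, scores

# Hypothetical data: 300 subjects, 5 indicators driven by two latent factors
rng = np.random.default_rng(42)
n = 300
f1, f2 = rng.normal(size=n), rng.normal(size=n)
names = ["var_a", "var_b", "var_c", "var_d", "var_e"]   # placeholder indicator names
X = np.column_stack([f + 0.4 * rng.normal(size=n) for f in (f1, f1, f1, f2, f2)])

R, eigvals, loadings, k, scores = pca_correlation(X)
stat, dof = bartlett_sphericity(R, n)
kmo_value = kmo_overall(R)

print(f"Bartlett statistic = {stat:.1f} on {dof} degrees of freedom, KMO = {kmo_value:.2f}")
print(f"components retained: {k}")
# Step 6: interpret each retained component through its largest absolute loadings
for j in range(k):
    top = np.argsort(np.abs(loadings[:, j]))[::-1][:3]
    print(f"PC{j + 1}:", [(names[i], round(float(loadings[i, j]), 2)) for i in top])
```

Using the correlation matrix here reflects step 2, where the variables are standardized; when the variables share a common unit and their variances are meaningful, the covariance matrix of the centred data could be used instead.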

What We Offer

To fully cover the data analysis needs of clinicians and scientific researchers, CD Genomics provides one-stop principal component analysis services. We select the appropriate analysis matrix (the covariance matrix or the correlation matrix) based on the researcher's data and sample types in order to obtain more reasonable results. In addition to principal component analysis, we also provide data analysis services such as cluster analysis for multivariate data, comparative analysis of measurement data, regression analysis, and correlation analysis. If you have any questions about our data analysis services, please feel free to contact our professional technical support. We are always ready to provide you with satisfactory services.

Advantages of CD Genomics


* For research use only. Not for use in clinical diagnosis or treatment of humans or animals.

Online Inquiry

Please submit a detailed description of your project. Our industry-leading scientists will review the information provided as soon as possible. You can also send inquiries directly by email.