Microsatellites, also known as short tandem repeats (STRs) or simple sequence repeats (SSRs), are widespread in the genomes of organisms and serve as informative genetic markers. STR markers are abundant, cover the entire genome with a fairly uniform distribution, are highly polymorphic, show codominant inheritance, give reproducible results, and are easy to run. These properties make them a molecular marker technology widely used in agriculture and medicine.
This article reviews the theoretical basis, experimental procedures, data analysis methods, and applications of SSR technology across multiple domains. It also evaluates the technology's limitations, with the aim of providing a systematic reference for both genetic research and practical applications.
Genetic markers have always been a research hotspot in the field of genetics. Microsatellite DNA markers are favored by many scholars as ideal genetic markers due to their specific amplification, stability, good repeatability, and ability to reflect the genetic structure and diversity changes of species.
Microsatellites (simple sequence repeats, SSRs) are short tandem repeat sequences whose repeat unit is 1 to 6 nucleotides long, and they are widely distributed throughout eukaryotic genomes. SSR analysis is a molecular marker technique based on these repetitive sequences, offering high polymorphism, codominance, stability, reproducibility, and relatively low analytical cost. The method has been applied extensively in genetic diversity studies, population structure analyses, gene mapping, DNA fingerprinting, cultivar identification, and marker-assisted breeding programs.
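As a rough illustration of this definition, the following Python sketch scans a DNA sequence for perfect tandem repeats with motifs of 1 to 6 nucleotides. It is a minimal regular-expression approach, not the algorithm of any particular SSR mining tool. The function name `find_ssrs` and the minimum repeat thresholds in `MIN_REPEATS` are assumptions chosen for illustration.

```python
import re

# Minimum number of repeats required per motif length (1-6 nt).
# These thresholds are illustrative assumptions, not values from the article.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_reducible(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'ATAT')."""
    k = len(motif)
    return any(
        k % d == 0 and motif == motif[:d] * (k // d)
        for d in range(1, k)
    )


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return a list of (start, motif, repeat_count) for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for motif_len, n in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least n-1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, n - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_reducible(motif):
                continue  # already reported under the shorter motif length
            repeats = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, repeats))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGC" + "AT" * 8 + "CCGT" + "CAG" * 6 + "TTGA"
    for start, motif, repeats in find_ssrs(demo):
        print(f"position {start}: ({motif}){repeats}")
```

Only perfect repeats are reported here; compound or interrupted microsatellites would need additional logic.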
SSRs cover almost the entire genome and are abundant and evenly distributed, thus enabling their use in constructing high-density genetic maps. SSRs exhibit characteristics such as polymorphism, co-dominant inheritance, ease of detection, and broad applicability. They hold significant value in genetics, genomics, and biodiversity research due to their wide-ranging applications.
Polymorphism: SSRs exhibit high levels of polymorphism, which gives SSR markers exceptional resolution in genetic diversity analysis and kinship identification. This polymorphism arises primarily from variation in the number of repeat units, which is generated by strand slippage during replication and by unequal crossing-over and is sustained by the high mutation rate at these loci; nucleotide substitutions contribute as well. In different individuals or cultivars, the number of repeat units at the same SSR locus can vary considerably, resulting in length-based polymorphism. This pronounced variability has made SSR markers a pivotal molecular marker technology in genetics and genomics research.
Co-dominant inheritance: SSR markers enable the detection of allelic differences in heterozygotes, allowing for the discrimination between homozygous and heterozygous individuals, thereby providing investigators with more comprehensive genetic information. Primers for SSR markers are designed based on the conserved flanking sequences of microsatellites, ensuring specific amplification of the target repeat regions. The length of amplified products varies depending on the number of repeat units in different alleles. These fragments are then separated by electrophoresis according to size differences in PCR products, ultimately enabling the identification of distinct allelic variants.
Ease of detection: The SSR analysis workflow relies primarily on PCR amplification, making the entire process straightforward to implement. Because the sequences flanking a microsatellite are highly conserved, primer design is relatively simple. The primers can also carry fluorescent labels, which improves the sensitivity and accuracy of SSR markers and enables automated detection. The amplified SSR products can be analyzed on standard gel electrophoresis systems or by fluorescence-based detection with fragment analysis software. Results are quickly interpreted by comparing the sizes of the amplified fragments.
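The co-dominant, size-based readout described above can be sketched in a few lines of Python. This is a simplified illustration that assumes a diploid organism and fragment sizes already rounded to whole base pairs; the function name `call_genotype` and the example sizes are hypothetical.

```python
def call_genotype(fragment_sizes_bp):
    """Classify one locus of a diploid individual from its fragment sizes.

    fragment_sizes_bp: sizes (in bp) of the amplified fragments detected
    for a single SSR locus. Assumes sizes are already binned to integers.
    Returns ((allele_1, allele_2), zygosity).
    """
    alleles = sorted(set(fragment_sizes_bp))
    if len(alleles) == 1:
        # One fragment size: both chromosomes carry the same allele.
        return (alleles[0], alleles[0]), "homozygous"
    if len(alleles) == 2:
        # Two fragment sizes: both alleles are visible (co-dominance).
        return (alleles[0], alleles[1]), "heterozygous"
    raise ValueError("expected 1 or 2 distinct fragment sizes for a diploid locus")


if __name__ == "__main__":
    print(call_genotype([152]))       # single band or peak
    print(call_genotype([152, 160]))  # two bands or peaks
```

With a dominant marker the heterozygote in the second call could not be told apart from a homozygote; here both alleles are read directly from the fragment sizes.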
SSR analysis has been extensively utilized in genetic studies, germplasm characterization, breeding selection, and kinship analysis due to its defining characteristics: high polymorphism, co-dominant inheritance, abundance in genomic distribution, ease of detection, robust stability, and cost-effectiveness. These technical advantages position it as a cornerstone methodology across diverse domains of modern genetic research.
The key to microsatellite analysis is to design primers in the sequences flanking a microsatellite in the genome; the PCR products amplified with these primers are then analyzed by electrophoresis.
Although microsatellite DNA is distributed at different positions throughout the genome, the flanking sequences at both ends of a specific microsatellite are usually highly conserved single-copy sequences. The repeat and the DNA fragments on both sides of it are cloned and sequenced, and a pair of specific primers is then designed from the flanking sequences to amplify the target microsatellite fragment by PCR.
Because the number of repeat units at a given microsatellite locus differs among alleles, the amplified products differ in length. This is called simple sequence length polymorphism (SSLP), and each amplified fragment size represents one allele at that locus.
The polymorphism of SSR markers mainly depends on the variation in the number of basic unit repeats, which is abundant in biological populations. Therefore, SSR has a large number of allelic differences and rich polymorphism.
Principle of simple sequence repeat (SSR) markers (Yang et al., 2015)
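The relationship between amplicon length and repeat number that underlies SSLP can be written as amplicon length = flanking length + motif length × repeat count. The short Python sketch below applies this relationship. The function name `repeat_count` and all numeric values are hypothetical, and the calculation assumes a perfect repeat with no insertions or deletions in the flanking regions.

```python
def repeat_count(amplicon_bp, flank_bp, motif_len):
    """Infer the number of repeat units from an amplicon length.

    amplicon_bp: total PCR product length in bp.
    flank_bp:    combined length of both flanking regions (primers included).
    motif_len:   length of the repeat unit in bp (1-6 for SSRs).
    """
    repeat_region = amplicon_bp - flank_bp
    if repeat_region < 0 or repeat_region % motif_len != 0:
        raise ValueError("amplicon length is not consistent with a perfect repeat")
    return repeat_region // motif_len


if __name__ == "__main__":
    # Hypothetical (CA)n locus with 100 bp of flanking sequence in total.
    for size in (120, 124, 130):
        print(size, "bp ->", repeat_count(size, flank_bp=100, motif_len=2), "repeats")
```

Under these assumptions, alleles that differ by one repeat of a dinucleotide motif differ by 2 bp in amplicon length, which is why the choice of detection method depends on the resolution needed.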
Sample Collection: Select samples based on research objectives (e.g., different geographic populations, varieties).
Primer Selection: Use published species-specific SSR primers first, or develop new ones through genome sequencing.
DNA Extraction and Detection: Extract high-quality genomic DNA (using the CTAB method or kits), and check purity via agarose gel electrophoresis.
PCR Amplification: Amplify with fluorescent or regular primers, and optimize the annealing temperature (gradient test from 50 to 65 °C).
Electrophoresis Detection: Use agarose gel electrophoresis to preliminarily verify amplicon size, then PAGE or capillary electrophoresis to distinguish alleles at high resolution.
SSR molecular marker experimental process
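For the annealing temperature optimization step in the workflow above, a quick estimate can help decide where in the 50 to 65 °C range to look. The Python sketch below uses the Wallace rule (Tm ≈ 2 °C per A/T plus 4 °C per G/C), a rough approximation suited to short oligonucleotides, and lays out evenly spaced gradient temperatures. The function names, the example primer, and the assumption of a linear gradient across wells are illustrative only; real thermal cyclers may space their gradient differently.

```python
def wallace_tm(primer):
    """Estimate primer melting temperature (°C) with the Wallace rule.

    Tm ~= 2 * (A + T) + 4 * (G + C); a rough guide for short primers only.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def gradient_temps(low=50.0, high=65.0, wells=8):
    """Evenly spaced annealing temperatures for a gradient test."""
    if wells < 2:
        raise ValueError("a gradient needs at least two wells")
    step = (high - low) / (wells - 1)
    return [round(low + i * step, 1) for i in range(wells)]


if __name__ == "__main__":
    primer = "ACGTACGTACGTACGTAC"  # hypothetical 18-nt primer
    print("Estimated Tm:", wallace_tm(primer), "°C")
    print("Gradient:", gradient_temps(50.0, 65.0, wells=4))
```

The estimate only narrows the search; the annealing temperature actually used should still be chosen from the gradient PCR result.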
Generally, there are two methods for detecting SSR target bands. The first is polyacrylamide gel electrophoresis (PAGE), colloquially called "big board gel", which uses the high resolution (about 1 bp) of polyacrylamide gels to separate the fragments; after the run, the bands are visualized by silver staining. The second is capillary electrophoresis, also known as four-color fluorescent capillary electrophoresis. Here the primers carry fluorescent dye labels, so fragments labeled with different dyes are detected by their fluorescence and sized against an internal standard on a capillary sequencer. This method is reported to reach a resolution of up to 0.1 bp and offers higher efficiency and accuracy.
Capillary result graph of SSR analysis (Du et al., 2017)
Electrophoretic results of SSR analysis (Zhou et al., 2017)
SSR Mining Tools: MISA (MIcroSAtellite identification tool) is a widely used tool that quickly identifies and locates SSR sites and provides detailed reports; SSR Finder automatically identifies SSR sequences in genomes or transcriptomes; Tandem Repeats Finder focuses on identifying tandem repeat sequences and can handle large genome data sets.
Sequence Alignment and Annotation Tools: BLAST (Basic Local Alignment Search Tool) aligns SSR-containing sequences against known genome sequences to find highly similar sequences and to help infer the function of the genes or genomic regions in which the SSRs are located; UniProt, Gene Ontology, and other databases are used for functional annotation and provide rich gene function information.
Statistical Analysis Tools: R and Python can be used to write scripts for custom SSR analyses, such as frequency analysis, length distribution analysis, and genome distribution analysis.
Population Genetics Analysis Tools: Popgene, MEGA, PHYLIP, ARLEQUIN, used to analyze genetic diversity, population structure, and linkage disequilibrium of SSR data; Structure, used for population structure analysis and to infer population affiliation of individuals through SSR data analysis; Tassel, used for association analysis and population genetics analysis; SPAGeDi, used to analyze spatial genetic structure.
Data Format Conversion Software: DataTrans 1.0, an SSR data processing program developed with Microsoft VBA that can convert raw SSR data into input file formats for commonly used molecular population genetics analysis software; DataFormater, features a user-friendly graphical interface and can efficiently and accurately convert raw SSR data into input file formats for various molecular population genetics analysis software.
Data Visualization Tools: R's ggplot2 package, Python's matplotlib and seaborn libraries, used to plot frequency distribution, length distribution, and genome distribution charts for SSRs.
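As an example of the kind of custom script mentioned in the list above, the following Python sketch computes allele frequencies, observed and expected heterozygosity, and the polymorphism information content (PIC) for a single SSR locus. The formulas are the standard ones (expected heterozygosity as 1 − Σp², PIC as 1 − Σp² − ΣΣ 2p_i²p_j² over allele pairs); the function names and the genotype data are hypothetical, and dedicated population genetics packages should be preferred for real analyses.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """genotypes: list of (allele_1, allele_2) tuples for one locus."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared frequencies."""
    return 1 - sum(p * p for p in freqs.values())


def pic(freqs):
    """Polymorphism information content for one locus."""
    p = list(freqs.values())
    homozygosity = sum(x * x for x in p)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1 - homozygosity - cross


if __name__ == "__main__":
    # Hypothetical allele sizes (bp) for four diploid individuals.
    locus = [(120, 124), (120, 120), (124, 124), (120, 124)]
    freqs = allele_frequencies(locus)
    print("frequencies:", freqs)
    print("Ho:", observed_heterozygosity(locus))
    print("He:", expected_heterozygosity(freqs))
    print("PIC:", pic(freqs))
```

The per-locus values returned here are the inputs typically plotted with the visualization libraries listed above.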
Partial SSR data analysis results (Garino et al., 2014; Zhu et al., 2023; Ibrahimi et al., 2023; Shraddha et al., 2023)
The high polymorphism of SSR markers makes them widely applicable in tracking genetic variation, population evolution, and disease associations.
PCR-based SSR markers are well suited to germplasm identification, genetic diversity analysis, and marker-assisted breeding. They provide a reference for plant research, conservation, and breeding, and help protect the genetic diversity of rare or endangered plants. Genetic diversity determines a plant's environmental adaptability and breeding potential, so assessing it with SSR markers is important for crop improvement: it reveals genetic differences and diversity levels that inform both species conservation and breeding. Many studies have used SSR technology to evaluate genetic diversity in crops such as rice, wheat, corn, tomato, and cotton.
SSRs are distributed in both coding and non-coding regions, and approximately 3% of the human genome is composed of SSRs. Because core units and repeat numbers differ, the distribution of SSRs varies considerably across races, populations, and individuals. As genetic markers, SSRs have been widely used in genetic mapping, gene localization, forensic identification, anthropology, and the diagnosis of genetic diseases. SSR variation is also closely tied to human genetic disease: repeat expansions are the underlying cause of many disorders. The molecular mechanisms of repeat pathogenicity involve loss of function and gain of toxic function, and repeat number is linked to the severity of the disease phenotype. Over 40 diseases, mainly neurodegenerative ones, have phenotypes tied to SSR expansions. These include: Fragile X syndrome (FXS; FMR1), Huntington's disease (HD; HTT), Spinocerebellar ataxias, Hereditary cerebellar ataxias (RFC1; FXN, etc.), Myotonic dystrophy (DMPK; CNBP), Epilepsy with myoclonus (CSTB; SAMD12; STARD7, etc.).
SSR molecular markers are widely used in many fields due to their low cost and ease of operation. However, there are some challenges and limitations. The following is an analysis of the advantages and problems of SSR molecular markers:
SSR molecular markers have the following advantages: (1) short fragments, which can be amplified in multiplex reactions and thereby improve discriminating power; (2) a high degree of polymorphism, with multiple alleles per locus and strong discriminatory ability; (3) suitability for aged or degraded samples, since short targets remain detectable; (4) high sensitivity, which is conducive to the detection of trace samples; (5) detection methods that are easy to automate and standardize and that give good repeatability.
SSR molecular markers remain popular due to their ease of use and reliability. Any standard laboratory with a PCR machine and a gel imaging system can run them. With the progress in genome sequencing, whole-genome sequence information can now be obtained easily and accurately for many species, which offers a solid foundation for developing new SSR molecular markers.
Screening SSRs for polymorphic markers is inefficient, and polymorphic SSR loci can be scarce. Conventional SSR genotyping has low throughput and incomplete genome coverage, making it difficult to meet the needs of very large sample sets and large-scale breeding; this restricts the wider adoption of molecular markers and delays the translation of results. Moreover, many SSR markers are still developed from first-generation sequencing or by trial and error based on general SSR characteristics, which increases both workload and cost. For example, SSR analysis requires knowledge of the DNA sequences flanking the repeat motif. If these cannot be retrieved directly from a DNA database, the region must first be sequenced before primers can be designed, which makes development expensive. Integrating the latest high-throughput technologies into marker development is therefore needed to reduce these trial-and-error costs.