Species diversity is a measure of biological diversity within a specific ecological community. It combines three ecological concepts: species richness, species abundance, and species evenness. Species richness is the number of species present in a particular area, while species abundance is the number of individuals of each species in that area. Species evenness describes how equally individuals are distributed among the species present. Species diversity is important in evaluating the health of an ecosystem, since a diverse and balanced set of species helps maintain the ecosystem's balance. A more diverse ecosystem tends to be more productive and better able to withstand environmental stresses. Likewise, an ecosystem with greater species richness tends to be more productive, which makes it more stable and sustainable and better able to respond to a wider range of catastrophes.
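As an illustration of how richness, abundance, and evenness relate, the following minimal Python sketch computes all three from a list of observed individuals. The text does not prescribe an evenness formula; Pielou's evenness (the Shannon index divided by the natural log of richness) is used here as an assumption, and the function name `diversity_summary` is hypothetical.

```python
import math
from collections import Counter


def diversity_summary(individuals):
    """Summarize a community given one species label per observed individual.

    Returns (richness, abundance, evenness). Evenness is Pielou's J',
    which is an assumed choice; other evenness indices exist.
    """
    abundance = Counter(individuals)      # individuals per species
    richness = len(abundance)             # number of distinct species
    total = sum(abundance.values())

    # Shannon index H' = -sum(p_i * ln(p_i)) over species proportions p_i
    shannon = -sum((n / total) * math.log(n / total)
                   for n in abundance.values())

    # Pielou's evenness J' = H' / ln(S); undefined when fewer than 2 species
    evenness = shannon / math.log(richness) if richness > 1 else float("nan")
    return richness, dict(abundance), evenness
```

A community in which every species has the same number of individuals gives an evenness of 1, while a community dominated by one species gives a value closer to 0.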
Two important measures of biodiversity with regard to spatial scale are alpha and beta diversity. Alpha diversity is the average species diversity within a particular area or habitat and is also called local diversity. In microbiome studies, it is a measure of diversity that applies to a single sample. Beta diversity, on the other hand, is the ratio between regional diversity and alpha (local) diversity. It describes the difference in species composition between two habitats, that is, the similarity or dissimilarity of two regions. Alpha diversity gives an overview of the structure of an ecological community with respect to its species richness, species evenness, or both. In microbial ecology, a common first step in assessing the difference between environments is to analyze the alpha diversity of amplicon sequencing data.
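One common way to make the ratio concrete is Whittaker's multiplicative formulation, in which beta diversity equals regional (gamma) diversity divided by mean alpha diversity. The sketch below assumes species richness as the diversity measure and represents each habitat as a set of species names; the function name `alpha_beta_gamma` is hypothetical.

```python
def alpha_beta_gamma(sites):
    """Compute mean alpha, Whittaker's beta, and gamma diversity.

    sites: list of sets, each holding the species observed at one site.
    Richness is used as the diversity measure (an assumption).
    """
    if not sites or any(len(s) == 0 for s in sites):
        raise ValueError("every site must contain at least one species")

    # Alpha: average richness across the individual sites
    mean_alpha = sum(len(s) for s in sites) / len(sites)

    # Gamma: richness of the pooled region (union of all sites)
    gamma = len(set().union(*sites))

    # Whittaker's beta: how many times richer the region is than a site
    beta = gamma / mean_alpha
    return mean_alpha, beta, gamma
```

For two sites containing species {a, b} and {b, c}, mean alpha is 2, gamma is 3, and beta is 1.5. Two sites with identical species lists give a beta of 1, the minimum under this formulation.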
Rarefaction is the statistical technique used to estimate species richness from the results of sampling. It is often applied to the analysis of operational taxonomic units (OTUs) and is very useful in pollution and evolutionary ecology. Rarefaction can be used to determine whether a sample has been sequenced deeply enough to represent the community it was taken from. It can also be used to infer whether a group of samples comes from the same community.
Figure 1. An example of sample-based rarefaction curves. (Boussarie, 2018)
Rarefaction involves selecting a threshold number of reads that is equal to or less than the number of reads in the smallest sample, and then randomly discarding reads from the larger samples until the number of reads remaining in each sample equals the threshold. It is usually done by subsampling without replacement, which means that each read selected and assigned to the normalized sample is removed from the original pool of reads and cannot be drawn again. The retained data therefore remain count data, which allows them to be used in further analyses with other statistical tools.
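The subsampling procedure described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes each sample is a dictionary mapping a taxon to its read count, uses the size of the smallest sample as the threshold, and the names `rarefy` and `rarefy_all` are hypothetical.

```python
import random


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from one sample.

    counts: dict mapping taxon -> read count.
    Returns a dict of the same taxa with subsampled (integer) counts.
    """
    total = sum(counts.values())
    if depth > total:
        raise ValueError("depth exceeds the number of reads in the sample")

    # Expand the counts into one label per read, then draw without replacement
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    kept = rng.sample(reads, depth)

    rarefied = {taxon: 0 for taxon in counts}
    for taxon in kept:
        rarefied[taxon] += 1
    return rarefied


def rarefy_all(samples, seed=None):
    """Rarefy every sample to the read depth of the smallest sample."""
    rng = random.Random(seed)
    depth = min(sum(c.values()) for c in samples.values())
    return {name: rarefy(c, depth, rng) for name, c in samples.items()}
```

Because reads are drawn without replacement, no taxon can end up with more reads than it had originally, and the smallest sample is returned unchanged. Expanding counts into individual reads is simple but memory-hungry for deep sequencing; it is used here only for clarity.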
The calculation of species richness for a given number of samples is based on the rarefaction curve, a plot of the number of species against the number of samples. The curve is created by randomly resampling the pool of N samples several times and then plotting the average number of species found at each sample size. Generally, it rises rapidly at first (as the most common species are found) and then gradually flattens (as only the rarest species remain to be sampled).
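The construction of a sample-based rarefaction curve can be sketched as below. This is a minimal version under stated assumptions: each sampling unit is a set of species names, resampling is done by randomly reordering the N samples, and the function name `rarefaction_curve` and the default of 100 permutations are hypothetical choices.

```python
import random


def rarefaction_curve(samples, n_permutations=100, seed=None):
    """Average species accumulation over random orderings of the samples.

    samples: list of sets, each holding the species found in one sample.
    Returns a list whose element k-1 is the mean number of species
    observed after pooling k randomly chosen samples.
    """
    rng = random.Random(seed)
    n = len(samples)
    totals = [0] * n

    for _ in range(n_permutations):
        order = list(samples)
        rng.shuffle(order)            # one random ordering of the N samples
        seen = set()
        for k, species in enumerate(order):
            seen |= species           # pool species found so far
            totals[k] += len(seen)    # cumulative richness at size k + 1

    return [t / n_permutations for t in totals]
```

Plotting the returned values against 1, 2, ..., N gives the curve. The last point always equals the total number of species across all samples, and the curve never decreases, since pooling more samples can only add species.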