Immune repertoire sequencing (IR-seq) is the core technology for analyzing the diversity of T cell receptors (TCRs) and B cell receptors (BCRs) and for revealing immune response mechanisms. The choice of sequencing platform directly determines the resolution, throughput, and application scenarios of the resulting data. At present, mainstream immune repertoire sequencing platforms fall into three categories: short-read platforms, long-read platforms, and single-cell integrated sequencing platforms.
The article discusses three major immune repertoire sequencing platforms (short-read, long-read, single-cell integrated), their principles, features, applications, comparative analysis, and selection guidelines.
The short-read sequencing platform, centered on the Illumina series, is the most widely used technical system in immune repertoire sequencing. With the advantages of high throughput, high accuracy, and low cost, it efficiently captures TCR/BCR variable region sequences and is especially suitable for large-scale immune diversity screening and dynamic tracking. Although the limited read length prevents the full-length receptor sequence from being obtained directly, optimized experimental and analysis strategies keep it the first-choice tool for population-level immune repertoire studies.
The experimental design of immune repertoire sequencing on short-read and long-read platforms revolves around targeted amplification, library construction, and high-throughput sequencing. The key steps focus on specifically capturing variable region sequences and reducing amplification bias:
A. Targeted amplification strategy
a) Multiplex PCR primers are designed against the conserved sequences of the V and J regions of the TCR (α/β/γ/δ chains) and BCR (heavy/light chains), so that the V-D-J fragment containing the CDR3 region (a V-J or V-D-J fragment for BCR) can be amplified in a single PCR.
b) To avoid subtype amplification bias caused by primer competition, the primer concentration ratio must be optimized (usually 10-20 μM for each V/J primer) and the number of PCR cycles controlled (18-25 cycles) to reduce over-amplification of high-abundance clones.
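The reason cycle number matters can be illustrated with a short calculation. The sketch below assumes a simple exponential model in which each cycle multiplies the copy number by (1 + efficiency); the efficiency values are hypothetical and chosen only to show how a small per-cycle difference between two primer pairs compounds over cycles.

```python
def amplification_bias(eff_a: float, eff_b: float, cycles: int) -> float:
    """Fold over-representation of template A relative to template B.

    Assumes an idealized exponential PCR model: each cycle multiplies
    the copy number by (1 + efficiency). Real reactions plateau, so this
    is only an illustration of how bias grows with cycle number.
    """
    return ((1 + eff_a) / (1 + eff_b)) ** cycles


if __name__ == "__main__":
    # Hypothetical per-cycle efficiencies for two V/J primer pairs.
    for n in (18, 25, 35):
        print(n, "cycles ->", round(amplification_bias(0.95, 0.90, n), 2), "fold")
```

Under this model the relative bias grows geometrically with cycle number, which is why keeping cycles within a moderate range limits distortion of clone abundances.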
B. Library construction
a) Illumina platform-specific adapters (P5/P7) and a sample barcode (6-8 bp) are added to the amplified products. The barcode allows 12-96 samples to be pooled in one sequencing run, which significantly reduces the cost per sample. The library fragment size should be controlled at 200-500 bp.
b) After the fragment distribution is verified on an Agilent Bioanalyzer, the library is loaded for sequencing at a concentration of 8-10 pM. The NovaSeq platform can generate 600-1200 GB of data in a single run, which is enough to support immune repertoire analysis of hundreds of samples.
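Pooled sequencing relies on assigning each read back to its sample by barcode. The following is a minimal sketch of that idea, assuming each read arrives as an (index sequence, read sequence) pair and that only exact barcode matches are accepted; the function name, barcodes, and sample names are hypothetical, and production demultiplexing tools also tolerate mismatches.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group read sequences by sample using exact barcode matching.

    reads: iterable of (index_sequence, read_sequence) pairs.
    barcode_to_sample: dict mapping barcode -> sample name.
    Reads with an unknown barcode are collected under "undetermined".
    """
    by_sample = defaultdict(list)
    for index_seq, read_seq in reads:
        sample = barcode_to_sample.get(index_seq, "undetermined")
        by_sample[sample].append(read_seq)
    return dict(by_sample)


if __name__ == "__main__":
    # Hypothetical 6 bp barcodes and toy reads.
    sample_sheet = {"ACGTAC": "sample_1", "TTGCAA": "sample_2"}
    reads = [("ACGTAC", "TGTGCCAGC"), ("TTGCAA", "TGTGCAAGC"), ("GGGGGG", "AAAA")]
    for sample, seqs in demultiplex(reads, sample_sheet).items():
        print(sample, len(seqs))
```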
The structure of antigen-specific lymphocyte receptors and the generation of diversity (Liu et al., 2021)
A. Data processing
a) The raw data first undergo quality control with FastQC (reads with a Phred quality value below 20 are filtered out) and adapter removal with Cutadapt. V(D)J gene assignment and CDR3 region identification are then carried out with tools such as MiXCR or IMGT/HighV-QUEST.
b) By aligning reads against the V/J gene sequences in the IMGT database, researchers can determine the V, D, and J subtypes of each read and locate the start and end positions of the CDR3 region.
c) Finally, clonotypes are defined through sequence clustering (usually with CDR3 nucleotide sequence identity ≥97% as the threshold), and the diversity indices (Shannon index, Simpson index) and clone abundance distribution are calculated.
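The clustering and diversity steps can be sketched as follows. This is a simplified illustration, not the algorithm used by MiXCR or IMGT/HighV-QUEST: it assumes identity is computed position by position between CDR3 sequences of equal length (no gaps), merges each sequence greedily into the first sufficiently similar, more abundant representative, and reports the Simpson index as the sum of squared clone frequencies. All function names are hypothetical.

```python
import math
from collections import Counter


def identity(a: str, b: str) -> float:
    """Fraction of matching positions; 0.0 if lengths differ or input is empty."""
    if not a or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_clonotypes(cdr3_seqs, threshold=0.97):
    """Greedy clustering: abundant sequences become representatives first."""
    clusters = {}  # representative CDR3 -> total read count
    for seq, n in Counter(cdr3_seqs).most_common():
        for rep in clusters:
            if identity(seq, rep) >= threshold:
                clusters[rep] += n
                break
        else:  # no representative was similar enough
            clusters[seq] = n
    return clusters


def shannon_index(counts) -> float:
    """Shannon entropy H = -sum(p * ln p) over clone frequencies."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_index(counts) -> float:
    """Simpson index D = sum(p^2); higher values mean lower diversity."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


if __name__ == "__main__":
    # Toy 36-nt sequences (hypothetical): b differs from a at one position.
    a = "ACGT" * 9
    b = a[:-1] + "A"
    c = "T" * 36
    clusters = cluster_clonotypes([a] * 5 + [b] * 2 + [c] * 3)
    print(clusters)
    print(shannon_index(clusters.values()), simpson_index(clusters.values()))
```

With a 36-nt CDR3, one mismatch gives about 97.2% identity and is merged, whereas two mismatches (about 94.4%) would form a separate clonotype.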
B. Technical limitations and optimization direction
a) The core bottleneck of the short-read platform is its limited read length, which makes it impossible to directly resolve the complete V(D)J recombination sequence:
Short-read tool performance across different thresholds of intron persistence (David et al., 2022)
The long-read sequencing platform is the key breakthrough for overcoming the bottleneck of short-read technology. With ultra-long read lengths (10 kb to tens of kb), PacBio and Oxford Nanopore can directly capture the full-length sequence of the TCR/BCR variable region (including complete V, D, and J segments and the CDR1-CDR2-CDR3 regions), making them specialized tools for in-depth analysis of receptor structure in immune repertoire sequencing. Their technical value lies in removing the dependence on sequence assembly and enabling accurate analysis of immune receptor recombination patterns and somatic mutations, which makes them especially suitable for studying B-cell repertoires and rare clones.
The adoption of long-read platforms is limited by low throughput and high cost:
To break through these limitations, the technical development directions in recent years include:
Overview of long-read analysis tools and pipelines (Amarasinghe et al., 2019)
Traditional TCR-seq and BCR-seq are mostly based on bulk samples, which can only analyze receptor diversity at the population level and cannot link the clonotype to the functional state of a single cell. The single-cell integrated sequencing platform uses microfluidics and Cell Barcode technology to capture the TCR/BCR sequence, transcriptome, and protein expression information of single cells simultaneously. It thereby links clonotype, cell phenotype, and immune function precisely, providing a breakthrough tool for revealing the immune response mechanisms of T cells and B cells.
The core of the single-cell integrated platform is to match multi-omics data through the Cell Barcode. Taking the 10x Genomics platform as an example, the experimental process is divided into four key steps:
Flow cytometric analysis of the effects of MnBuOE and irradiation on T cell populations (Noh et al., 2024)
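The Cell Barcode matching described above amounts to a join between receptor records and expression records that share the same barcode. The sketch below assumes two already-parsed inputs, a list of V(D)J records and a per-barcode expression dictionary; the field names, barcodes, and expression values are hypothetical and do not reflect any specific vendor output format.

```python
def link_clonotype_to_phenotype(vdj_records, expression_by_barcode):
    """Join receptor and expression data on the shared cell barcode.

    vdj_records: list of dicts with keys "barcode" and "cdr3".
    expression_by_barcode: dict mapping barcode -> {gene: expression value}.
    Cells missing from either data set are skipped.
    """
    linked = {}
    for record in vdj_records:
        barcode = record["barcode"]
        if barcode in expression_by_barcode:
            linked[barcode] = {
                "cdr3": record["cdr3"],
                "expression": expression_by_barcode[barcode],
            }
    return linked


if __name__ == "__main__":
    # Toy inputs with made-up barcodes and expression values.
    vdj = [
        {"barcode": "AAACCTGA-1", "cdr3": "CASSQDRGDTQYF"},
        {"barcode": "TTTGGTCA-1", "cdr3": "CASSLGQGNTEAFF"},
    ]
    expression = {"AAACCTGA-1": {"GZMB": 5.2, "PRF1": 3.9}}
    print(link_clonotype_to_phenotype(vdj, expression))
```

Only cells present in both data sets end up in the result, which mirrors the requirement that a barcode must be recovered in every modality before clonotype and phenotype can be associated.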
The greatest value of a single-cell integrated platform lies in revealing the direct relationship between clonotype and function, and its application scenarios fundamentally extend the research boundaries of traditional immune repertoire sequencing:
A. Tracing the origin of pathological clones of autoimmune diseases
a) In an analysis of PBMCs from patients with systemic lupus erythematosus (SLE), a population of B cells carrying the BCR clone IGHV3-23+IGKJ1+ with a CD19+CD27-IgG+ phenotype was identified on the BD Rhapsody platform. The clone expressed IFN-α-inducible genes (such as IFIT1), and its abundance was positively correlated with disease activity (SLEDAI score).
b) Further experiments showed that the clone could secrete anti-dsDNA autoantibodies, which provided direct evidence for the pathological mechanism of SLE, whereas traditional BCR-seq cannot resolve the phenotype (CD27-) or function (IFN-α response) of the clone.
B. Clonal tracing of memory cells in infection immunity
a) In an analysis of PBMCs from individuals who had recovered from COVID-19, a population of CD8+CD45RO+CCR7- effector memory T cells carrying the TRB clone CASSQDRGDTQYF was traced on the 10x Genomics platform. The clone remained at high abundance (1.5%) and highly expressed antiviral genes (such as GZMB and PRF1) 6 months after recovery.
The technical characteristics and application value of the three types of IR-seq platforms differ significantly, and a multi-dimensional comparison of their core performance indicators is the key basis for platform selection. The following is a systematic comparison across seven dimensions (throughput, resolution, accuracy, cost, sample compatibility, core application, and data analysis complexity), together with an analysis of their complementarity:
Comparison of the three major platforms
| Performance Metrics | Short-Read Platform (Illumina) | Long-Read Platform (PacBio/ONT) | Single-Cell Integrated Platform (10x Genomics) |
|---|---|---|---|
| Sequencing Throughput | Extremely high (millions–billions of reads/sample) | Medium-low (tens of thousands–millions of reads/sample) | Medium (500–10,000 cells/sample) |
| Immune Repertoire Resolution | Clone-level (cannot link to cell function) | Full-length receptor-level (cannot link to cell function) | Single-cell–clone–function level (highest resolution) |
| Sequence Accuracy | High | Medium-high | High |
| Cost per Sample | Low | High | Extremely high |
| Sample Compatibility | Wide (fresh/frozen nucleic acids, tissues) | Wide (fresh/frozen nucleic acids, small sample sizes) | Limited (fresh viable cells, sufficient quantity required) |
| Core Application | Large-scale screening, population dynamics tracking | Full-length receptor analysis, SHM analysis | Clone-function association, precise mechanism research |
| Data Analysis Complexity | Low (standard V(D)J analysis tools) | Medium (requires error correction and full-length filtering) | High (multi-omics data integration) |
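As a rough illustration of how the table can guide selection, the sketch below encodes two of its rows as a simple decision rule. The function and its parameters are hypothetical, and real platform choice also depends on budget, sample type, and cell viability.

```python
def suggest_platform(need_cell_function_link: bool, need_full_length: bool) -> str:
    """Map two study requirements onto the platform categories in the table.

    A simplified rule of thumb derived from the "Immune Repertoire Resolution"
    and "Core Application" rows; it ignores cost and sample constraints.
    """
    if need_cell_function_link:
        return "single-cell integrated platform"
    if need_full_length:
        return "long-read platform"
    return "short-read platform"


if __name__ == "__main__":
    print(suggest_platform(need_cell_function_link=False, need_full_length=False))
    print(suggest_platform(need_cell_function_link=False, need_full_length=True))
    print(suggest_platform(need_cell_function_link=True, need_full_length=False))
```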
The three types of platforms are not substitutes, but can achieve comprehensive coverage of "breadth-depth-accuracy" through joint use. Typical joint strategies include:
The ALLPATHS assembly of S. aureus (Maccallum et al., 2009)
Short-read, long-read, and single-cell integrated immune repertoire sequencing have achieved breakthroughs in throughput, read length, and resolution, respectively, and together form a complete system that spans population screening to single-cell mechanism research. Among them:
The synergistic application of the three moves immune repertoire research from describing immune diversity toward analyzing the functional mechanisms of clones and mining clinical markers.
In the future, with upgrades in sequencing technology and the optimization of bioinformatics tools (AI-driven V(D)J assignment algorithms and multi-platform integration software), immune repertoire sequencing will play a greater role in immune research and clinical application. In basic research, it can help explore the maintenance of immune memory and the pathological mechanisms of autoimmune diseases. In clinical application, it can support the development of disease diagnosis markers and immunotherapy prediction models, providing technical support for the prevention and control of immune diseases.
CD Genomics provides comprehensive, end-to-end immune repertoire sequencing services, leveraging Illumina short-read and PacBio/ONT long-read platforms. Contact us to discuss your project and learn how our tailored immune repertoire sequencing services can advance your research.
References