Full-Length Transcriptome Sequencing for the Analysis of Gene Isoforms

CD Genomics Blog

Posted on December 7, 2021

Transcriptome sequencing and the annotations it produces are important to researchers because they help us understand how the cellular machinery interprets genomic sequences and alterations to these sequences. They are also necessary for many functional analyses. Without proper transcriptome annotations, it is impossible to conduct RNA-seq research that explores differential gene expression or predicts which proteins are present in a tissue or organism.

Sequencing technology has advanced to the point where it may soon be possible to generate high-quality genome references and transcriptome annotations for a considerably wider range of organisms, including unicellular eukaryotes, at a lower cost than previously possible. While genomes will soon be assembled into chromosome-scale scaffolds with reasonable reliability, technological limitations still prevent transcriptome annotation techniques from comprehensively identifying the genes and isoforms expressed from these chromosomes.

High-quality ‘centromere-to-telomere’ genome sequences can now be built using a combination of technologies such as short-read sequencing, linked short-read sequencing, long-read sequencing, and optical mapping, and genome assembly is entering a golden age. These powerful and relatively inexpensive approaches will greatly benefit non-model organisms, ranging from unicellular eukaryotes to polar bears, that previously lacked the attention and large sums of money required to generate a high-quality genome reference the hard way (chromosome maps, Sanger sequencing of bacterial artificial chromosome libraries, and so on).

Most of these genome-assembly breakthroughs do not carry over to transcriptome annotation. Both short- and long-read sequencing techniques are now used for transcriptome annotation, but each has flaws that make achieving a "reference-level" transcriptome annotation time-consuming and often difficult.

Full-Length Transcriptome Sequencing Workflow for Isoforms

Isoform sequencing starts from 300 ng of total RNA. After poly-A RNA selection, the RNA is converted to cDNA in preparation for library construction. Up to 12 samples can be multiplexed for a simpler, more cost-effective workflow.

After the sequencing data are obtained, reads that are flanked by cDNA primers and contain a poly-A tail are selected as full-length reads. These reads can then be clustered by transcript isoform to produce a consensus sequence for each isoform.
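As a rough illustration of the full-length selection step, the Python sketch below keeps a read only if it starts with a 5' cDNA primer and ends with a poly-A tail followed by a 3' primer. The primer sequences, the minimum poly-A length, and the function name are placeholders made up for this example, not the values any particular kit or pipeline uses. Real tools also tolerate sequencing errors and handle reads in either orientation, which this exact-match sketch does not.

```python
import re

# Hypothetical primer sequences and threshold, chosen only for illustration.
FIVE_PRIME_PRIMER = "AAGCAGTGGTATCAACGCAGAGTAC"
THREE_PRIME_PRIMER = "GTACTCTGCGTTGATACCACTGCTT"
MIN_POLY_A = 20  # assumed minimum poly-A tail length


def trim_full_length(read, five=FIVE_PRIME_PRIMER, three=THREE_PRIME_PRIMER,
                     min_poly_a=MIN_POLY_A):
    """Return the transcript insert if the read looks full-length, else None.

    A read counts as full-length here when it has the 5' primer at its start,
    the 3' primer at its end, and a poly-A tail directly before the 3' primer.
    Matching is exact, so this is a simplification of what real pipelines do.
    """
    if not (read.startswith(five) and read.endswith(three)):
        return None
    # Strip both primers, then look for a poly-A run at the end of what is left.
    body = read[len(five):len(read) - len(three)]
    match = re.search(r"A{%d,}$" % min_poly_a, body)
    if match is None:
        return None
    insert = body[:match.start()]
    return insert or None


if __name__ == "__main__":
    transcript = "ATGGCCTTGACGT"
    reads = [
        FIVE_PRIME_PRIMER + transcript + "A" * 25 + THREE_PRIME_PRIMER,  # full-length
        FIVE_PRIME_PRIMER + transcript + "A" * 25,                       # missing 3' primer
        FIVE_PRIME_PRIMER + transcript + "A" * 5 + THREE_PRIME_PRIMER,   # poly-A too short
    ]
    full_length = [t for t in map(trim_full_length, reads) if t is not None]
    print(len(full_length), "of", len(reads), "reads kept as full-length")
```

Requiring all three signals is what separates complete transcripts from fragments: a read lacking either primer or the poly-A tail is likely a truncated molecule and is set aside before clustering.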

The isoforms can then be mapped to a reference genome, and tools like SQANTI and Maker can be used to annotate them in a reference-based way. For a de novo approach, Iso-Seq analysis from PacBio offers push-button bioinformatics workflows that process the isoform data without the need for a reference genome or annotation.
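Reference-based annotation largely comes down to comparing the exon structure of each mapped isoform with known transcripts. The Python sketch below is a simplified illustration of that idea and not the algorithm or API of SQANTI or Maker. It reduces each transcript to its chain of splice junctions and labels an isoform as a full match, a partial match, or novel. The category names, the data layout, and the coordinates are assumptions made for this example.

```python
def junction_chain(exons):
    """Convert (start, end) exons into a tuple of (donor, acceptor) introns."""
    exons = sorted(exons)
    return tuple((exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1))


def is_contiguous_subchain(sub, full):
    """True if the junctions in `sub` appear consecutively inside `full`."""
    n = len(sub)
    return any(full[i:i + n] == sub for i in range(len(full) - n + 1))


def classify_isoform(isoform_exons, reference):
    """Label an isoform by comparing its splice junctions with a reference.

    `reference` maps a transcript name to its list of (start, end) exons.
    Returns (category, matching_transcript_or_None). Transcript start and
    end positions are ignored; only the junctions are compared.
    """
    chain = junction_chain(isoform_exons)
    if not chain:
        return ("mono_exon", None)  # no junctions to compare
    for name, exons in reference.items():
        if junction_chain(exons) == chain:
            return ("full_match", name)
    for name, exons in reference.items():
        if is_contiguous_subchain(chain, junction_chain(exons)):
            return ("partial_match", name)
    return ("novel", None)


if __name__ == "__main__":
    # Made-up coordinates for a single three-exon reference transcript.
    reference = {"tx1": [(100, 200), (300, 400), (500, 600)]}
    print(classify_isoform([(120, 200), (300, 400), (500, 650)], reference))
    print(classify_isoform([(300, 400), (500, 600)], reference))
    print(classify_isoform([(100, 200), (500, 600)], reference))  # skips the middle exon
```

Comparing junctions rather than whole exons means small differences at transcript ends do not hide an otherwise identical splicing pattern, while a skipped exon shows up as a junction absent from the reference.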

Applications of Full-Length Transcript Sequencing

Genome annotation is the first application of full-length transcript sequences. This is particularly useful in plant and animal sciences, where genomes are often significantly more complex than the human genome and a reference-quality genome assembly may still be prohibitively expensive.

Full-length transcripts are also helping to characterize human genes that were previously difficult to identify because of segmental duplications. In one study, researchers looked at 19 gene families with long segmental duplications that are expressed in the brain and found that nearly half of the expressed gene duplicates had changed significantly from their ancestral models through novel transcription initiation, splicing, and polyadenylation sites.

Others have used full-length RNA sequencing in a similar way to compare alternative splicing and polyadenylation profiles between species.
