Introduction to non-coding RNAs
Non-coding RNAs (ncRNAs) were once regarded as transcriptional noise or byproducts of RNA processing, but increasing evidence suggests that a majority of them are biologically functional and regulate various cellular activities. By sequence length, ncRNAs are roughly classified into two categories: small ncRNAs (<200 nt) and long ncRNAs (200 nt or more). The categories of ncRNA are listed in Table 1.
Table 1. Overview of ncRNA (Fu 2014).
|Abbreviation||Full name||Function|
|rRNA||Ribosomal RNA||Translational machinery|
|tRNA||Transfer RNA||Amino acid carriers|
|snRNA||Small nuclear RNA||RNA processing|
|snoRNA||Small nucleolar RNA||RNA modifications|
|TR||Telomerase RNA||Chromosome end synthesis|
|miRNA||MicroRNA||RNA stability and translation control|
|endo-siRNA||Endogenous siRNA||RNA degradation|
|rasiRNA||Repeat-derived RNA||Transcriptional control|
|piRNA||Piwi-interacting RNA||Transposon silencing and mRNA decay|
|eRNA||Enhancer-derived RNA||Regulation of gene expression|
|PATs||Promoter-associated RNA||Transcription initiation and pause release|
|lncRNA||Long non-coding RNA||Imprinting, epigenetics, nuclear structure|
By function, the ncRNAs in Table 1 can be roughly divided into two classes: housekeeping ncRNAs and regulatory ncRNAs. Housekeeping ncRNAs, including rRNA, tRNA, snRNA, snoRNA, and TR, are considered “constitutive” because they are ubiquitously expressed in all cell types and provide functions essential to the organism. Regulatory ncRNAs, including miRNA, endo-siRNA, rasiRNA, piRNA, eRNA, PATs, and lncRNA, have received increasing attention from the research community because of their regulatory roles in gene expression, imprinting, and epigenetics. RNA-seq is a powerful technique for profiling ncRNA species. Here, we summarize the bioinformatics tools for analyzing ncRNAs with NGS data.
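The two classification schemes described above (by length and by function) can be written down as a small lookup. The following is only a minimal sketch: the function names and the constant are hypothetical, the sets simply restate the abbreviations from Table 1, and the 200-nucleotide threshold is the one given in the text.

```python
# Illustrative lookup built from Table 1; not part of any published tool.
HOUSEKEEPING = {"rRNA", "tRNA", "snRNA", "snoRNA", "TR"}
REGULATORY = {"miRNA", "endo-siRNA", "rasiRNA", "piRNA", "eRNA", "PATs", "lncRNA"}

# Length threshold (in nucleotides) separating small from long ncRNAs.
LONG_NCRNA_MIN_LENGTH = 200


def functional_class(ncrna_type: str) -> str:
    """Return 'housekeeping', 'regulatory', or 'unknown' for a Table 1 abbreviation."""
    if ncrna_type in HOUSEKEEPING:
        return "housekeeping"
    if ncrna_type in REGULATORY:
        return "regulatory"
    return "unknown"


def size_class(length: int) -> str:
    """Return 'small' for transcripts shorter than 200 nt, otherwise 'long'."""
    if length <= 0:
        raise ValueError("transcript length must be a positive integer")
    return "long" if length >= LONG_NCRNA_MIN_LENGTH else "small"
```

For example, `functional_class("piRNA")` returns `"regulatory"`, and `size_class(22)` returns `"small"`, as expected for a transcript of typical miRNA length.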
Figure 1. ncRNAs as integrated parts of gene network (Fu 2014).
Small ncRNA analysis
Small RNAs play a crucial role in transcriptional regulation, and characterizing them is essential for a complete picture of that regulation. Aberrant small RNA expression profiles are associated with cellular dysfunction and disease. Much research therefore focuses on the detection, prediction, or expression quantification of small RNAs, particularly miRNAs, to better understand human health and disease. The available computational tools for small RNA sequencing data are summarized in Table 2.
Table 2. Computational tools for small ncRNA analysis
|Tool||Description|
|DARIO||Quantify and annotate ncRNAs with access to several ncRNA public databases.|
|CPSS||Quantify and annotate ncRNAs, with special emphasis on miRNAs.|
|ncPRO-seq||Detect known small ncRNAs in an unbiased way and discover novel ncRNA species.|
|CoRAL||Divide small ncRNAs into functional categories based on biologically interpretable features other than sequence; annotate ncRNAs in less well-characterized organisms.|
|RNA-CODE||Combine secondary structure with de novo assembly; applicable to ncRNA annotation when no reference genome is available.|
|miRDeep||Detect both known and novel miRNAs in small RNA sequencing data.|
Circular RNA detection
CircRNAs are a novel type of RNA that forms a covalently closed continuous loop. Most are generated from exonic or intronic sequences, and their biogenesis requires RNA-binding proteins (RBPs) or reverse complementary sequences. CircRNAs are mostly conserved and function as miRNA sponges, regulators of splicing and transcription, or modifiers of parental gene expression. Increasing evidence suggests that circRNAs may be significant in human diseases such as atherosclerotic vascular disease, neurological disorders, and cancer. Among the available tools for circRNA detection, CIRI, CIRCexplorer, and KNIFE exhibit a balanced performance between precision and sensitivity. The available computational tools for circRNA sequencing data are summarized in Table 3.
Table 3. Computational tools for circular RNA detection.
|Tool||Detection strategy||Dependencies|
|CIRI||Segmented read-based||BWA, Perl|
|CIRCexplorer||Segmented read-based||STAR, bedtools, Python (pysam, docopt, Interval)|
|KNIFE||Candidate-based||Bowtie, Bowtie2, TopHat2, SAMtools, Perl|
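The balance between precision and sensitivity mentioned for the circRNA detection tools can be made concrete with a short calculation. The sketch below assumes that a tool's predicted circRNAs have already been compared against a validated set, giving counts of true positives, false positives, and false negatives. The function names are hypothetical, and the F1 score is used here only as one common way to summarize the balance; it is not necessarily the metric used in any published benchmark.

```python
def precision_sensitivity(true_positives: int,
                          false_positives: int,
                          false_negatives: int) -> tuple[float, float]:
    """Compute precision and sensitivity (recall) from confusion counts."""
    called = true_positives + false_positives   # everything the tool reported
    real = true_positives + false_negatives     # everything that truly exists
    precision = true_positives / called if called else 0.0
    sensitivity = true_positives / real if real else 0.0
    return precision, sensitivity


def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity; 0.0 if both are zero."""
    total = precision + sensitivity
    return 2 * precision * sensitivity / total if total else 0.0
```

With made-up counts of 80 true positives, 20 false positives, and 20 false negatives, both precision and sensitivity are 0.8. A tool that reports far more candidates would typically gain sensitivity at the cost of precision, which lowers the harmonic mean when the two diverge.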
LncRNAs are non-coding RNAs of more than 200 nucleotides and include lincRNAs and macroRNAs. They serve as platforms for interactions with mRNAs, miRNAs, or proteins, and have emerged as vital regulators in diverse aspects of biology, including transcriptional regulation, post-transcriptional regulation, and chromatin remodeling. A growing body of research suggests that misexpression of lncRNAs contributes to tumor initiation, growth, and metastasis. LncRNAs have hence become promising targets for cancer diagnosis and therapy. Combining lncRNA sequencing with matched computational tools is a powerful approach for this purpose.
Table 4. Computational tools for lncRNA investigation.
|Tool||Function||Reference|
|lncRScan||Detect lncRNAs from complex assemblies; distinguish lncRNAs from mRNAs||(Sun et al., 2012)|
|iSeeRNA||Accurately and quickly detect lincRNAs in large datasets||(Sun et al., 2013)|
|Annocript||Detect lncRNAs by leveraging public databases and sequence analysis software to verify high non-coding potential||(Musacchia et al., 2015)|
|LncRNA2Function||Annotate lncRNAs on the premise that genes with similar expression patterns across diverse conditions may share similar functions and biological pathways||(Jiang et al., 2015)|
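The co-expression idea behind expression-based annotation can be illustrated with a toy example. The sketch below is not the LncRNA2Function implementation, which is not described here. It only assumes that expression values for one lncRNA and a set of annotated genes are available across the same conditions, and it uses the Pearson correlation with a hypothetical cutoff of 0.9 to pick the genes whose annotations might be transferred to the lncRNA. All names and values are illustrative.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equally long expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpressed_genes(lncrna_profile: list[float],
                      gene_profiles: dict[str, list[float]],
                      min_r: float = 0.9) -> list[tuple[str, float]]:
    """Return (gene, r) pairs with r >= min_r, strongest correlation first."""
    hits = [(gene, pearson(lncrna_profile, profile))
            for gene, profile in gene_profiles.items()]
    return sorted((hit for hit in hits if hit[1] >= min_r),
                  key=lambda hit: hit[1], reverse=True)


# Made-up expression values across four conditions.
lncrna = [1.0, 2.0, 3.0, 4.0]
genes = {
    "geneA": [2.0, 4.0, 6.0, 8.0],   # rises with the lncRNA
    "geneB": [4.0, 3.0, 2.0, 1.0],   # falls as the lncRNA rises
    "geneC": [1.0, 1.0, 2.0, 1.0],   # weakly related
}
candidates = coexpressed_genes(lncrna, genes)
```

In this toy data only `geneA` passes the cutoff, so its functional annotations would be the candidates for the lncRNA. A real analysis would need many more conditions, normalization, and multiple-testing control before any annotation is transferred.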