Research output: Contribution to journal › Article › peer-review
Comparative analysis of protein-coding and long non-coding transcripts based on RNA sequence features. / Volkova, Oxana A.; Kondrakhin, Yury V.; Kashapov, Timur A. et al.
In: Journal of Bioinformatics and Computational Biology, Vol. 16, No. 2, 1840013, 01.04.2018.
TY - JOUR
T1 - Comparative analysis of protein-coding and long non-coding transcripts based on RNA sequence features
AU - Volkova, Oxana A.
AU - Kondrakhin, Yury V.
AU - Kashapov, Timur A.
AU - Sharipov, Ruslan N.
N1 - Publisher Copyright: © 2018 World Scientific Publishing Europe Ltd.
PY - 2018/4/1
Y1 - 2018/4/1
N2 - RNA plays an important role in intracellular life and in the organism in general. Besides the well-established protein-coding RNAs (messenger RNAs, mRNAs), long non-coding RNAs (lncRNAs) have recently gained the attention of researchers. Although lncRNAs have been classified as non-coding, some authors have reported the presence of corresponding sequences in ribosome profiling data (Ribo-seq). Ribo-seq technology is a powerful experimental tool used to characterize RNA translation in cells, with a focus on initiation (harringtonine, lactimidomycin) and elongation (cycloheximide). By exploiting translation starts obtained from the Ribo-seq experiment, we developed a novel position weight matrix model for the prediction of translation starts. This model allowed us to achieve 96% accuracy in discriminating between human mRNAs and lncRNAs. When the same model was used for the prediction of putative ORFs in RNAs, we discovered that the majority of lncRNAs contained only small ORFs (≤300 nt), in contrast to mRNAs.
KW - discriminant analysis
KW - human lncRNAs
KW - Human mRNAs
KW - IPSmatrix algorithm
KW - position weight matrix approach
KW - small ORFs
KW - MOUSE
KW - MUSCLE
KW - GENE-REGULATION
KW - TRANSLATION
KW - CONSERVATION
KW - DYNAMICS
KW - PREDICTIONS
KW - ARABIDOPSIS
KW - REVEALS
KW - Protein Biosynthesis
KW - Open Reading Frames
KW - Ribosomes/genetics
KW - Computational Biology/methods
KW - Proteins/genetics
KW - RNA, Messenger/genetics
KW - 3' Untranslated Regions
KW - Algorithms
KW - Sequence Analysis, RNA
KW - 5' Untranslated Regions
KW - RNA, Long Noncoding
UR - http://www.scopus.com/inward/record.url?scp=85046857488&partnerID=8YFLogxK
U2 - 10.1142/S0219720018400139
DO - 10.1142/S0219720018400139
M3 - Article
C2 - 29739305
AN - SCOPUS:85046857488
VL - 16
JO - Journal of Bioinformatics and Computational Biology
JF - Journal of Bioinformatics and Computational Biology
SN - 0219-7200
IS - 2
M1 - 1840013
ER -
ID: 13360772