Prediction of Long Non-Coding RNAs Based on PCA and Neural Network
Abstract: Long non-coding RNAs (lncRNAs) are transcripts of at least 200 nucleotides (nt) that do not encode proteins. In recent years, lncRNAs have been found to play important roles in many biological processes, and the first step in studying their functions is to identify them accurately. In this paper, we propose a new method for identifying lncRNAs based on principal component analysis (PCA) and a multilayer perceptron. We take the k-mers of a transcript as the original feature vector and apply PCA to reduce its dimension and obtain a new feature vector. The new feature vector is then fed into a multilayer perceptron with five hidden layers to predict whether the transcript is a lncRNA. We evaluated the proposed method on transcript sequences from human, mouse and zebrafish, and it achieved accuracies of 94.74%, 93.25% and 93.04%, respectively, on the normal-type test sets of these species.
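The k-mer feature step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the value of k, the normalization, or how ambiguous bases are treated, so the function name `kmer_frequencies`, the default `k=3`, the sliding window of step 1, and the frequency normalization below are all assumptions made for illustration, not the author's implementation.

```python
from itertools import product


def kmer_frequencies(sequence, k=3, alphabet="ACGT"):
    """Return the normalized k-mer frequency vector of a transcript.

    Hypothetical helper: k, the alphabet, and the normalization are
    assumptions, since the abstract does not specify them.
    The vector has len(alphabet) ** k entries, ordered lexicographically.
    """
    # Treat RNA (U) and DNA (T) spellings of the same transcript alike.
    seq = sequence.upper().replace("U", "T")

    # Enumerate every possible k-mer once and remember its position.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}

    counts = [0] * len(kmers)
    total = 0
    # Slide a window of width k over the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        pos = index.get(seq[i:i + k])
        if pos is not None:  # skip windows with ambiguous bases such as N
            counts[pos] += 1
            total += 1

    if total == 0:
        return [0.0] * len(kmers)
    return [c / total for c in counts]


if __name__ == "__main__":
    vector = kmer_frequencies("AUGCGAUACGCUUGA", k=3)
    print(len(vector))  # one entry per possible 3-mer
```

A matrix of such vectors, one row per transcript, is the kind of input that would then go through dimensionality reduction and a classifier, for example scikit-learn's `PCA` followed by `MLPClassifier` with five hidden layers. The number of retained components and the layer sizes are not given in the abstract.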
Author: Cao Bingqian (曹冰倩)
Affiliation: Qingdao University
Source: Advances in Applied Mathematics (《应用数学进展》), 2022, No. 9, pp. 6670-6677 (8 pages)