RNA Secondary Structure Prediction Based on Support Vector Machine Classification

Abstract: Comparative sequence analysis is the most reliable approach to RNA secondary structure prediction, and many algorithms based on it have been developed over the past several decades. This paper treats comparative structure prediction as a binary classification problem: given the information available in a sequence alignment, decide whether any two columns of the alignment form a base pair. A support vector machine (SVM) serves as the classifier, and the feature vector comprises covariation information, thermodynamic information, and the fraction of complementary bases. Because the covariation signal depends on sequence similarity, a sequence-similarity weight factor is introduced to adjust the relative contributions of covariation and thermodynamic information to the prediction at different similarity levels, which improves prediction accuracy. Tests on 49 Rfam-seed alignments demonstrate the method's effectiveness: its accuracy exceeds that of most comparable algorithms, and it can also predict simple pseudoknots.
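To make the column-pair formulation concrete, the following is a minimal sketch of how two of the named features could be computed from an alignment. It is not the authors' implementation: the abstract does not give the feature definitions, so mutual information is assumed here as the covariation measure, G-U wobble pairs are assumed to count as complementary, gaps are treated as ordinary symbols, and all function names are hypothetical. The thermodynamic feature and the exact form of the similarity weight factor are not specified in the abstract and are left out; `mean_pairwise_identity` only illustrates the kind of similarity measure such a factor could be derived from.

```python
import math
from collections import Counter
from itertools import combinations

# Assumption: Watson-Crick pairs plus G-U wobble count as complementary.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def column(alignment, i):
    """Return column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in alignment]


def mutual_information(alignment, i, j):
    """Covariation of columns i and j as mutual information in bits.

    Assumption: mutual information stands in for the paper's covariation
    score; gap characters are counted like any other symbol.
    """
    n = len(alignment)
    ci, cj = column(alignment, i), column(alignment, j)
    fi, fj, fij = Counter(ci), Counter(cj), Counter(zip(ci, cj))
    mi = 0.0
    for (a, b), count in fij.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((fi[a] / n) * (fj[b] / n)))
    return mi


def complementary_fraction(alignment, i, j):
    """Fraction of sequences whose bases at columns i and j can pair."""
    ci, cj = column(alignment, i), column(alignment, j)
    return sum((a, b) in PAIRS for a, b in zip(ci, cj)) / len(alignment)


def mean_pairwise_identity(alignment):
    """Average fraction of identical positions over all sequence pairs."""
    scores = [
        sum(x == y for x, y in zip(s, t)) / len(s)
        for s, t in combinations(alignment, 2)
    ]
    return sum(scores) / len(scores)


def pair_features(alignment, i, j):
    """Partial feature vector for the candidate pair (i, j).

    A thermodynamic term would be appended before passing the vector
    to a binary classifier such as an SVM.
    """
    return [
        mutual_information(alignment, i, j),
        complementary_fraction(alignment, i, j),
    ]


# Toy alignment: columns 0 and 4 covary perfectly, columns 1 and 3 are
# conserved but complementary.
toy_alignment = ["GACUC", "CAGUG", "AACUU", "UAGUA"]
```

On the toy alignment, columns 0 and 4 show both covariation and full complementarity, whereas the conserved columns 1 and 3 are fully complementary but carry no covariation signal. That second case is where a thermodynamic term, weighted more heavily at high sequence similarity, would be expected to contribute.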

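Authors: Zhao Yingjie (赵英杰), Wang Zhengzhi (王正志)

Affiliation: College of Mechatronics Engineering and Automation, National University of Defense Technology

Source: Chinese Journal of Biotechnology (生物工程学报), 2008, No. 7, pp. 1140-1148 (9 pages). Indexed in CAS, CSCD, and the Peking University core journal list.

Keywords: comparative sequence analysis, RNA secondary structure, support vector machine, similarity weight factor

Classification: Q522 [Biology: Biochemistry]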


