
Application of the Random Forest Algorithm in β-Hairpin Motif Prediction
Abstract: To explore β-hairpin motif prediction, this paper applies a method new to the problem, the random forest algorithm. Using the increment of diversity, position weight matrix scores, and predicted secondary structure information as feature parameters, β-hairpin motifs with loop lengths of 2 to 8 amino acid residues in the ArchDB40 database were predicted. The dataset was divided evenly into five parts, with one part used as the training set and the other four as the test set. In the independent test, the overall prediction accuracy was 79.4% and the Matthews correlation coefficient was 0.48. In addition, for β-hairpin motifs in ArchDB40, with the same feature parameters and testing method, the random forest algorithm outperformed the support vector machine (SVM).
Author: Jia Shaochun (贾少春)
Affiliation: Xinzhou Teachers University (忻州师范学院)
Source: Journal of Xinzhou Teachers University (《忻州师范学院学报》), 2015, No. 5, pp. 6-9, 28 (5 pages in total)
Keywords: random forest algorithm; increment of diversity; scoring matrix function; β-hairpin motif
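
The evaluation protocol described in the abstract (an even five-way split with one part for training and four for testing, scored by overall accuracy and the Matthews correlation coefficient) can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, the shuffling and seed are not from the paper, and the correlation coefficient is assumed to follow the standard Matthews definition. The classifier itself and the feature extraction are not shown.

```python
import math
import random


def split_one_vs_rest(samples, n_parts=5, seed=0):
    """Shuffle samples, cut them into n_parts near-equal parts,
    and return (train, test) where part 0 trains and the rest test.
    The shuffle and seed are assumptions, not taken from the paper."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    parts = [shuffled[i::n_parts] for i in range(n_parts)]
    train = parts[0]
    test = [s for part in parts[1:] for s in part]
    return train, test


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = β-hairpin, 0 = not)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Overall accuracy: fraction of samples predicted correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def matthews_cc(tp, tn, fp, fn):
    """Standard Matthews correlation coefficient; 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

In use, a classifier would be fitted on `train`, its predictions on `test` passed to `confusion_counts`, and the resulting counts passed to `accuracy` and `matthews_cc`.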
