Protein Structure Class Prediction Based on Autocorrelation Coefficient and PseAAC (cited by: 4)
Abstract: Traditional prediction methods consider only amino acid composition when constructing feature vectors. The autocorrelation coefficient, by contrast, reflects the positional information of amino acids in a sequence and also accounts for the interactions between amino acids at different positions. This paper first designs a method that combines amino acid composition with the autocorrelation coefficient to construct feature vectors. Second, building on the pseudo-amino acid composition (PseAAC) model proposed by Chou, it reconstructs the PseAAC model by extending its information and combines the result with the autocorrelation coefficient to construct feature vectors. Using these two encodings with a support vector machine as the prediction tool, several experiments were conducted on the datasets Z277 and Z498 and on the independent test set D138. The comparison shows that the two new methods improve accuracy over the traditional amino acid composition method by 7.43% and 8.53% on average, respectively, which demonstrates that the new methods are effective.
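The first encoding described in the abstract (amino acid composition concatenated with autocorrelation coefficients) can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formula, physicochemical index, or number of lags, so everything specific here is an assumption: the Kyte-Doolittle hydropathy scale as the residue property, the coefficient r_k = (1/(L-k)) * sum of h_i * h_(i+k), the default of 5 lags, and the hypothetical function names.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order for the composition vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index. ASSUMPTION: the abstract does not say
# which physicochemical property the paper uses.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def amino_acid_composition(seq):
    """Return the 20 relative frequencies of the standard amino acids."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def autocorrelation(seq, max_lag):
    """Return r_1..r_max_lag, where r_k averages h_i * h_(i+k) over the sequence.

    ASSUMPTION: this is one common definition of a sequence autocorrelation
    coefficient, not necessarily the one used in the paper.
    """
    h = [KYTE_DOOLITTLE[a] for a in seq]
    n = len(h)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    return [
        sum(h[i] * h[i + k] for i in range(n - k)) / (n - k)
        for k in range(1, max_lag + 1)
    ]


def feature_vector(seq, max_lag=5):
    """Concatenate composition (20 values) and autocorrelation (max_lag values)."""
    seq = seq.upper()
    if not seq or any(a not in KYTE_DOOLITTLE for a in seq):
        raise ValueError("sequence must be non-empty and use the 20 standard residues")
    return amino_acid_composition(seq) + autocorrelation(seq, max_lag)
```

Calling `feature_vector(seq, max_lag=5)` on a protein sequence yields a 25-dimensional vector under these assumptions; vectors of this kind would then be passed to a support vector machine classifier, as the abstract describes. The composition part ignores residue order, while the lagged products carry the positional information the abstract emphasizes.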
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, 2014, No. 1, pp. 103-110 (8 pages)
Funding: National Natural Science Foundation of China (Nos. 61073117, 61175046, 61203290); Anhui University Doctoral Research Start-up Fund (No. 33190078)
Keywords: protein structure class prediction; autocorrelation coefficient; pseudo-amino acid composition (PseAAC); support vector machine (SVM)