
Predicting Protein Homo-Oligomers with a Multi-Strategy Glide Zoom Window Feature Extraction Method

PREDICTION OF PROTEIN HOMO-OLIGOMER TYPES WITH A NOVEL APPROACH OF GLIDE ZOOM WINDOW FEATURE EXTRACTION
Abstract: Oligomeric proteins take part in a wide range of life processes, so predicting their type is of considerable importance. Starting from the protein sequence alone, this paper proposes the concept of the multi-strategy glide zoom window and a corresponding feature extraction method, and applies it to predicting protein homo-oligomer types with a "one-versus-one" multi-class classification strategy. Two glide zoom window strategies are used: the whole protein sequence glide zoom window and the kin amino acid glide zoom window. For each strategy, three feature vectors are extracted: amino acid distance sum, amino acid mean distance, and amino acid distribution. Combining these feature vectors with amino acid composition yields a series of feature sets in the form of pseudo amino acid composition (PseAAC). A support vector machine (SVM) serves as the base classifier. In the jackknife test, the weighted feature set built from the multi-strategy glide zoom window features and amino acid composition reaches a highest total accuracy of 75.37%, which is 10.05% higher than the conventional amino acid composition method and 3.82% higher than BG_Zhang, the best feature reported in the reference literature, under the same conditions. The results show that multi-strategy glide zoom window feature extraction is an effective and feasible way to derive feature vectors from protein sequences for homo-oligomer classification, and that these feature vectors may contain more protein structure information.
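The two baseline ingredients named in the abstract, amino acid composition and the jackknife test, can be illustrated with a minimal sketch. It does not reproduce the glide zoom window features, whose definitions are not given in this record. The 1-nearest-neighbour classifier is a stand-in chosen to keep the example dependency-free and is not the SVM used in the paper. The function names, toy sequences, and class labels are all hypothetical.

```python
from math import dist

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return the 20-dimensional vector of residue frequencies for one sequence."""
    sequence = sequence.upper()
    counts = [sequence.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)  # non-standard letters (e.g. X) are ignored
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [c / total for c in counts]


def nearest_neighbor_predict(train_x, train_y, query):
    """Stand-in classifier (1-nearest neighbour), NOT the SVM used in the paper."""
    best = min(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    return train_y[best]


def jackknife_accuracy(features, labels, predict=nearest_neighbor_predict):
    """Leave each sample out once, train on the rest, and score the held-out sample."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical toy sequences and labels, for illustration only.
toy_sequences = ["AAAAGGGA", "AAGAGAAA", "LLLKLLKK", "KLLKLLLL"]
toy_labels = ["homodimer", "homodimer", "homotrimer", "homotrimer"]
toy_features = [amino_acid_composition(s) for s in toy_sequences]
accuracy = jackknife_accuracy(toy_features, toy_labels)
```

Any classifier with the same `predict(train_x, train_y, query)` shape could be passed to `jackknife_accuracy`, which is where a one-versus-one SVM would plug in.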
Source: Acta Biophysica Sinica (《生物物理学报》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2009, No. 5, pp. 335-342 (8 pages)
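Funding: National Natural Science Foundation of China (60775012, 60634030); Northwestern Polytechnical University Science and Technology Innovation Project (KC02)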
Keywords: homo-oligomers; support vector machine (SVM); feature extraction; multi-strategy glide zoom window; multi-strategy glide zoom window features

References (14)

  • 1. Chou KC. Molecular therapeutic target for type 2 diabetes. J Proteome Res, 2004, 3:1284-1288.
  • 2. Chou KC. Review: Structural bioinformatics and its impact to biomedical science. Curr Med Chem, 2004, 11:2105-2134.
  • 3. Garian R. Prediction of quaternary structure from primary structure. Bioinformatics, 2001, 17:551-556.
  • 4. Chou KC, Cai YD. Predicting protein quaternary structure by pseudo amino acid composition. Proteins: Structure, Function, and Genetics, 2003, 53:282-289.
  • 5. Zhang SW, Pan Q, Chen RS, Zhang HC. Classification of protein homo-oligomers based on support vector machines [J]. Progress in Biochemistry and Biophysics, 2003, 30(6):879-883. (In Chinese.)
  • 6. Zhang SW, Quan P, Zhang HC, Wu YH, Shi JY. Support vector machines for predicting protein homo-oligomers by incorporating pseudo-amino acid composition. Internet Electronic Journal of Molecular Design, 2003, 2(6):392-402.
  • 7. Zhang SW, Pan Q, Zhang HC, Shao ZC, Shi JY. Prediction of protein homo-oligomer types by pseudo amino acid composition: approached with an improved feature extraction and Naive Bayes feature fusion. Amino Acids, 2006, 30(4):461-468.
  • 8. Zhang SW, Pan Q, Zhao CH, Cheng YM. Classification of multi-class protein homo-oligomers with a weighted auto-correlation function feature extraction method [J]. Journal of Biomedical Engineering, 2007, 24(4):721-726. (In Chinese.)
  • 9. Shi JY, Pan Q, Zhang SW, Cheng YM. Classification of protein homo-oligomers based on amino acid composition distribution [J]. Acta Biophysica Sinica, 2006, 22(1):49-56. (In Chinese.)
  • 10. Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta, 1975, 405:442-451.

