Prediction of Protein Homo-Oligomer Types With a Novel Approach of Secondary Feature Extraction
Abstract Oligomeric proteins have many advantages over monomeric proteins and take part in a wide range of life processes. This paper proposes a secondary feature extraction method for predicting protein homo-oligomer types from the primary sequence. Secondary features are obtained by processing primary features with statistical methods so as to increase the distance among them. A support vector machine (SVM) serves as the base classifier, with a one-versus-one strategy for multi-class classification, and four classes of homo-oligomers are classified. In the jackknife test, the weighted feature set that combines secondary features with amino acid composition reached a highest total accuracy of 78.41%, which is 13.09% higher than amino acid composition alone, 6.86% higher than BG, the best feature set reported in the reference literature, and 5.53% higher than CM1, the best primary feature set, under the same conditions. These results indicate that secondary feature extraction effectively increases the distance among primary features and improves classification performance for protein homo-oligomers.
Source Beijing Biomedical Engineering, 2010, No. 1, pp. 16-22 (7 pages)
Funding Supported by the National Natural Science Foundation of China (60775012, 60634030) and the Northwestern Polytechnical University Science and Technology Innovation Project (KC02)
Keywords homo-oligomers; support vector machine (SVM); feature extraction; primary feature; secondary feature
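
The abstract names two standard ingredients that can be illustrated briefly: the amino acid composition feature and the jackknife (leave-one-out) test. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names are hypothetical, the toy sequences and labels are invented for illustration, and a nearest-centroid rule stands in for the one-versus-one SVM used in the paper. The secondary feature extraction step itself is not reproduced, because the abstract does not specify it.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the 20-dimensional vector of residue frequencies for a sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def nearest_centroid_predict(train_x, train_y, query):
    """Stand-in classifier (NOT an SVM): pick the class with the closest mean vector."""
    by_class = {}
    for x, y in zip(train_x, train_y):
        by_class.setdefault(y, []).append(x)
    best_label, best_dist = None, float("inf")
    for label, rows in sorted(by_class.items()):
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(centroid, query))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_accuracy(features, labels, predict=nearest_centroid_predict):
    """Leave-one-out test: each sample is predicted from a model built on all others."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


if __name__ == "__main__":
    # Invented toy data; real homo-oligomer sequences are far longer.
    toy = [("AAAAGGAA", "dimer"), ("AAGAAAGA", "dimer"),
           ("KKKLKKLL", "trimer"), ("LKKKLLKK", "trimer")]
    feats = [amino_acid_composition(seq) for seq, _ in toy]
    labs = [label for _, label in toy]
    print("jackknife accuracy:", jackknife_accuracy(feats, labs))
```

Any classifier with the same `predict(train_x, train_y, query)` shape could be passed to `jackknife_accuracy`, which is how an SVM with one-versus-one voting would slot in.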
References (16)

  • 1. Chou KC. Molecular therapeutic target for type 2 diabetes. J Proteome Res, 2004, 3: 1284-1288.
  • 2. Chou KC. Review: Structural bioinformatics and its impact to biomedical science. Curr Med Chem, 2004, 11: 2105-2134.
  • 3. Garian R. Prediction of quaternary structure from primary structure. Bioinformatics, 2001, 17: 551-556.
  • 4. Chou KC, Cai YD. Predicting protein quaternary structure by pseudo amino acid composition. Proteins: Structure, Function, and Genetics, 2003, 53: 282-289.
  • 5. Zhang SW, Pan Q, Chen RS, Zhang HC. Classification of protein homo-oligomers based on support vector machines [in Chinese]. Progress in Biochemistry and Biophysics, 2003, 30(6): 879-883.
  • 6. Zhang SW, Quan P, Zhang HC, et al. Support vector machines for predicting protein homo-oligomers by incorporating pseudo-amino acid composition. Internet Electronic Journal of Molecular Design, 2003, 2(6): 392-402.
  • 7. Zhang SW, Pan Q, Zhang HC, et al. Prediction of protein homo-oligomer types by pseudo amino acid composition: approached with an improved feature extraction and Naive Bayes feature fusion. Amino Acids, 2006, 30(4): 461-468.
  • 8. Zhang SW, Pan Q, Zhao CH, Cheng YM. Classification of multi-class protein homo-oligomers based on the weighted auto-correlation function feature extraction method [in Chinese]. Journal of Biomedical Engineering, 2007, 24(4): 721-726.
  • 9. Shi JY, Pan Q, Zhang SW, Cheng YM. Classification of protein homo-oligomers based on amino acid composition distribution [in Chinese]. Acta Biophysica Sinica, 2006, 22(1): 49-56.
  • 10. Li Qipeng, Zhang Shaowu, Pan Quan. Using multi-scale glide zoom window feature extraction approach to predict protein homo-oligomer types. 3rd IAPR International Conference on Pattern Recognition in Bioinformatics, PRIB 2008, LNBI vol. 5265: 78-86.
