Using Pseudo Amino Acid Composition to Predict Protein Subcellular Localization: Approached by Incorporating Evolutionary Conservation Information (article in English) Cited by: 2
Abstract: Knowing the subcellular location of a protein is important for determining its function, because it indicates how, and in what cellular environment, the protein interacts with other proteins and molecules. Knowledge of protein localization within cellular compartments also helps in understanding the intricate pathways that regulate biological processes at the cellular level. Given the explosion of newly generated protein sequences in the post-genomic era, automated methods that annotate subcellular locations quickly and reliably are increasingly needed. Here, a novel prediction approach was developed by incorporating protein evolutionary conservation information. Based on Chou's concept of pseudo amino acid composition (PseAAC), a per-residue conservation score is calculated with an improved evolutionary conservation algorithm, and each protein is then represented as a feature vector built from wavelet-based multi-scale energy (MSE). In addition, each protein is represented by feature vectors from three other extraction methods: amino acid composition (AAC), the weighted auto-correlation function, and moment descriptors. The feature vectors of all protein sequences are input into multi-class support vector machines to predict 12 subcellular locations, and the outputs of the four feature-specific classifiers are fused through a product rule. Compared with previously reported results, higher success rates were obtained in both the jackknife cross-validation test and the independent dataset test, suggesting that incorporating protein evolutionary conservation information and fusing multiple feature classifiers is effective for subcellular localization prediction and may complement existing methods.
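Two of the steps named in the abstract, amino acid composition and product-rule fusion, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the per-classifier probabilities are made-up inputs rather than real SVM outputs, and the abstract does not say how the SVM outputs were turned into scores before fusion, so normalized class probabilities are assumed here.

```python
from collections import Counter
from math import prod

# The 20 standard amino acids, in a fixed (alphabetical) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-dimensional AAC vector: the fraction of each standard residue.

    Non-standard characters (e.g. 'X') are ignored (an assumption of this sketch).
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def product_rule_fusion(prob_rows):
    """Fuse several classifiers by multiplying their per-class probabilities.

    prob_rows: one list of class probabilities per classifier, all the same length.
    Returns (index of the winning class, fused scores normalized to sum to 1).
    """
    n_classes = len(prob_rows[0])
    if any(len(row) != n_classes for row in prob_rows):
        raise ValueError("all classifiers must score the same number of classes")
    fused = [prod(row[c] for row in prob_rows) for c in range(n_classes)]
    total = sum(fused)
    if total > 0:
        fused = [score / total for score in fused]
    best = max(range(n_classes), key=fused.__getitem__)
    return best, fused


if __name__ == "__main__":
    aac = amino_acid_composition("MKTAYIAKQR")
    print(len(aac), round(sum(aac), 6))
    # Hypothetical outputs of four feature-specific classifiers over three classes.
    rows = [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2], [0.2, 0.6, 0.2], [0.3, 0.5, 0.2]]
    print(product_rule_fusion(rows))
```

Because the product rule multiplies scores, a single classifier that assigns a class a probability near zero effectively vetoes that class, which is why it tends to reward classes that all feature views agree on.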
Source: Acta Biophysica Sinica (indexed in CAS, CSCD, PKU Core), 2009, Issue 2, pp. 125-132 (8 pages)
Funding: Supported by a grant from the Young College Teachers Projects in Henan Province (2007-335)
Keywords: Evolutionary information; Multi-scale energy; Weighted auto-correlation function; Moment descriptor; Fusion; Subcellular location
