Abstract
Hot spot residues at protein-protein interaction interfaces tend to cluster in locally, tightly packed regions. However, existing machine-learning-based hot spot prediction methods extract features only from the target residue and do not consider the local spatial structure around it. How to perform feature selection and obtain a non-redundant feature subset also remains to be addressed. To identify hot spot residues at protein interfaces accurately, this study extracts multiple classes of features that incorporate the spatially neighboring residues of each interface residue, performs feature selection with random forests, and then uses a support vector machine to predict the hot spots. Computational experiments show that the proposed method can effectively identify hot spot residues.
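To make the idea of using spatially neighboring residues concrete, here is a minimal sketch of how a target residue's features might be combined with those of its neighbors. It is not the authors' implementation: the 10 Å distance cutoff, the use of one representative coordinate per residue, the mean aggregation, and the function names `spatial_neighbors` and `neighbor_augmented_features` are all assumptions made for illustration.

```python
import math


def spatial_neighbors(coords, target, cutoff=10.0):
    """Return indices of residues within `cutoff` of residue `target`.

    `coords` holds one representative 3D coordinate per residue
    (an assumption; the abstract does not say which atoms are used).
    """
    return [
        i
        for i, c in enumerate(coords)
        if i != target and math.dist(coords[target], c) <= cutoff
    ]


def neighbor_augmented_features(coords, features, target, cutoff=10.0):
    """Concatenate the target's own features with the mean of its neighbors' features.

    Mean aggregation is a hypothetical choice; zeros are used when
    the residue has no neighbor inside the cutoff.
    """
    neighbors = spatial_neighbors(coords, target, cutoff)
    own = list(features[target])
    if not neighbors:
        return own + [0.0] * len(own)
    means = [
        sum(features[i][k] for i in neighbors) / len(neighbors)
        for k in range(len(own))
    ]
    return own + means


# Toy data: three residues, two made-up features each.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (30.0, 0.0, 0.0)]
features = [[1.0, 0.2], [3.0, 0.4], [5.0, 0.6]]

augmented = [
    neighbor_augmented_features(coords, features, t) for t in range(len(coords))
]
```

In a pipeline like the one the abstract describes, vectors such as `augmented` would then go through random-forest-based feature selection before being passed to a support vector machine classifier.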
Source
Journal of Tianjin University of Science & Technology (《天津科技大学学报》)
CAS
PKU Core Journal (北大核心)
2015, No. 2, pp. 70-74 (5 pages)
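Funding
Tianjin Higher Education Science and Technology Development Fund Project (20120803)
Tianjin Science and Technology Support Program Key Project (12ZCZDGX02400)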