Abstract
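Protein-protein interactions play a central role in life processes, and the binding of subunits in protein complexes is an important form of such interaction. Because experimental methods are time-consuming and labor-intensive, predicting protein-protein interactions theoretically is of particular significance. First, for the binding regions of protein complexes, the frequencies of secondary structures, super-secondary structures and secondary structure pairings were counted, together with the percentage that each frequency represents of the corresponding count over the entire amino acid chain, giving the relative propensity of each structure type to occur in binding regions. Then, based on these relative propensities, scoring rules were defined for the amino acid chains of two bound protein subunits. A sliding window of 64 amino acids was taken as the standard and moved with a step of 1, and the set of scores obtained for each pair of protein chains was fed as feature values into a support vector machine to build the model. Predictive ability was evaluated by 10-fold cross-validation, giving an overall accuracy, sensitivity, positive predictive value and correlation coefficient of 64.52%, 66.87%, 63.92% and 0.291, respectively. This shows that the method has some discriminative power and is feasible for predicting interactions between protein subunits.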
Protein-protein interaction (PPI) plays a central role in biological activity, and the binding of subunits in complexes is one of its important forms. Because experimental methods are labor-intensive and time-consuming, theoretical prediction of PPI is of practical significance. The percentages of secondary structures, super-secondary structures and secondary structure pairings in the binding sites of protein complexes were calculated. The ratios of these percentages to the corresponding percentages over the entire amino acid chain were then obtained, giving a relative bias for each structure type. On this basis, scoring rules for the two bound chains were defined using the relative bias and a sliding window of 64 amino acids. The highest, lowest and average scores were extracted as feature values and input into a support vector machine to build the prediction models. Under 10-fold cross-validation, the accuracy, sensitivity, positive predictive value and correlation coefficient were 64.52%, 66.87%, 63.92% and 0.291, respectively, indicating that this method could be used in PPI prediction.
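The scoring pipeline summarized above can be illustrated with a minimal sketch. The abstract does not give the exact scoring rule, so the following are assumptions rather than the authors' method: each residue carries a single structure label (for example H, E, C), a window's score is the mean relative propensity of its residues, and the features for a chain pair are the highest, lowest and average window scores of each chain. The function names `relative_propensity`, `window_scores` and `pair_features` are hypothetical.

```python
from collections import Counter


def relative_propensity(site_labels, chain_labels):
    """Share of each structure type in binding sites divided by its share in whole chains."""
    site, chain = Counter(site_labels), Counter(chain_labels)
    n_site, n_chain = len(site_labels), len(chain_labels)
    return {k: (site[k] / n_site) / (chain[k] / n_chain) for k in chain}


def window_scores(labels, propensity, window=64, step=1):
    """Mean propensity in each sliding window (assumed rule, not taken from the paper)."""
    # Labels without a known propensity are treated as neutral (1.0): an assumption.
    values = [propensity.get(s, 1.0) for s in labels]
    if not values:
        raise ValueError("empty chain")
    if len(values) < window:
        # Assumption: a chain shorter than the window is scored as one window.
        return [sum(values) / len(values)]
    return [
        sum(values[i:i + window]) / window
        for i in range(0, len(values) - window + 1, step)
    ]


def pair_features(labels_a, labels_b, propensity, window=64):
    """Highest, lowest and average window score of each chain, concatenated."""
    features = []
    for labels in (labels_a, labels_b):
        scores = window_scores(labels, propensity, window)
        features += [max(scores), min(scores), sum(scores) / len(scores)]
    return features


# Toy data, invented purely for illustration.
prop = relative_propensity(list("HHHE"), list("HHHHEECC"))
feats = pair_features(list("HHEC"), list("HH"), prop, window=2)
```

Feature vectors of this kind, one per chain pair, could then be passed to any support vector machine implementation (for instance scikit-learn's `SVC`) and evaluated by 10-fold cross-validation; the kernel and parameters used in the paper are not stated in the abstract.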
Source
《内蒙古科技大学学报》
CAS
2010, No. 1, pp. 76-79 (4 pages)
Journal of Inner Mongolia University of Science and Technology
Funding
Supported by the National Natural Science Foundation of China (60761001)
Keywords
secondary structure
super-secondary structure
protein-protein interaction
support vector machine