Abstract
We propose a method for predicting enzyme subfamily classes that represents each enzyme by the evolutionary information contained in its position-specific scoring matrix (PSSM), generated by PSI-BLAST, and classifies it with a support vector machine (SVM). In a jackknife test on a widely used dataset of 2,640 oxidoreductase sequences classified into 16 subfamily classes, the method achieved an overall accuracy of 92.12%, higher than that of any previously reported method. The results indicate that evolutionary information is an effective representation of enzyme sequences and correlates strongly with enzyme type, and that combining it with SVMs makes the method a potentially powerful tool for high-accuracy enzyme subfamily classification.
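The evaluation protocol described in the abstract can be illustrated with a minimal sketch. The abstract does not say how a variable-length PSSM is turned into a fixed-length feature vector, so the encoding below (logistic squashing followed by column averaging) is only an assumption, and the function names are hypothetical. To keep the example dependency-free, a 1-nearest-neighbor rule stands in for the SVM; it is a placeholder, not the classifier used in the paper.

```python
import math


def pssm_to_vector(pssm):
    """Collapse an L x 20 PSSM into one value per column (assumed encoding)."""
    n_rows = len(pssm)
    n_cols = len(pssm[0])
    # Squash each raw score into (0, 1) with a logistic function,
    # then average over all sequence positions for each column.
    return [
        sum(1.0 / (1.0 + math.exp(-row[j])) for row in pssm) / n_rows
        for j in range(n_cols)
    ]


def nearest_neighbor_predict(train_x, train_y, query):
    """Placeholder classifier (1-NN, squared Euclidean distance), not an SVM."""
    best_label, best_dist = None, float("inf")
    for x, y in zip(train_x, train_y):
        dist = sum((a - b) ** 2 for a, b in zip(x, query))
        if dist < best_dist:
            best_label, best_dist = y, dist
    return best_label


def jackknife_accuracy(features, labels, predict=nearest_neighbor_predict):
    """Jackknife (leave-one-out) test: each sample is held out once and
    predicted from all remaining samples; returns the fraction correct."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)
```

Any classifier with the same `predict(train_x, train_y, query)` shape, including a wrapper around an SVM library, could be passed to `jackknife_accuracy` in place of the placeholder.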
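Funding
Graduate Innovation Fund of the Ministry of Education (C07-05)
Key Research Fund for High-Level University Construction, University of Science and Technology of China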
Keywords
enzyme subfamily classification
evolutionary information
support vector machines
position-specific scoring matrix