Abstract
Predicting the subcellular localization of a protein is important for understanding its function. Three kinds of classification features, namely amino acid composition, amino acid pair composition, and the position-specific scoring matrix (PSSM), were tested with two prediction methods: fuzzy k-nearest neighbor (fuzzy k-NN) and support vector machine (SVM). A comparison of prediction results on several data sets shows that PSSM improves the discriminability between subcellular compartments and performs better than the other two features. SVM, a machine learning method based on statistical learning theory, makes better use of the PSSM feature than fuzzy k-NN does. The best prediction performance is achieved by using both amino acid composition and PSSM as input features, with SVM as the classifier.
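To make the two sequence-derived features concrete, here is a minimal Python sketch of amino acid composition (a 20-dimensional vector) and amino acid pair composition (a 400-dimensional vector). The abstract does not give the paper's exact definitions, so the function names, the choice of adjacent residue pairs, the frequency normalization, and the skipping of non-standard residues are all assumptions made for illustration. PSSM features are not shown because they require an external alignment search.

```python
from itertools import product

# The 20 standard amino acids in one-letter code (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs, in a fixed order so feature vectors are comparable.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return the fraction of each standard amino acid in seq (20 values)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    n = len(residues)
    if n == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / n for a in AMINO_ACIDS]


def pair_composition(seq):
    """Return the fraction of each adjacent residue pair in seq (400 values).

    Assumption: only directly adjacent pairs are counted, and a pair is
    skipped when either residue is not a standard amino acid.
    """
    seq = seq.upper()
    counts = dict.fromkeys(PAIRS, 0)
    for a, b in zip(seq, seq[1:]):
        if a + b in counts:
            counts[a + b] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(PAIRS)
    return [counts[p] / total for p in PAIRS]


if __name__ == "__main__":
    example = "MKTAYIAKQR"  # hypothetical sequence, for illustration only
    features = amino_acid_composition(example) + pair_composition(example)
    print(len(features))
```

Concatenating the two vectors gives a fixed-length input that a classifier such as an SVM or fuzzy k-NN could consume, regardless of sequence length.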
Source
Beijing Biomedical Engineering (《北京生物医学工程》)
2006, No. 6, pp. 649-653, 657 (6 pages)
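Funding
Supported by a Major Project of the Knowledge Innovation Program of the University of Science and Technology of China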