Abstract
Predicting the subcellular localization of proteins is important for studying protein function, protein interactions, and their regulatory mechanisms. In this paper, amino acids are first reduced into groups according to four physicochemical properties and structural properties, and the local and global information of a sequence is then described by its 'composition', 'transition', and 'distribution' features. Combining these with numerical statistical characteristics of amino acid hydrophobicity/hydrophilicity, we propose a new protein feature representation (NSBH). Three classifiers, KNN, SVM, and a BP neural network, are each used to predict protein subcellular localization, and the prediction results of the individual feature representations are compared with those of the fused features. The results show that the fused feature representation combined with the SVM classifier achieves better prediction accuracy. We also discuss in detail how different parameters affect the experimental results. The detailed experimental and comparative results demonstrate the effectiveness of the proposed method.
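The 'composition', 'transition', and 'distribution' features mentioned in the abstract can be illustrated with a minimal sketch. It assumes a single three-class hydrophobicity grouping that is commonly used for such descriptors; the abstract does not state the paper's actual reduction schemes, so the grouping, the function names, and the rule used to pick the quantile positions are all illustrative assumptions, not the authors' NSBH implementation.

```python
from itertools import combinations

# Assumed three-class hydrophobicity grouping (illustrative only; the
# paper's actual amino acid reductions are not given in the abstract).
GROUPS = {"1": "RKEDQN", "2": "GASTPHY", "3": "CLVIMFW"}
AA_TO_CLASS = {aa: c for c, aas in GROUPS.items() for aa in aas}
CLASSES = "123"


def reduce_sequence(seq):
    """Map each standard residue to its class label; skip unknown symbols."""
    return "".join(AA_TO_CLASS[aa] for aa in seq.upper() if aa in AA_TO_CLASS)


def composition(red):
    """Fraction of residues falling in each class (global information)."""
    return [red.count(c) / len(red) for c in CLASSES]


def transition(red):
    """Fraction of adjacent pairs that switch between two given classes."""
    pairs = list(zip(red, red[1:]))
    return [
        sum(1 for a, b in pairs if {a, b} == {x, y}) / len(pairs)
        for x, y in combinations(CLASSES, 2)
    ]


def distribution(red):
    """Relative positions of the first, 25%, 50%, 75%, and last occurrence
    of each class (local information along the chain)."""
    feats = []
    for c in CLASSES:
        pos = [i + 1 for i, ch in enumerate(red) if ch == c]
        if not pos:
            feats.extend([0.0] * 5)
            continue
        for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
            # Assumed convention: take the floor(frac * k)-th occurrence,
            # and never go below the first one.
            k = max(1, int(frac * len(pos)))
            feats.append(pos[k - 1] / len(red))
    return feats


def ctd_features(seq):
    """Concatenate the 3 + 3 + 15 = 21 descriptors for one grouping."""
    red = reduce_sequence(seq)
    if len(red) < 2:
        raise ValueError("need at least two recognizable residues")
    return composition(red) + transition(red) + distribution(red)
```

Calling `ctd_features` on a sequence yields a 21-dimensional vector for this one grouping. Repeating the computation for other property-based groupings and concatenating the vectors would give a fused representation that can be passed to a classifier such as an SVM.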
Source
Chinese Journal of Bioinformatics (《生物信息学》)
2015, No. 2, pp. 103-110 (8 pages)
Funding
Supported by the National Natural Science Foundation of China (Grant No. 61272312)
Keywords
Protein subcellular localization
Amino acid physicochemical properties
Support vector machine (SVM)