Abstract
Apoptosis proteins play a central role in the development and homeostasis of an organism and are important for understanding the mechanism of programmed cell death. This paper presents a new protein sequence feature extraction method based on tripeptide (tri-polypeptide) composition, the tripeptide diversity source method. The occurrence counts of adjacent residue triplets in the primary sequence are computed, and the subcellular location of apoptosis proteins is predicted by minimizing the increment of diversity. The content-balancing accuracy index proposed by Zhang CT et al. is also extended so that it can evaluate classification problems with any number of classes. Experimental results show that, for apoptosis protein subcellular location prediction, the tripeptide diversity source method improves the overall prediction accuracy while handling the sample imbalance problem well. The content-balancing accuracy index evaluates the predictive ability of an algorithm more accurately than the conventional overall prediction accuracy and effectively reflects how well the algorithm copes with imbalanced samples.
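The prediction step described in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions of the diversity measure, D(X) = N log N - Σ n_i log n_i, and of the increment of diversity, ID(X, S) = D(X + S) - D(X) - D(S). The abstract does not give the formulas, so the use of overlapping triplet windows, the natural logarithm, and all function names and class labels below are assumptions for illustration, not the authors' implementation.

```python
import math
from collections import Counter

def tripeptide_counts(seq):
    """Count overlapping adjacent residue triplets (assumed windowing)."""
    return Counter(seq[i:i + 3] for i in range(len(seq) - 2))

def diversity(counts):
    """Diversity measure D(X) = N log N - sum(n_i log n_i)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(
        n * math.log(n) for n in counts.values() if n > 0
    )

def increment_of_diversity(x, s):
    """ID(X, S) = D(X + S) - D(X) - D(S); small values mean similar sources."""
    return diversity(x + s) - diversity(x) - diversity(s)

def predict(seq, class_sources):
    """Assign the class whose diversity source gives the minimum increment."""
    x = tripeptide_counts(seq)
    return min(
        class_sources,
        key=lambda label: increment_of_diversity(x, class_sources[label]),
    )

# Hypothetical toy sources; real sources would pool the triplet counts
# of all training sequences belonging to each subcellular location.
sources = {
    "cyto": tripeptide_counts("AAAAAAAA"),
    "nuc": tripeptide_counts("CDCDCDCD"),
}
label = predict("AAAAA", sources)
```

In this toy case the query shares all of its triplets with the hypothetical "cyto" source, so its increment of diversity against that source is zero and that class is selected.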
Source
《激光生物学报》
CAS
CSCD
2007, Issue 2, pp. 249-252, F0003 (5 pages)
Acta Laser Biology Sinica
Funding
Supported by the National Natural Science Foundation of China (60603054)
Keywords
apoptosis protein
subcellular location
tri-polypeptide composition
increment of diversity