Abstract
Proteins in different subcellular locations differ in amino acid composition and sequence information. On this basis, sources of diversity were constructed from single amino acid content and residue-pair content, and the standard measures of diversity D(Xe), D(Xp), and D(Xc) were calculated for the three classes of subcellular location of prokaryotic proteins. The subcellular location of a protein is then predicted with the increment of diversity: the increments between the protein's own measure of diversity D(X) and each of the three standard measures D(Xe), D(Xp), and D(Xc) are calculated, and the protein is assigned to the location with the smallest increment. Prediction results under the self-consistency test and the jackknife test are reported for five different sets of information used as the parameters of the diversity source. Compared with existing methods, the measure-of-diversity method gives the best prediction performance for extracellular proteins under the jackknife test. The results also indicate that extracting more useful information from the primary sequence is the key to improving prediction accuracy.
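The prediction rule described in the abstract can be sketched in a few lines of Python. The abstract does not state the formulas, so this sketch assumes the commonly used definitions D(X) = N ln N − Σ n_i ln n_i (with N = Σ n_i) for the measure of diversity and ID(X, Y) = D(X + Y) − D(X) − D(Y) for the increment of diversity. The function names, the class labels, and the toy sequences are hypothetical, and only single amino acid counts are used as the diversity source (the residue-pair variant is omitted).

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Count each standard residue in a sequence (20-dimensional source)."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    return [counts[a] for a in AMINO_ACIDS]


def diversity(counts):
    """Assumed measure of diversity: N ln N - sum(n_i ln n_i)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(c * math.log(c) for c in counts if c > 0)


def increment_of_diversity(x, y):
    """Assumed increment of diversity: D(X + Y) - D(X) - D(Y)."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


def predict(seq, standards):
    """Return the class whose standard source gives the smallest increment."""
    x = composition(seq)
    return min(standards, key=lambda label: increment_of_diversity(x, standards[label]))


# Toy standard sources (hypothetical sequences, not real training data).
# The labels are an assumed reading of Xe, Xp, Xc in the abstract.
standards = {
    "extracellular": composition("AAAAGGGG"),
    "periplasmic": composition("KKKKLLLL"),
    "cytoplasmic": composition("DDDDEEEE"),
}

predicted = predict("AAGG", standards)
```

In practice each standard source would be the summed counts over all training proteins of a class, and a jackknife test would rebuild the standard source with the query protein left out before each prediction.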
Source
《内蒙古大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journals
2003, No. 5, pp. 510-517 (8 pages)
Journal of Inner Mongolia University:Natural Science Edition
Funding
National Natural Science Foundation of China (30160025)
Inner Mongolia University Youth Science Foundation (202045)
Keywords
protein subcellular location
prokaryotes
measure of diversity
increment of diversity