Abstract
Using a new feature extraction method, encoding on the basis of grouped weight (EBGW), together with the component-coupled algorithm as the classifier, four classes of homo-oligomeric proteins (homo-dimers, homo-trimers, homo-tetramers, and homo-hexamers) were classified from their primary sequences. Under the Jackknife test, the overall classification accuracy of the EBGW-based method reached 70.92%, which is 20.28 and 7.53 percentage points higher than that of the amino acid composition method and the weighted pseudo-amino acid composition method, respectively, using the same classification algorithm on the same data set. The results indicate that EBGW is an efficient feature extraction method for classifying protein homo-oligomers and performs satisfactorily despite its simplicity.
Source
Journal of National University of Defense Technology (《国防科技大学学报》)
EI
CAS
CSCD
PKU Core Journals (北大核心)
2007, No. 2, pp. 91-93 (3 pages)
Keywords
encoding on the basis of grouped weight (EBGW)
homo-oligomer
component-coupled algorithm