Abstract
As important biological catalysts, enzymes play a key role in metabolic pathways. The function of an enzyme is closely related to the class or subclass it belongs to, so methods for predicting enzyme class are useful in both basic research and drug discovery. In this work, each enzyme sequence is represented by a feature vector based on pseudo-amino acid composition, extended with amino acid pair components and additional biophysical features. Because enzyme classification is a multi-class problem, the Optimized Evidence Theory-K Nearest Neighbors (OET-KNN) algorithm is used as the classifier. Experiments show that the approach is effective, reaching an accuracy of 83%.
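As a rough illustration of the sequence-derived part of such a feature vector, the sketch below computes single-residue frequencies and adjacent residue-pair frequencies for a protein sequence. It is a minimal sketch under stated assumptions, not the authors' feature set: the function name `composition_features`, the 420-dimensional layout, and the normalization are all hypothetical choices, and the biophysical features and the OET-KNN classifier mentioned in the abstract are not included.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition_features(sequence):
    """Return 20 residue frequencies followed by 400 adjacent-pair frequencies.

    Hypothetical helper for illustration only; symbols outside the 20
    standard residues are ignored, as are pairs that contain them.
    """
    seq = sequence.upper()
    single = {a: 0 for a in AMINO_ACIDS}
    pairs = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}

    # Count standard residues.
    for residue in seq:
        if residue in single:
            single[residue] += 1

    # Count adjacent pairs in which both residues are standard.
    for a, b in zip(seq, seq[1:]):
        if a + b in pairs:
            pairs[a + b] += 1

    n_single = sum(single.values())
    n_pairs = sum(pairs.values())
    if n_single == 0 or n_pairs == 0:
        raise ValueError("sequence needs at least one pair of standard residues")

    # Normalize counts to frequencies so sequences of different length compare.
    return ([single[a] / n_single for a in AMINO_ACIDS]
            + [pairs[p] / n_pairs for p in pairs])
```

Calling `composition_features` on each sequence yields fixed-length vectors that any multi-class classifier could consume; normalizing by counts is one simple way to keep long and short sequences on the same scale.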
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
2013, Issue 9, pp. 123-126 (4 pages)
Keywords
feature vector
pseudo-amino acid model
Optimized Evidence Theory-K Nearest Neighbors (OET-KNN) algorithm