
Science Letters: EHPred: an SVM-based method for epoxide hydrolases recognition and classification (cited by: 1)

Abstract: A two-layer method based on support vector machines (SVMs) has been developed to distinguish epoxide hydrolases (EHs) from other enzymes and to classify their subfamilies from primary protein sequences. SVM classifiers were built using three feature vectors extracted from the primary sequences of EHs: the amino acid composition (AAC), the dipeptide composition (DPC), and the pseudo-amino acid composition (PAAC). In 5-fold cross-validation, the first-layer SVM classifier differentiates EHs from non-EHs with an accuracy of 94.2% and a Matthews correlation coefficient (MCC) of 0.84. In 2-fold cross-validation, the PAAC-based second-layer SVM classifies EH subfamilies with an overall accuracy of 90.7% and an MCC of 0.87, compared with 80.0% for AAC and 84.9% for DPC. A program called EHPred has also been developed to help readers recognize EHs and classify their subfamilies from primary protein sequences.
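As a rough illustration of two of the feature vectors and the evaluation metric named in the abstract, the sketch below computes AAC (20 values) and DPC (400 values) for a sequence, plus the MCC from a binary confusion matrix. It is not the EHPred implementation: the function names, the fixed residue ordering, and the choice to skip non-standard residues are assumptions made for this example, and PAAC is left out because it needs physicochemical property tables that the abstract does not specify.

```python
from itertools import product
from math import sqrt

# The 20 standard amino acids; this ordering is an assumption of the sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def _clean(seq):
    """Upper-case the sequence and drop anything that is not a standard residue."""
    return "".join(r for r in seq.upper() if r in AMINO_ACIDS)


def aac(seq):
    """Amino acid composition: fraction of each of the 20 residues."""
    seq = _clean(seq)
    if not seq:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: fraction of each of the 400 adjacent residue pairs."""
    seq = _clean(seq)
    pairs = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]
    total = len(seq) - 1  # number of overlapping dipeptides
    if total <= 0:
        return [0.0] * len(pairs)
    counts = dict.fromkeys(pairs, 0)
    for i in range(total):
        counts[seq[i:i + 2]] += 1
    return [counts[p] / total for p in pairs]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Conventionally reported as 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    toy = "AACD"  # toy sequence, not a real epoxide hydrolase
    print(len(aac(toy)), len(dpc(toy)))
    print(mcc(tp=50, tn=50, fp=0, fn=0))
```

Vectors of this kind could then be passed to any SVM library as fixed-length inputs; the kernel and parameters used by the authors are not given in the abstract.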
Source: Journal of Zhejiang University-Science B (Biomedicine & Biotechnology) (浙江大学学报(英文版)B辑(生物医学与生物技术)), indexed in SCIE, CAS, CSCD; 2006, Issue 1, pp. 1-6 (6 pages)
Funding: Project (No. 20542006) supported by the National Natural Science Foundation of China
Keywords: Epoxide hydrolases (EHs), Amino acid composition (AAC), Dipeptide composition (DPC), Pseudo-amino acid composition (PAAC), Support vector machines (SVM)