Abstract
Predicting protein function from known genomic sequences is a central task in bioinformatics. One approach is to classify proteins into functional categories. We developed a method that uses protein domain composition and the maximum likelihood estimation (MLE) algorithm to classify proteins by function. Using the Saccharomyces cerevisiae genome, we compared the effectiveness of the MLE approach with that of a simple, intuitive method. The MLE method outperformed the simple method, achieving an estimated specificity of 75.45% and an estimated sensitivity of 40.26%. These results indicate that domains are an important feature of proteins and are closely related to protein function.
Keywords
protein
biological function
protein function prediction
domain method
maximum likelihood estimation
expectation maximization