Abstract
G-protein-coupled receptors (GPCRs) share a relatively conserved seven-transmembrane-helix (7TM) region, whereas their N- and C-termini vary widely. To strengthen the features that characterize GPCR families, the N- and C-termini were removed in this study. Frequency features, in the form of single amino acid and dipeptide compositions, were then analyzed and extracted from sequences encoded in compressed amino acid alphabets for the recognition of human GPCRs. Classifiers based on these features were built with a support vector machine (SVM), and the performance of different alphabet compression methods was compared. The test results showed that a suitable compression method, combined with amino acid composition information, achieves good performance in recognizing human GPCRs.
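As a rough illustration of the feature extraction described in the abstract, the following minimal Python sketch maps a sequence onto a reduced alphabet and computes single-letter and dipeptide frequencies. The six-group alphabet in `GROUPS` is a hypothetical grouping chosen only for illustration, not one of the compression schemes evaluated in the study, and the function names are likewise assumptions. The input is assumed to be a sequence whose N- and C-termini have already been trimmed.

```python
from itertools import product

# Hypothetical 6-letter reduced alphabet (illustrative grouping by broad
# physicochemical class; NOT a scheme taken from the study).
GROUPS = {
    "A": "AVLIMC",  # aliphatic / hydrophobic
    "F": "FWYH",    # aromatic
    "S": "STNQ",    # polar, uncharged
    "K": "KR",      # positively charged
    "D": "DE",      # negatively charged
    "G": "GP",      # small / conformationally special
}
RESIDUE_TO_GROUP = {aa: g for g, members in GROUPS.items() for aa in members}
ALPHABET = sorted(GROUPS)
DIPEPTIDES = ["".join(p) for p in product(ALPHABET, repeat=2)]


def compress(seq):
    """Rewrite a protein sequence in the reduced alphabet.

    Residues outside the 20 standard amino acids are skipped.
    """
    return "".join(RESIDUE_TO_GROUP[aa] for aa in seq.upper()
                   if aa in RESIDUE_TO_GROUP)


def composition_features(seq):
    """Return single-letter frequencies followed by dipeptide frequencies."""
    reduced = compress(seq)
    n = len(reduced)
    single = [reduced.count(g) / n if n else 0.0 for g in ALPHABET]

    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(n - 1):
        counts[reduced[i:i + 2]] += 1
    total = max(n - 1, 1)  # avoid division by zero for very short input
    dipeptide = [counts[p] / total for p in DIPEPTIDES]
    return single + dipeptide


if __name__ == "__main__":
    features = composition_features("MKTAYIAKQR")
    print(len(features))  # 6 single-letter + 36 dipeptide features
```

With a six-letter alphabet the vector has 6 + 36 = 42 entries, compared with 20 + 400 = 420 for the full amino acid alphabet. Vectors of this kind could then be passed to any SVM implementation for training.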
Funding
Supported by the Natural Science Foundation of Ningxia University (ZR1124).