Funding: Supported by the National Natural Science Foundation of China (30660044)
Abstract: Based on the concept of pseudo amino acid composition (PseAAC), protein structural classes are predicted with an approach that combines the increment of diversity with a support vector machine (ID-SVM), in which the dipeptide composition of proteins is used as the source of diversity. A jackknife test shows that the total prediction accuracy is 96.6%, higher than that given by other approaches. The specificity (Sp) and the Matthews correlation coefficient (MCC) are also calculated for each protein structural class: the Sp is more than 88% and the MCC is higher than 92%, and these high values imply that the ID-SVM model is credible for predicting protein structural classes. The results indicate that (1) the choice of the source of diversity is reasonable, (2) the predictive performance of ID-SVM is excellent, and (3) the amino acid sequences of proteins contain information about their structural classes.
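To make the idea of using dipeptide composition as a source of diversity concrete, here is a minimal Python sketch. It assumes the commonly used definition of the diversity measure, D(X) = N ln N − Σ nᵢ ln nᵢ, and of the increment of diversity, ID(X, Y) = D(X + Y) − D(X) − D(Y); the abstract does not state these formulas, and all function names below are hypothetical.

```python
import math
from itertools import product

# The 20 standard amino acids and the 400 ordered dipeptides built from them.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_counts(seq):
    """Count overlapping dipeptides in a sequence (hypothetical helper).

    Pairs containing non-standard residues are skipped.
    """
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for a, b in zip(seq, seq[1:]):
        pair = a + b
        if pair in counts:
            counts[pair] += 1
    return [counts[d] for d in DIPEPTIDES]


def diversity(counts):
    """Assumed diversity measure: D = N ln N - sum(n_i ln n_i)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """Assumed definition: ID(X, Y) = D(X + Y) - D(X) - D(Y).

    Smaller values mean the two count vectors are more similar.
    """
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)
```

Under these assumptions, a query protein's dipeptide counts would be compared against the pooled counts of each structural class, and the resulting ID values could presumably serve as input features for the SVM; the abstract does not spell out this step, so treat it as one plausible reading rather than the authors' exact procedure.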