Abstract: Research on protein structure prediction mainly focuses on finding effective features of protein sequences and developing suitable machine learning algorithms, but few studies consider the importance of feature weights in classification. We propose the GASVM algorithm, in which the classification accuracy of a support vector machine serves as the fitness value of a genetic algorithm, to optimize the coefficients of 16 features (5 of which are proposed for the first time) in the classification and thereby construct a new feature vector. Based on the new feature vector, we use a support vector machine with 10-fold cross-validation to classify protein structures on three low-similarity datasets (25PDB, 1189, FC699). Experimental results show that the overall classification accuracy of the new method is better than that of other methods.
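The idea of using classifier accuracy as the fitness of a genetic algorithm over feature weights can be sketched as follows. This is a minimal illustration, not the authors' GASVM implementation: to stay self-contained it swaps the support vector machine and 10-fold cross-validation for a leave-one-out 1-nearest-neighbour classifier, and the function names (`loo_accuracy`, `evolve_weights`, `make_toy_data`), the toy data, and all GA settings (truncation selection, one-point crossover, Gaussian mutation, weights clamped to [0, 1]) are assumptions made for the example.

```python
import random


def loo_accuracy(X, y, w):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on
    weighted features. Stand-in (assumption) for the cross-validated
    SVM accuracy that the abstract uses as the fitness value."""
    correct = 0
    for i in range(len(X)):
        best_j, best_d = None, float("inf")
        for j in range(len(X)):
            if j == i:
                continue
            # Squared Euclidean distance after scaling each feature by its weight.
            d = sum((wk * (a - b)) ** 2 for wk, a, b in zip(w, X[i], X[j]))
            if d < best_d:
                best_d, best_j = d, j
        correct += y[best_j] == y[i]
    return correct / len(X)


def evolve_weights(fitness, n_features, pop_size=20, generations=30,
                   mutation_rate=0.2, seed=0):
    """Search for feature weights in [0, 1] that maximise `fitness`.
    Requires n_features >= 2 and pop_size >= 4."""
    rng = random.Random(seed)
    # Start from random weights plus the unweighted baseline (all ones).
    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size - 1)]
    pop.append([1.0] * n_features)
    best, best_fit = None, -1.0
    for _ in range(generations):
        scored = sorted(((fitness(w), w) for w in pop),
                        key=lambda t: t[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best = scored[0][0], scored[0][1][:]
        parents = [w for _, w in scored[:pop_size // 2]]  # truncation selection
        children = [best[:]]                              # elitism: keep the best so far
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)            # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation, clamped to the [0, 1] weight range.
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
    return best, best_fit


def make_toy_data(n=40, seed=1):
    """Toy data (assumption): feature 0 separates the classes, features 1-2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([label + rng.gauss(0, 0.2), rng.gauss(0, 3.0), rng.gauss(0, 3.0)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    weights, acc = evolve_weights(lambda w: loo_accuracy(X, y, w), n_features=3)
    print("unweighted accuracy:", loo_accuracy(X, y, [1.0, 1.0, 1.0]))
    print("evolved weights:", weights, "accuracy:", acc)
```

Because the all-ones vector is part of the initial population, the returned fitness can never fall below the unweighted baseline. In a setting closer to the abstract, the `fitness` callback would instead return the 10-fold cross-validated accuracy of an SVM trained on the 16 weighted features.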
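Abstract: Predicting the secondary structure of RNA molecules that contain pseudoknots is a difficult research problem in bioinformatics. A multi-class support vector machine is combined with a Bayesian neural network to predict the secondary structure of such molecules. The multi-class support vector machine performs the initial prediction, and its output gives, for each base, the E-NSSEL (Extend New Secondary Structure Element Label) class label of the planar pseudoknot structure. The predicted labels of the bases are then corrected by the Bayesian neural network, and the RNA secondary structure is recovered from them. This method effectively improves the prediction of secondary structure for RNA molecules containing pseudoknots.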