Abstract
Quantitative structure-activity relationship (QSAR) prediction models for anticancer drugs help predict and optimize drug-therapy targets for cancer patients. This paper first combines XGBoost, random forest, and MIC to select the 20 variables most important to biological activity. Next, a genetic algorithm is used to optimize the hyperparameters of several machine learning models. Finally, the optimized models are combined nonlinearly by a neural network. Compared with a range of individual machine learning models and combination models, this nonlinear combination optimization model gives the best predictive performance, and robustness tests show that its prediction accuracy remains stable.
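The abstract does not say how the XGBoost, random forest, and MIC scores are merged into a single top-20 list. One plausible reading, offered only as a minimal sketch, is mean-rank aggregation: rank the variables under each method, average the ranks, and keep the best k. The aggregation rule, the function names `ranks` and `select_top_k`, and the descriptor names and scores below are all assumptions made for illustration, not taken from the paper.

```python
from typing import Dict, List


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Map each feature to its rank under one method (1 = highest score)."""
    # Ties are broken alphabetically so the result is deterministic.
    ordered = sorted(scores, key=lambda name: (-scores[name], name))
    return {name: position for position, name in enumerate(ordered, start=1)}


def select_top_k(score_tables: List[Dict[str, float]], k: int) -> List[str]:
    """Return the k features with the lowest mean rank across all methods."""
    features = sorted(score_tables[0])
    rank_tables = [ranks(table) for table in score_tables]
    mean_rank = {
        f: sum(r[f] for r in rank_tables) / len(rank_tables) for f in features
    }
    return sorted(features, key=lambda f: (mean_rank[f], f))[:k]


# Hypothetical importance scores for four made-up descriptors (assumed data).
xgb_scores = {"d1": 0.40, "d2": 0.30, "d3": 0.20, "d4": 0.10}
rf_scores = {"d1": 0.35, "d2": 0.15, "d3": 0.30, "d4": 0.20}
mic_scores = {"d1": 0.50, "d2": 0.45, "d3": 0.10, "d4": 0.30}

selected = select_top_k([xgb_scores, rf_scores, mic_scores], k=2)
print(selected)
```

With k set to 20 and real score tables, the same call would yield a 20-variable shortlist. Averaging ranks rather than raw scores avoids having to put the three methods' scores on a common scale.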
Authors
CHEN Yu-you (陈裕友)
WANG Qin (王琴)
HU Jing-yun (胡静匀)
ZENG Xing-yuan (曾杏元)
School of Mathematics and Statistics, Hunan Normal University, Changsha 410081, China
Source
Mathematics in Practice and Theory (《数学的实践与认识》)
2022, No. 10, pp. 184-190 (7 pages)
Keywords
QSAR
feature selection
hyperparameter optimization
nonlinear combination