Abstract
Screening new drugs experimentally is slow and consumes substantial manpower and material resources, whereas computer-aided prediction of molecular properties can greatly reduce the time and cost of drug development. To identify anti-breast cancer drug candidates that combine good biological activity against ERα with favorable ADMET properties, 1,974 collected compounds were analyzed. First, a random forest classifier was used to select the 20 molecular descriptors with the most significant influence on biological activity, and these descriptors, together with the pIC50 values, served as feature data for a QSAR model. Next, a BP neural network optimized by PSO was used to predict the biological activity of 50 new compounds; the model fit was 0.8337 and the root mean square error was 0.7315, and the predictions agreed with the actual values more closely than those of the unoptimized BP neural network. Then, to improve the success rate of drug development, ADMET classification models were built from the available ADMET property data using an SVM optimized by PSO; the cross-validation (CV) accuracy reached 94.0767%, and the prediction accuracy of the models for all five indicators exceeded 79%. The results show that the proposed models outperform the benchmark models and that the adopted prediction strategy is effective, providing a reference for the development of anti-breast cancer drugs.
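The abstract relies on particle swarm optimization (PSO) to tune both the BP neural network and the SVM, but does not spell out the update rule. The following is a minimal sketch of a generic global-best PSO, not the authors' implementation. The function name `pso_minimize`, the coefficient values (`inertia`, `c1`, `c2`), and the toy `sphere` objective are all assumptions chosen for illustration. In a setting like the one described, the objective would presumably be a validation error evaluated as a function of network weights or SVM hyperparameters.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box `bounds` with a basic global-best PSO.

    All default coefficients are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and value so far.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    # The swarm shares the best position found by any particle.
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull toward personal best
                #          + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_val = pso_minimize(sphere, bounds=[(-5.0, 5.0)] * 3)
```

To tune a model rather than a toy function, `objective` could wrap a training-and-validation routine that returns an error for a given parameter vector, with `bounds` giving the allowed range of each parameter.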
Authors
许美贤
郑琰
李炎举
吴伟豪
XU Meixian; ZHENG Yan; LI Yanju; WU Weihao (College of Automobile and Traffic Engineering, Nanjing Forestry University, Nanjing 210037)
Source
《南京信息工程大学学报(自然科学版)》
CAS
Peking University Core Journals (北大核心)
2023, No. 1, pp. 51-65 (15 pages)
Journal of Nanjing University of Information Science & Technology (Natural Science Edition)
Funding
National Natural Science Foundation of China (71701099, 71501090)
Natural Science Research Project of Jiangsu Higher Education Institutions (17KJB580008)
Keywords
anti-breast cancer drugs
biological activity
ADMET properties
particle swarm optimization (PSO)
BP neural network
support vector machine (SVM)