
Prediction Model of the Probability of a Lead Compound Becoming a Drug Based on Stacked AutoEncoder and Fully Connected Neural Network
Cited by: 3

Abstract Objective: To build a more stable, reliable, and practical deep learning model for predicting the probability that a lead compound becomes a drug (its drug-likeness). Methods: Positive and negative samples were collected from three databases: Integrity, ChEMBL, and DrugBank. After the large data set was cleaned and its class imbalance addressed, the compounds' simplified molecular-input line-entry system (SMILES) strings were standardized and encoded. A deep neural network combining a Stacked AutoEncoder (SAE) and a Fully Connected Neural Network (FCNN) was then constructed and trained to extract features from the compounds and predict their drug-likeness. Results: The model converged stably. On the validation set, accuracy (ACC) reached 0.9953 and the area under the curve (AUC) reached approximately 0.993, an improvement in prediction accuracy of about 3% over previously reported machine-learning-based models. Conclusion: A drug-likeness prediction model built on a large data set and a deep neural network is practical to a certain degree and can improve the accuracy of predicting whether a lead compound will become a drug.
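Authors: PAN Lei (潘蕾); NI Bing-wei (倪冰苇); ZHAO Hong-ping (赵鸿萍) (School of Science, China Pharmaceutical University, Nanjing 211198, China)
Source: Chinese Journal of New Drugs (《中国新药杂志》), 2021, Issue 14, pp. 1309-1315 (7 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China, General Program (81973512); China Pharmaceutical University Teaching Reform Research Project, Key Project (3050050188).
Keywords: Stacked AutoEncoder; Fully Connected Neural Network; deep learning; SMILES; prediction of the probability of a lead compound becoming a drug (drug-likeness prediction)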
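
The Methods and Results in the abstract mention two steps that are easy to illustrate in isolation: encoding SMILES strings as fixed-length numeric vectors, and scoring predictions with ACC and AUC. The sketch below is only an illustration under stated assumptions. The abstract does not say which encoding scheme the authors used, so the character-level integer encoding, the helper names (`build_vocab`, `encode`, `accuracy`, `auc`), the padding index, and the 0.5 decision threshold are all hypothetical. The SAE and FCNN themselves are not sketched here.

```python
from typing import Dict, List, Sequence

PAD = 0  # assumption: index 0 is reserved for padding


def build_vocab(smiles_list: Sequence[str]) -> Dict[str, int]:
    """Map every character seen in the SMILES strings to an integer >= 1."""
    chars = sorted({ch for s in smiles_list for ch in s})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(smiles: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Integer-encode one SMILES string, truncating or right-padding to max_len.

    Simplification: tokens are single characters, so two-letter atoms such as
    "Cl" or "Br" are split. Characters missing from vocab raise KeyError.
    """
    ids = [vocab[ch] for ch in smiles[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded score matches the 0/1 label."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half. Quadratic time, fine for a demo."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    smiles = ["CCO", "c1ccccc1", "CC(=O)O"]  # ethanol, benzene, acetic acid
    vocab = build_vocab(smiles)
    print(encode("CCO", vocab, max_len=8))
    print(accuracy([1, 0, 1, 0], [0.9, 0.5, 0.4, 0.1]))
    print(auc([1, 0, 1, 0], [0.9, 0.5, 0.4, 0.1]))
```

In practice the SMILES strings would normally be canonicalized with a cheminformatics toolkit before encoding, and the resulting vectors would be the input to the autoencoder; both steps are outside this sketch.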