Abstract
Objective: To establish a deep neural network-based method for predicting drug-target affinity. Methods: Candidate drugs were first one-hot encoded, and target proteins were encoded with a pretrained language representation model (a bidirectional long short-term memory language model) to capture the important information in their amino acid sequences. Two independent convolutional neural networks were then designed, and drug-target affinity was predicted through four fully connected layers. Finally, the performance of the method was validated on the Davis kinase binding affinity dataset and the KIBA large-scale kinase inhibitor bioactivity dataset, and the experimental results were compared with those of KronRLS, SimBoost, and DeepDTA. Results: Compared with KronRLS, SimBoost, and DeepDTA, the proposed method achieved the highest concordance index and the lowest mean squared error on both the Davis and KIBA datasets. Conclusions: Encoding target proteins with a bidirectional language model before deep learning can improve the accuracy of drug-target affinity prediction.
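To make the drug encoding step and the two reported metrics concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes drugs are given as character strings over a fixed alphabet (for example SMILES, which the abstract does not state), and that the concordance index is computed over all pairs with different true affinities, counting tied predictions as half-concordant. The function names, the alphabet, and `max_len` are hypothetical.

```python
from itertools import combinations


def one_hot_encode(sequence, alphabet, max_len):
    """Encode a string as a max_len x len(alphabet) 0/1 matrix.

    Positions beyond the sequence, and symbols missing from the
    (assumed) alphabet, are left as all-zero rows.
    """
    index = {ch: i for i, ch in enumerate(alphabet)}
    matrix = [[0] * len(alphabet) for _ in range(max_len)]
    for pos, ch in enumerate(sequence[:max_len]):
        if ch in index:
            matrix[pos][index[ch]] = 1
    return matrix


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predictions are ordered correctly.

    A pair is comparable when its true affinities differ. Tied
    predictions count as 0.5 (an assumption of this sketch).
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # pairs with equal true affinity carry no ordering
        comparable += 1
        if (t_i - t_j) * (p_i - p_j) > 0:
            concordant += 1.0
        elif p_i == p_j:
            concordant += 0.5
    return concordant / comparable if comparable else 0.0
```

A higher concordance index means predicted affinities preserve the ranking of the measured ones more often, while a lower mean squared error means the predicted values are numerically closer, which is why the abstract reports both.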
Authors
李添添
王俊杰
LI Tian-tian; WANG Jun-jie (Editorial Department, Jiangsu Province Official Hospital, Nanjing 210024, Jiangsu Province, China; School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, Jiangsu Province, China)
Source
《中华医学图书情报杂志》
CAS
2022, Issue 3, pp. 34-39 (6 pages)
Chinese Journal of Medical Library and Information Science
Funding
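National Natural Science Foundation of China, Young Scientists Fund project "Research on Multi-Granularity, Multi-Task Compound-Protein Interaction Prediction Methods" (62102191).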