Machine-Learning Model for Predicting the Rate Constant of Protein-Ligand Dissociation

Cited by: 4
Abstract An increasing number of recent studies have shown that the binding kinetics of a drug molecule to its target correlates strongly with its efficacy in vivo. Ligand optimization aimed at improved binding kinetics therefore offers a new direction for rational drug design. Currently, ligand binding kinetics is modeled mainly through extensive molecular dynamics simulations, which limits its application to real-world problems. The present study aimed to obtain a general-purpose quantitative structure-kinetics relationship (QSKR) model for predicting the dissociation rate constant (koff) of a ligand from the structure of its complex. This type of model is expected to be suitable for high-throughput tasks in structure-based drug design. We collected experimentally measured koff values for 406 ligand molecules from the literature and constructed a three-dimensional structural model of each protein-ligand complex through molecular modeling. A training set was compiled from 60% of those complexes, and the remaining 40% were assigned to two test sets. Based on distance-dependent protein-ligand atom pair descriptors, a random forest algorithm was used to derive the QSKR model. Various random forest models were generated from descriptor sets obtained under different conditions, such as distance cutoff, bin width, and feature selection criteria, and their cross-validation results were examined. The optimal model was obtained when the distance cutoff was 15 Å (1 Å = 0.1 nm), the bin width was 3 Å, and the feature selection variance level was 2. The final QSKR model produced correlation coefficients of around 0.62 on the two independent test sets. This level of accuracy is at least comparable to that of the predictive models described in the literature, which are typically much more computationally expensive. Our study attempts to address the problem of predicting koff values in drug design, and we hope that it can inform further studies by other researchers.
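The descriptor step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each atom is given as a hypothetical `(element, (x, y, z))` tuple, that atom types are plain element symbols, and that "feature selection variance level" means a lower bound on a feature's variance across complexes. The function names `atom_pair_descriptor` and `select_by_variance` are invented for illustration. Only the 15 Å cutoff and the 3 Å bin width come from the abstract.

```python
import math
from collections import Counter

CUTOFF = 15.0     # distance cutoff in angstroms (value reported in the abstract)
BIN_WIDTH = 3.0   # bin width in angstroms (value reported in the abstract)


def atom_pair_descriptor(protein_atoms, ligand_atoms,
                         cutoff=CUTOFF, bin_width=BIN_WIDTH):
    """Count protein-ligand atom pairs per (protein type, ligand type, bin).

    Atoms are hypothetical (element, (x, y, z)) tuples; pairs at or beyond
    the cutoff are ignored.
    """
    counts = Counter()
    for p_type, p_xyz in protein_atoms:
        for l_type, l_xyz in ligand_atoms:
            d = math.dist(p_xyz, l_xyz)
            if d < cutoff:
                # Bin 0 covers [0, 3) angstroms, bin 1 covers [3, 6), and so on.
                counts[(p_type, l_type, int(d // bin_width))] += 1
    return counts


def select_by_variance(rows, min_variance):
    """Return features whose population variance across rows exceeds min_variance.

    This interpretation of the "variance level" is an assumption.
    """
    keys = sorted({key for row in rows for key in row})
    kept = []
    for key in keys:
        values = [row.get(key, 0) for row in rows]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        if variance > min_variance:
            kept.append(key)
    return kept


# Toy complex: two protein atoms and two ligand atoms on the x axis.
protein = [("C", (0.0, 0.0, 0.0)), ("N", (4.0, 0.0, 0.0))]
ligand = [("O", (2.0, 0.0, 0.0)), ("C", (20.0, 0.0, 0.0))]
desc = atom_pair_descriptor(protein, ligand)
print(dict(desc))
```

The counts for each complex would form one row of the feature matrix. The retained columns could then be passed to a random forest regressor from a library of choice and fitted against the measured koff values.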
Authors Minyi Su (苏敏仪); Huisi Liu (刘慧思); Haixia Lin (林海霞); Renxiao Wang (王任小). Affiliations: State Key Laboratory of Bioorganic and Natural Products Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, P.R. China; University of Chinese Academy of Sciences, Beijing 100049, P.R. China; Department of Chemistry, College of Sciences, Shanghai University, Shanghai 200444, P.R. China.
Source Acta Physico-Chimica Sinica (《物理化学学报》; indexed in SCIE, CAS, CSCD, and the Peking University Core Journals list), 2020, Issue 1, pp. 179-187 (9 pages)
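Funding Supported by the Key Research and Development Program of the Ministry of Science and Technology of China (2016YFA0502302), the National Natural Science Foundation of China (81725022, 81430083, 21661162003, 21673276, 21472227, 21472226), and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB20000000).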
Keywords Dissociation rate constant; Ligand binding kinetics; Random forest model; Protein-ligand interaction; Structure-based drug design