Abstract
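Quantitative structure-activity relationship (QSAR) models were built for 53 compounds by multiple linear regression and support vector machine regression, using four structural descriptors: molecular weight, the lipid-water partition coefficient (logP), aqueous solubility (logS), and the number of hydrogen-bond donors (HD). The compounds were randomly divided into a training set of 41 compounds and a test set of 12 compounds. After the multiple linear regression and support vector machine regression models were obtained, they were evaluated statistically by the correlation coefficient r, the standard deviation s, the mean absolute error, and the root mean square error. The results indicate that both models predict the logarithm of the blood-brain partition coefficient (logBB) well, and that support vector machine regression, as a nonlinear method, has some advantage in predicting logBB.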
Blood-brain partitioning (logBB) is a key consideration in drug design, since a drug must permeate well in order to reach the brain. A data set of 53 compounds, taken from the literature, was used to train and test new models. Four descriptors were used to describe the structural characteristics of the drugs and other organic compounds. Two quantitative structure-activity relationship (QSAR) models were constructed, one using multiple linear regression and one using support vector machine regression. Statistical analyses of the models are reported, including the correlation coefficient r, standard deviation s, mean absolute error (MAE), and root mean square error (RMS). Both models showed good ability to predict logBB, with the support vector machine method performing better than multiple linear regression.
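The abstract names four evaluation statistics but does not define them. The sketch below shows one common way to compute them from observed and predicted logBB values. It is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the values are made-up toy numbers rather than data from the paper, and s is assumed to be the standard error of estimate with n - k - 1 degrees of freedom for k descriptors.

```python
import math


def correlation_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def mean_absolute_error(observed, predicted):
    """Average of the absolute residuals."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_square_error(observed, predicted):
    """Square root of the average squared residual."""
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sse / len(observed))


def standard_deviation_s(observed, predicted, n_descriptors):
    """Assumed definition: sqrt(SSE / (n - k - 1)) for a model with k descriptors."""
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sse / (len(observed) - n_descriptors - 1))


# Toy values for illustration only (NOT taken from the paper).
observed_logbb = [0.5, -0.2, 1.0, -1.1, 0.3, 0.0, -0.6, 0.8]
predicted_logbb = [0.4, -0.1, 0.8, -0.9, 0.5, 0.1, -0.7, 0.6]

r = correlation_r(observed_logbb, predicted_logbb)
mae = mean_absolute_error(observed_logbb, predicted_logbb)
rms = root_mean_square_error(observed_logbb, predicted_logbb)
s = standard_deviation_s(observed_logbb, predicted_logbb, n_descriptors=4)

print(f"r = {r:.3f}, s = {s:.3f}, MAE = {mae:.3f}, RMS = {rms:.3f}")
```

Computing the same four statistics on the training set and the test set for each model would allow the kind of side-by-side comparison of the two regression methods that the abstract describes.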
Source
《北京化工大学学报(自然科学版)》
EI
CAS
CSCD
PKU Core Journals
2008, Issue 3, pp. 65-69 (5 pages)
Journal of Beijing University of Chemical Technology(Natural Science Edition)
Funding
National Natural Science Foundation of China (20605003)
National "863" Program of China (2006AA02Z337)
Keywords
blood-brain partitioning (logBB)
quantitative structure-activity relationships
multiple linear regression
support vector machine