Abstract
Two quantitative models were developed for predicting the acid dissociation constants (pKa) of 180 aromatic (phenyl-containing) carboxylic acids. The compounds have molecular weights between 122.12 and 288.34 and contain the elements H, C, N, O, S, F, Cl, Br, and I. Each compound was represented by 236 molecular descriptors calculated with the Cerius2 program, from which 12 descriptors were selected by statistical methods. The pKa values were then predicted by multilinear regression (MLR) analysis and by support vector machine (SVM) regression, each combined with 10-fold cross-validation. The MLR model gave a correlation coefficient of 0.90 and a standard deviation of 0.32 pKa units; the SVM model performed slightly better, with a correlation coefficient of 0.91 and a standard deviation of 0.31 pKa units.
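The evaluation procedure described in the abstract (fit a multilinear regression on 12 descriptors, predict each compound in 10-fold cross-validation, then report a correlation coefficient and a standard deviation) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the descriptor matrix and pKa values are synthetic random numbers, the helper names (`fit_mlr`, `cross_val_predict`, `pearson_r`, `rms_error`) are hypothetical, and the "standard deviation" is assumed here to be the root-mean-square prediction error, since the abstract does not define it. The SVM model is not reproduced.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_mlr(X, y):
    """Least-squares fit with intercept, via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict_mlr(coef, X):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]


def cross_val_predict(X, y, k=10, seed=0):
    """Predict every sample from a model trained on the other k-1 folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(y)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        coef = fit_mlr([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(fold, predict_mlr(coef, [X[i] for i in fold])):
            preds[i] = p
    return preds


def pearson_r(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def rms_error(y, preds):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y, preds)) / len(y))


# Synthetic stand-in data (assumption): 180 "compounds", 12 "descriptors".
rng = random.Random(42)
n_samples, n_desc = 180, 12
true_coef = [rng.uniform(-0.5, 0.5) for _ in range(n_desc)]
X = [[rng.gauss(0, 1) for _ in range(n_desc)] for _ in range(n_samples)]
y = [4.0 + sum(c * v for c, v in zip(true_coef, r)) + rng.gauss(0, 0.3) for r in X]

preds = cross_val_predict(X, y, k=10)
r_value = pearson_r(y, preds)
sd_value = rms_error(y, preds)
print(f"10-fold CV: r = {r_value:.2f}, RMS error = {sd_value:.2f} pKa units")
```

Because every prediction comes from a model that never saw that compound during fitting, the two statistics estimate predictive rather than fitted accuracy. An SVM regressor could be evaluated with the same `cross_val_predict` loop by swapping out the fit and predict steps.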
Source
Computers and Applied Chemistry (《计算机与应用化学》)
CAS
CSCD
PKU Core Journal (北大核心)
2009, Issue 12, pp. 1559-1562 (4 pages)
Funding
Supported by the National Natural Science Foundation of China (20605003), the National High Tech Project (2006AA02Z337), the SRF for ROCS, and the "Special Funding for the Talent Enrollment" of Beijing University of Chemical Technology.
Keywords
pKa values (acid dissociation constants), aromatic carboxylic acids, multilinear regression, support vector machine, quantitative structure-activity relationships (QSAR)