Abstract
Objective: To build a computational model for predicting the TD50 of chemical carcinogens. Methods: Following the quantitative structure-activity relationship (QSAR) approach, training and validation sets were constructed from the Carcinogenic Potency Database (CPDB). The molecular structures in the training set were parsed, and the bond adjacency matrix of each molecule served as the basis of the calculation: the properties of the atoms adjacent to each bond were converted by a formula into a bond component, which was entered into the bond adjacency matrix as the weight of that bond. The k-th powers of this matrix (0 ≤ k ≤ 15) were then computed, and from them the spectral moments (that is, the traces of the matrix powers). Finally, multiple regression analysis was used to fit a regression equation with the median carcinogenic dose (TD50) of the CPDB compounds as the dependent variable and the spectral moments as the independent variables, and the result was checked against the validation set. Results: For the regression equation fitted to the training set, the coefficient of determination r was 0.93524548 and the F value of the significance test was 33.73586. For the validation set, the observed values and the model predictions fell largely within the 95% confidence interval. Conclusion: The model obtained by this method agrees reasonably well with the CPDB data and offers a feasible approach to predicting the toxicity of compounds.
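The spectral-moment step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formula that turns atom properties into bond weights, so the `weights` argument below is a hypothetical placeholder supplied by the caller, and the function names and the bond-list input format are assumptions made for illustration. Two bonds are treated as adjacent when they share an atom, bond weights go on the diagonal, and the k-th spectral moment is the trace of the k-th matrix power.

```python
def bond_adjacency_matrix(bonds, weights=None):
    """Build a weighted bond adjacency matrix.

    bonds:   list of (atom_i, atom_j) index pairs, one per bond.
    weights: hypothetical per-bond weights placed on the diagonal
             (the abstract does not specify how they are computed).
    """
    n = len(bonds)
    if weights is None:
        weights = [0.0] * n
    matrix = [[0.0] * n for _ in range(n)]
    for a in range(n):
        matrix[a][a] = float(weights[a])
        for b in range(a + 1, n):
            # Two bonds are adjacent if they share at least one atom.
            if set(bonds[a]) & set(bonds[b]):
                matrix[a][b] = matrix[b][a] = 1.0
    return matrix


def matmul(x, y):
    """Multiply two square matrices given as lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moments(matrix, k_max=15):
    """Return [trace(matrix**k) for k = 0..k_max]."""
    n = len(matrix)
    # Start from the identity matrix, which is matrix**0.
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    moments = []
    for _ in range(k_max + 1):
        moments.append(sum(power[i][i] for i in range(n)))
        power = matmul(power, matrix)
    return moments


# Carbon skeletons only, unweighted: n-butane (chain) and isobutane (branched).
butane = [(0, 1), (1, 2), (2, 3)]
isobutane = [(0, 1), (0, 2), (0, 3)]
butane_moments = spectral_moments(bond_adjacency_matrix(butane), k_max=4)
isobutane_moments = spectral_moments(bond_adjacency_matrix(isobutane), k_max=4)
print(butane_moments)     # [3.0, 0.0, 4.0, 0.0, 8.0]
print(isobutane_moments)  # [3.0, 0.0, 6.0, 6.0, 18.0]
```

The two isomers have the same number of bonds but different moments from k = 2 onward, which is what lets the moments act as structure-sensitive descriptors. In the workflow the abstract describes, the moments for k = 0 to 15 of each training compound would then serve as the independent variables of the multiple regression against TD50.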
Source
Journal of Toxicology (《毒理学杂志》)
Indexed in: CAS, CSCD, Peking University Core Journals (北大核心)
2011, Issue 6, pp. 406-410 (5 pages)