Abstract
For systems with a very large pool of candidate variables, some of them overlapping, building a structure-activity model with a backpropagation artificial neural network whose input variables are selected by a genetic algorithm (GA-BPANN) takes a great deal of training time, because BPANN converges slowly and is easily trapped in local minima. This paper therefore proposes first screening the variable pool with extended stepwise linear regression (FSR) and then using GA-BPANN to analyse the nonlinear relationships within the screened variable set. The structures of 120 collected artemisinin derivatives were optimized with the density functional theory method at the B3PW91/6-31G level, and, based on 13 atom types, 57 nonzero molecular holographic electronegativity distance vector (MHEDV) structural parameters were calculated. GA-BPANN models with five different numbers of hidden-layer neurons all gave better results on the training and test sets than direct FSR. The best model, an n-4-1 BPANN (four hidden-layer neurons), gave R^2 = 0.900, S = 0.493, F = 787.936, R_ex^2 = 0.840, S_ex = 0.730, F_ex = 147.341. These results indicate that GA-BPANN built on the variable set screened by extended FSR is an effective and economical way to construct QSAR models for predicting the activity of lead compounds.
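The genetic variable-selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ga_select` and `toy_fitness`, the use of elitism, truncation selection, one-point crossover, and all parameter values are assumptions made for illustration. In a GA-BPANN setting the fitness function would train a small network on the masked descriptors and return a fit statistic; here a toy score stands in for it.

```python
import random


def ga_select(n_vars, fitness, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Search for a 0/1 mask over n_vars candidate descriptors (n_vars >= 2).

    fitness(mask) must be deterministic and return a score to maximise.
    Returns the best mask found and the best score seen in each generation.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        next_pop = [ranked[0][:]]                  # elitism: keep the best mask
        parents = ranked[: max(2, pop_size // 2)]  # truncation selection
        while len(next_pop) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_vars - 1)       # one-point crossover
            child = a[:cut] + b[cut:]
            # flip each bit independently with probability p_mut
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


# Hypothetical stand-in for a network score: descriptors 0 and 3 are useful,
# every other selected descriptor costs one point.
def toy_fitness(mask):
    return 2 * (mask[0] + mask[3]) - sum(mask)


best, history = ga_select(n_vars=8, fitness=toy_fitness)
print(best, history[-1])
```

Because the best mask is always carried into the next generation, the recorded best score never decreases. Restricting `n_vars` to the descriptors that survive the FSR pre-screening is what keeps the number of expensive fitness evaluations manageable.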
Source
《计算机与应用化学》
CAS
CSCD
Peking University Core Journals (北大核心)
2010, No. 9, pp. 1257-1262 (6 pages)
Computers and Applied Chemistry
Funding
Supported by the Youth Fund of the Education Department of Sichuan Province (09ZB038)
Keywords
artemisinin
MHEDV
genetic algorithm
backpropagation neural network
variable selection