Abstract
A data set of 60 structurally diverse HCV replication (replicase) inhibitors, all substituted pyrimidines, was assembled; 46 compounds were randomly selected as the training set and the remaining 14 served as the validation (test) set. For each molecule, 646 physicochemical and structural descriptors were analyzed by linear regression using multiple linear regression (MLR) and principal component analysis (PCA), and an optimal model was built for each method. The MLR models obtained by stepwise and forward selection performed best, with R² = 0.827 for the training set and R² = 0.850 for the validation set. The resulting model directly reflects the main factors affecting the activity of the compounds and should help in screening and developing new, efficient HCV replication inhibitors.
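The forward-selection MLR workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the descriptor matrix is synthetic (40 random columns rather than the 646 real descriptors), the stopping rule (`min_gain`, `max_terms`) and all function names are assumptions made for illustration, and the PCA branch is not covered. Only the 46/14 random split and the use of R² on both subsets follow the abstract.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = float(np.sum((y_true - y_pred) ** 2))
    ss_tot = float(np.sum((y_true - np.mean(y_true)) ** 2))
    return 1.0 - ss_res / ss_tot

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]

def forward_select(X, y, max_terms=5, min_gain=0.01):
    """Greedy forward selection: repeatedly add the descriptor that raises
    training R^2 the most; stop when the gain falls below min_gain.
    (Hypothetical stopping rule, chosen for this sketch.)"""
    selected, best_r2 = [], 0.0
    while len(selected) < max_terms:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        scores = []
        for j in candidates:
            cols = selected + [j]
            coef = fit_mlr(X[:, cols], y)
            scores.append(r_squared(y, predict_mlr(coef, X[:, cols])))
        k = int(np.argmax(scores))
        if scores[k] - best_r2 < min_gain:
            break
        selected.append(candidates[k])
        best_r2 = scores[k]
    return selected

# Synthetic stand-in data: 60 "molecules", 40 random "descriptors".
rng = np.random.default_rng(0)
n_mol, n_desc = 60, 40
X = rng.normal(size=(n_mol, n_desc))
# Activity depends on two descriptors plus noise (an assumption of this demo).
y = 2.0 * X[:, 3] - 1.5 * X[:, 17] + rng.normal(scale=0.3, size=n_mol)

# Random 46/14 split, mirroring the training/validation sizes in the abstract.
idx = rng.permutation(n_mol)
train, valid = idx[:46], idx[46:]

selected = forward_select(X[train], y[train])
coef = fit_mlr(X[train][:, selected], y[train])
r2_train = r_squared(y[train], predict_mlr(coef, X[train][:, selected]))
r2_valid = r_squared(y[valid], predict_mlr(coef, X[valid][:, selected]))
print(selected, round(r2_train, 3), round(r2_valid, 3))
```

Because the selected descriptors enter the model with explicit coefficients, the fitted equation can be read directly, which is consistent with the abstract's remark that the MLR model reflects the main factors affecting activity.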
Source
《分子科学学报》
CAS
CSCD
PKU Core (Peking University Chinese Core Journals)
2014, No. 3, pp. 246-251 (6 pages)
Journal of Molecular Science
Funding
Supported by the National Natural Science Foundation of China (11201049)