The relative toxicity of 48 anilines was studied using literature values of the Tetrahymena pyriformis population growth endpoint IGC50 (the concentration causing 50% growth inhibition). The data set was first randomly split into a training set (31 chemicals), used to establish the QSAR model, and a test set (17 chemicals) for statistical external validation. A biparametric model was developed using, as independent variables, 3D theoretical descriptors derived from the DRAGON software. The GA-MLR procedure was performed on the training set with the MOBYDIGS software, using ordinary least squares (OLS) regression and genetic algorithm variable subset selection (GA-VSS), maximising the leave-one-out cross-validated explained variance (Q^2_LOO). The resulting model was examined for robustness (Q^2_LOO cross-validation, Y-scrambling) and for predictive ability through both internal (Q^2_LMO, bootstrap) and external (Q^2_ext) validation. The descriptors included in the QSAR model indicated that the log(IGC50^-1) value is related to molecular size and shape, and to the interaction of the molecule with its surrounding medium or its target. The applicability domain of the model was also discussed.
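The validation statistics named in the abstract can be illustrated with a minimal sketch. It assumes a two-descriptor OLS model, a 31/17 random split, and one common definition of Q^2_ext (test-set residuals compared with deviations from the training-set mean). The data are synthetic, the two descriptors are hypothetical stand-ins for the DRAGON descriptors, and the genetic algorithm selection step is not reproduced, so this is neither the authors' data nor the MOBYDIGS implementation.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + b2*x2 by OLS via the normal equations."""
    rows = [[1.0, x1, x2] for x1, x2 in X]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve(A, b)


def predict(coef, x):
    return coef[0] + coef[1] * x[0] + coef[2] * x[1]


def q2_loo(X, y):
    """Leave-one-out Q^2: 1 - PRESS / total sum of squares."""
    y_mean = sum(y) / len(y)
    press = 0.0
    for i in range(len(y)):
        # Refit without compound i, then predict compound i.
        coef = fit_ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    return 1.0 - press / sum((yi - y_mean) ** 2 for yi in y)


def q2_ext(X_tr, y_tr, X_te, y_te):
    """External Q^2 (assumed definition: reference is the training mean)."""
    coef = fit_ols(X_tr, y_tr)
    mean_tr = sum(y_tr) / len(y_tr)
    press = sum((yi - predict(coef, xi)) ** 2 for xi, yi in zip(X_te, y_te))
    return 1.0 - press / sum((yi - mean_tr) ** 2 for yi in y_te)


def demo(seed=0):
    """Synthetic illustration only: 48 'compounds', 2 made-up descriptors."""
    rng = random.Random(seed)
    X = [(rng.uniform(0, 3), rng.uniform(0, 3)) for _ in range(48)]
    y = [1.0 + 0.8 * x1 - 0.5 * x2 + rng.gauss(0, 0.1) for x1, x2 in X]
    idx = list(range(48))
    rng.shuffle(idx)  # random split into 31 training and 17 test items
    X_tr, y_tr = [X[i] for i in idx[:31]], [y[i] for i in idx[:31]]
    X_te, y_te = [X[i] for i in idx[31:]], [y[i] for i in idx[31:]]
    y_scr = y_tr[:]
    rng.shuffle(y_scr)  # Y-scrambling: break the descriptor-response link
    return {
        "q2_loo": q2_loo(X_tr, y_tr),
        "q2_ext": q2_ext(X_tr, y_tr, X_te, y_te),
        "q2_loo_scrambled": q2_loo(X_tr, y_scr),
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.3f}")
```

In a sketch like this, a robust model shows high Q^2_LOO and Q^2_ext on the real responses, while the Y-scrambled Q^2_LOO should collapse towards zero or below, which is the behaviour the Y-scrambling check looks for.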