Funding: National Natural Science Foundation of China (grant Nos. 29903006 and 29973023) and the Visiting Scholar Foundation of Key Laboratory in University of China.
Abstract: A quantitative structure-property relationship (QSPR) study was carried out to predict the surface tension of nonionic surfactants in aqueous solution. The regressed model includes a topological descriptor, the Kier & Hall index of zero order (KH0) of the hydrophobic segment of the surfactant, and a quantum chemical descriptor, the heat of formation (ΔHf) of the surfactant molecule. The resulting general QSPR between the surface tension and the two descriptors gives a coefficient of multiple determination of r^2 = 0.9877 for the 30 nonionic surfactants studied.
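A two-descriptor regression of the kind described above can be sketched as follows. This is only an illustration under stated assumptions: the descriptor values and the target are invented placeholders (not data from the paper), the function names `fit_mlr` and `r_squared` are hypothetical, and the fit uses plain ordinary least squares via the normal equations, which may differ from the authors' procedure.

```python
def fit_mlr(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    X is a list of descriptor rows; an intercept column is added here.
    Returns [intercept, coef_1, coef_2, ...].
    """
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def r_squared(X, y, coef):
    """Coefficient of multiple determination, 1 - SS_res / SS_tot."""
    pred = [coef[0] + sum(c * x for c, x in zip(coef[1:], r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical (KH0, heat of formation) pairs; NOT values from the study.
descriptors = [(4.0, -300.0), (5.5, -350.0), (6.0, -420.0),
               (7.2, -380.0), (8.1, -500.0), (9.0, -450.0)]
# Target generated from a known linear rule so the fit can be checked.
surface_tension = [50.0 - 2.0 * kh0 + 0.01 * hf for kh0, hf in descriptors]

coef = fit_mlr(descriptors, surface_tension)
r2 = r_squared(descriptors, surface_tension, coef)
print(coef, r2)
```

Because the placeholder target is exactly linear in the two descriptors, the fit should recover the generating coefficients; real data would give r^2 below 1.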
Abstract: Many structure-property/activity studies use graph-theoretical indices, which are based on the topological properties of a molecule viewed as a graph. Because topological indices can be derived directly from the molecular structure without any experimental effort, they provide a simple and straightforward method for property prediction. In this work the flash point of alkanes was modeled by a set of molecular connectivity indices (χ), modified molecular connectivity indices (^mχ^v) and valence molecular connectivity indices (^mχ^v), with ^mχ^v calculated using the hydrogen perturbation. A stepwise multiple linear regression (MLR) method was used to select the best indices. The predicted flash points are in good agreement with the experimental data, with an average absolute deviation of 4.3 K.
Funding: This work was financially supported by the National Basic Research Program of China (2003CB415002), the China Postdoctoral Science Foundation (No. 2003033486) and the Natural Science Research Fund of University in Jiangsu (04KJB150149).
Abstract: Twenty-eight alkyl(1-phenylsulfonyl) cycloalkane carboxylates were computed at the B3LYP/6-31G* level. Based on linear solvation energy theory, two quantitative correlation equations relating the molecular structures of these compounds to their chromatographic retention (capacity factor lgKW) and their toxicity to Photobacterium phosphoreum (–lgEC50) were developed using the molecular structural parameters as theoretical descriptors (r^2 = 0.9501 and 0.9488). The two equations were then cross-validated by the leave-one-out (LOO) method, giving q^2 of 0.9113 and 0.9281, respectively. The results showed that both equations obtained at the B3LYP/6-31G* level are superior to those from AM1 and can be used to predict the lgKW and –lgEC50 of congeneric organics.
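For readers unfamiliar with the leave-one-out statistic quoted above, the sketch below computes q^2 = 1 - PRESS / SS_tot for a one-descriptor linear model. It is a simplified illustration: the abstract's models use several descriptors, the data here are invented placeholders, and the names `fit_line` and `loo_q2` are hypothetical.

```python
def fit_line(x, y):
    """Closed-form least-squares fit of y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def loo_q2(x, y):
    """Leave-one-out cross-validated q^2 = 1 - PRESS / SS_tot.

    Each compound is predicted by a model fitted to all the others;
    PRESS is the sum of the squared prediction errors.
    """
    press = 0.0
    for i in range(len(x)):
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (b0 + b1 * x[i])) ** 2
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot


# Hypothetical descriptor/property pairs; NOT data from the study.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_demo = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
q2_demo = loo_q2(x_demo, y_demo)
print(q2_demo)
```

Since every prediction comes from a model that never saw the held-out compound, q^2 is normally lower than the fitted r^2, which matches the pattern of the values reported in the abstract.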
Funding: Supported by the Natural Science Foundation of Fujian Province (D0710019) and the Natural Science Foundation of Overseas Chinese Affairs Office of the State Council (06QZR09).
Abstract: Based on quantum chemical descriptors, quantitative structure-property relationship (QSPR) models were developed to estimate and predict the photodegradation rate constant (logK) of polycyclic aromatic hydrocarbons (PAHs) using a linear method (multiple linear regression, MLR) and a nonlinear method (back-propagation artificial neural network, BP-ANN). A BP-ANN with a 3-3-1 architecture was built from the three quantum chemical descriptors appearing in the MLR model. The standard heat of formation (HOF), the gap of the frontier molecular orbital energies (ΔELH) and the total energy (TE) were the inputs, and logK was the output. The leave-one-out (LOO) cross-validated correlation coefficients (R^2_CV) of the MLR and BP-ANN models were 0.6383 and 0.7843, respectively. The nonlinear BP-ANN model has better predictive ability than the linear MLR model, with root mean square errors (RMSE) of 0.1071 and 0.1514 and squared correlation coefficients (R^2) of 0.9791 and 0.9897 for the training and validation sets, respectively. In addition, some insights into the molecular structural features affecting the photodegradation of PAHs are discussed.
Funding: State Key Laboratory of Chemo/Biosensing and Chemometrics Foundation (No. 05-12-1).
Abstract: Based on two-dimensional topological structures, a novel molecular electronegativity interaction vector with hybridization (MEHIV) was developed to describe the atomic hybridization state in different molecular environments. Five quantitative models based on MEHIV characterization and multiple linear regression were established to predict the reduced ion mobility constants (K0) of alkanes, aromatic hydrocarbons, fatty alcohols, fatty aldehydes and ketones, and carboxylic esters. The correlation coefficients R_CV from leave-one-out cross-validation are 0.792, 0.787, 0.949, 0.972 and 0.981, respectively, and the standard deviations SD_CV are 0.067, 0.086, 0.064, 0.043 and 0.042, respectively. These results suggest that MEHIV is an excellent topological index descriptor with advantages such as straightforward physicochemical meaning, high characterization competence, convenient expansibility and easy manipulation.
Funding: The authors are grateful for the financial support of the National Natural Science Foundation of China (Grant Nos. 22078041 and 21808025) and the Fundamental Research Funds for the Central Universities (Grant No. DUT20JC41).
Abstract: The chemical industry is always seeking opportunities to convert raw materials efficiently and economically into commodity chemicals and higher value-added chemical-based products. The life cycle of a chemical product involves conceptual product design, experimental investigation, sustainable manufacturing through appropriate chemical processes, and waste disposal. Throughout these stages, one of the most important elements is a molecular property prediction model that associates molecular structures with product properties. In this paper, a framework combining quantum mechanics and quantitative structure-property relationships is established for fast prediction of molecular properties such as the activity coefficient. The workflow of the framework consists of three steps. In the first step, a database is created to collect basic molecular information. In the second step, quantum mechanics-based calculations are performed to predict quantum mechanics-based or derived molecular properties (pseudo-experimental data), which are stored in a database. In the third step, these data are used to develop quantitative structure-property relationship methods for fast property prediction. The whole framework has been implemented within a molecular property prediction toolbox. Two case studies highlighting different aspects of the toolbox, involving the prediction of heats of reaction and solid-liquid phase equilibria, are presented.
Abstract: In this paper the photolysis half-lives of model dyes in aqueous solution under ultraviolet (UV) radiation were determined using a continuous-flow spectrophotometric method. A quantitative structure-property relationship (QSPR) study was carried out using 21 descriptors and different chemometric tools, including stepwise multiple linear regression (MLR) and partial least squares (PLS), to predict the photolysis half-life (t1/2) of the dyes. To select the test set compounds, a K-means clustering technique was used to classify the entire data set, so that all clusters were properly represented in both the training and test sets. The QSPR results show that in the MLR-derived model the photolysis half-lives of the dyes depended strongly on the energy of the highest occupied molecular orbital (E_HOMO), the largest electron density of an atom in the molecule (ED^+) and the lipophilicity (logP). In the PLS-derived model, besides the E_HOMO and ED^+ descriptors, the molecular surface area (Sm), molecular weight (MW), electronegativity (χ), energy of the second highest occupied molecular orbital (E_HOMO-1) and dipole moment (μ) had dominant effects on the log t1/2 values of the dyes. These models were applicable to all classes of dyes studied (including monoazo, disazo, oxazine, sulfonephthaleins and derivatives of fluorescein). The results were also assessed for their consistency with findings from other similar studies.
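The cluster-based test set selection mentioned above can be illustrated with a small sketch. It is not the authors' procedure: the descriptor points are invented, the plain Lloyd's-algorithm `kmeans` with first-k initialization and the "every third member" rule in `cluster_split` are assumptions chosen only to show how each cluster ends up represented in both sets.

```python
def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; centroids start at the first k points
    (an arbitrary, deterministic choice made for this sketch)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean)
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


def cluster_split(labels, test_every=3):
    """Send every `test_every`-th member of each cluster to the test set."""
    train, test, seen = [], [], {}
    for idx, lab in enumerate(labels):
        seen[lab] = seen.get(lab, 0) + 1
        (test if seen[lab] % test_every == 0 else train).append(idx)
    return train, test


# Hypothetical two-descriptor points forming two groups; NOT study data.
points = [(0.0, 0.1), (5.0, 5.2), (0.2, 0.0), (5.1, 4.9),
          (0.1, 0.3), (4.8, 5.0), (0.3, 0.2), (5.2, 5.1),
          (0.0, 0.4), (4.9, 4.7), (0.2, 0.2), (5.0, 5.0)]
labels = kmeans(points, k=2)
train_idx, test_idx = cluster_split(labels)
print(train_idx, test_idx)
```

Splitting within clusters rather than at random keeps structurally distinct dye classes from being absent from either set, which is the stated purpose of the clustering step.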
Abstract: To predict the critical micelle concentration (cmc) of nonionic surfactants in aqueous solution, a quantitative structure-property relationship (QSPR) was established for 77 nonionic surfactants belonging to eight series. The best regressed model contained four quantum-chemical descriptors, the heat of formation (ΔH), the molecular dipole moment (D), the energy of the lowest unoccupied molecular orbital (E_LUMO) and the energy of the highest occupied molecular orbital (E_HOMO) of the surfactant molecule; two constitutional descriptors, the molecular weight of the surfactant (M) and the number of oxygen and nitrogen atoms (n_ON) in the hydrophilic fragment of the surfactant molecule; and one topological descriptor, the Kier & Hall index of zero order (KH0) of the hydrophobic fragment of the surfactant. The resulting general QSPR between lg(cmc) and the descriptors gave a coefficient of multiple determination of R^2 = 0.986. When cross terms were considered, the corresponding best model contained five descriptors, E_LUMO, D, KH0, M and the cross term n_ON·KH0, and gave the same coefficient as the seven-parameter model.
Funding: This work was supported by the Natural Science Foundation of CQ CSTC (No. 2006BB5177).
Abstract: Considering the atomic property vector and the atomic correlative function, the 3-dimensional structural vector of atomic property correlation (3D-VAPC), a novel descriptor, is defined to characterize a 3-dimensional molecular structure by introducing a self-adaptability regulation mechanism and the idea of customer orientation. With the structures of 25 bisphenol A compounds characterized by this vector, the QSAR models of three kinds of estrogen activities (ER affinities, gene induction and cell proliferation) built by support vector machine (SVM), which is suited to nonlinear problems, have high multiple correlation coefficients (Rcum^2 = 0.933, 0.813, 0.959) and cross-validation coefficients (Qcum^2 = 0.847, 0.953, 0.798). These results show that the models successfully express the correlation between structure and the three kinds of estrogen activities. Therefore, 3D-VAPC reflects the molecular structural information well, and the SVM method correctly describes the correlation between this information and the properties of the compounds.
Funding: Supported by the National Natural Science Foundation of China (10901169), the National 111 Programme of Introducing Talents of Discipline to Universities (0507111106), the Innovation Ability Training Foundation of Chongqing University (CDCX008), and the Innovative Group Program for Graduates of Chongqing University, Science Innovation Fund (200711C1A0010260).
Abstract: An integrated approach is proposed to predict the chromatographic retention time of oligonucleotides based on quantitative structure-retention relationship (QSRR) models. First, the primary base sequences of the oligonucleotides are translated into vectors based on scores of generalized base properties (SGBP), which cover physicochemical, quantum chemical, topological and spatial structural properties, among others; the sequence data are then transformed into a uniform matrix by auto cross covariance (ACC). ACC accounts for the interactions between bases a certain distance apart in an oligonucleotide sequence, so the method takes the neighboring effect into account. Next, a genetic algorithm is used to select the variables related to the chromatographic retention behavior of the oligonucleotides. Finally, a support vector machine is used to develop QSRR models that predict chromatographic retention behavior. The whole data set was divided into pairs of training and test sets in different proportions; the QSRR models built from more than 26 training samples were found to have adequate external predictive power and to represent accurately the relationship between sequence and structural features and retention times. The results indicate that the SGBP-ACC approach is a useful structural representation method for QSRR of oligonucleotides because of advantages such as plentiful structural information, easy manipulation and high characterization competence. Moreover, the method can be applied further to predict the chromatographic retention behavior of oligonucleotides.
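The role of ACC in turning sequences of different lengths into vectors of equal length can be sketched as below. This uses one commonly quoted form of the transform, the average over positions of the product of score j at position i and score k at position i + lag; whether the paper uses exactly this normalization is an assumption. The per-base scores in `BASE_SCORES` are invented placeholders and are not the SGBP values.

```python
# Placeholder two-component scores per base; NOT the SGBP scores.
BASE_SCORES = {
    "A": (0.5, -1.0),
    "C": (-0.5, 0.3),
    "G": (1.0, 0.8),
    "T": (-1.0, -0.1),
}


def auto_cross_covariance(sequence, max_lag):
    """Return d * d * max_lag ACC terms for a base sequence.

    For each lag and each ordered pair of score components (j, k),
    average score_j(i) * score_k(i + lag) over all valid positions i.
    The output length depends only on d and max_lag, not on the
    sequence length.
    """
    scores = [BASE_SCORES[base] for base in sequence]
    n, d = len(scores), len(scores[0])
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    terms = []
    for lag in range(1, max_lag + 1):
        for j in range(d):
            for k in range(d):
                total = sum(scores[i][j] * scores[i + lag][k]
                            for i in range(n - lag))
                terms.append(total / (n - lag))
    return terms


short_vec = auto_cross_covariance("ACGT", max_lag=2)
long_vec = auto_cross_covariance("ACGTTGCA", max_lag=2)
print(len(short_vec), len(long_vec))
```

Both the 4-mer and the 8-mer map to vectors with the same number of terms, which is what allows a single variable-selection and regression step to be applied across oligonucleotides of different lengths.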
Funding: Supported by the Research Program of Application Foundation and Advanced Technology from the Tianjin Municipal Science and Technology Committee (No. 11JCZDJ24600) and the Natural Science Foundation of China (No. 20773093).
Abstract: Amidoximated polyacrylonitrile (PAN) fiber Fe complexes were prepared and used as heterogeneous Fenton catalysts for the degradation of 28 anionic water-soluble azo dyes in water under visible irradiation. The multiple linear regression (MLR) method was employed to develop quantitative structure-property relationship (QSPR) model equations for the decoloration and mineralization of the azo dyes. The predictive ability of the QSPR model equations was assessed using leave-one-out (LOO) and cross-validation (CV) methods. The effects of the Fe content of the catalyst and of sodium chloride in the water on the QSPR model equations were also investigated. The results indicated that heterogeneous photo-Fenton degradation of azo dyes with different structures took place in the presence of the amidoximated PAN fiber Fe complex. The QSPR model equations for dye decoloration and mineralization were successfully developed using the MLR technique. MW/S (molecular weight divided by the number of sulphonate groups) and N_N=N (the number of azo linkages) are considered the most important determining factors for dye degradation and mineralization, and there is a significant negative correlation between MW/S or N_N=N and the degradation percentage or total organic carbon (TOC) removal. LOO and CV analysis suggested that the QSPR model equations obtained have good predictive ability. Variation in the Fe content of the catalyst and the addition of sodium chloride did not alter the nature of the QSPR model equations.