In the present study, a quantitative structure-retention relationship (QSRR) study was carried out for volatile components from Rosa banksiae Ait. based on various quantum-chemical and physicochemical descriptors derived by the B3LYP method. To build the QSRR models, a stepwise multiple linear regression (MLR) method was used. The generated models have good predictive ability and high statistical significance, with good correlation coefficients (R^2≥0.734) and p values far less than 0.05. Preliminary results indicated that the models will be helpful, especially for predicting the GC retention time and linear retention index of volatile components from Rosa banksiae Ait. The models also help identify the quantum-chemical and physicochemical descriptors responsible for the retention time and linear retention index. The shape attribute (ShpA) and the logP value were found to play a vital role in determining a component's GC retention time and linear retention index, both of which increase with the lipophilicity of the volatile components. The larger the shape attribute of the analyte, the larger its deformability and the stronger the interaction between analyte and stationary phase, and hence the longer the GC retention time and the larger the linear retention index. E_HOMO, q+, and SEV also appear in the models, but they are not dominant.
Polychlorinated dibenzothiophenes (PCDTs) are classified as persistent organic pollutants in the environment, so the analysis of PCDTs by their gas chromatographic behavior is of great significance. Quantitative structure-retention relationship (QSRR) analysis is a useful technique capable of relating chromatographic retention time to molecular structure. In this paper, a QSRR study of 37 PCDTs was carried out using molecular electronegativity distance vector (MEDV) descriptors with multiple linear regression (MLR) and partial least-squares regression (PLS) methods. The correlation coefficients for model fitting (R), leave-one-out (LOO) cross-validation (CV), and external validation (Q^2_ext) were 0.9951, 0.9942, and 0.9839 for the MLR model and 0.9925, 0.9915, and 0.9833 for the PLS model, respectively. The results showed that the models exhibited excellent estimation capability for the internal sample set and good predictive capability for the external sample set. By using MEDV descriptors, the QSRR model provides a simple and rapid way to predict the gas-chromatographic retention indices of polychlorinated dibenzothiophenes when standard samples are lacking or experimental conditions are poor.
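The leave-one-out validation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a single descriptor and the common convention Q^2 = 1 - PRESS/SS_total; the abstract does not state which formula the authors used, and the descriptor and retention values below are made up for illustration, not PCDT data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_q2(xs, ys):
    """Leave-one-out cross-validated Q^2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without sample i, then predict sample i.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Hypothetical descriptor values and retention indices (illustration only).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
retention = [10.1, 12.0, 13.9, 16.2, 18.0, 19.9]
print("LOO Q^2:", loo_q2(descriptor, retention))
```

A Q^2 close to the fitted R^2 suggests the model is not overly dependent on any single sample; a multi-descriptor MLR or PLS model would follow the same refit-and-predict loop.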
An integrated approach is proposed to predict the chromatographic retention time of oligonucleotides based on quantitative structure-retention relationship (QSRR) models. First, the primary base sequences of oligonucleotides are translated into vectors based on scores of generalized base properties (SGBP), which cover physicochemical, quantum chemical, topological, and spatial structural properties, among others; thereafter, the sequence data are transformed into a uniform matrix by auto cross covariance (ACC). ACC accounts for the interactions between bases a certain distance apart in an oligonucleotide sequence; hence, this method adequately takes the neighboring effect into account. Then, a genetic algorithm is used to select the variables related to the chromatographic retention behavior of oligonucleotides. Finally, a support vector machine is used to develop QSRR models that predict chromatographic retention behavior. The whole dataset is divided into pairs of training and test sets in different proportions; it is found that QSRR models built with more than 26 training samples have adequate external predictive power and can accurately represent the relationship between sequence and structural features and the retention times. The results indicate that the SGBP-ACC approach is a useful structural representation method for QSRR of oligonucleotides owing to its plentiful structural information, easy manipulation, and high characterization competence. Moreover, the method can be further applied to predict the chromatographic retention behavior of oligonucleotides.
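The ACC step described above, which turns sequences of different lengths into vectors of one fixed length, can be sketched as follows. This is a minimal illustration of the widely used auto cross covariance formula ACC(j, k, lag) = sum_i z[i][j] * z[i+lag][k] / (n - lag); the per-base scores in `HYPOTHETICAL_SCORES` are invented placeholders, not the SGBP values, and any centering or scaling the authors may apply is not reproduced.

```python
def acc_transform(seq_scores, max_lag):
    """Auto cross covariance of a sequence of descriptor vectors.

    seq_scores: list of n tuples, one tuple of d descriptor scores per base.
    Returns a flat list of d * d * max_lag features, independent of n.
    """
    n = len(seq_scores)
    d = len(seq_scores[0])
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(d):
            for k in range(d):
                # Covariance-like term between score j at position i
                # and score k at position i + lag.
                total = sum(seq_scores[i][j] * seq_scores[i + lag][k]
                            for i in range(n - lag))
                features.append(total / (n - lag))
    return features


# Placeholder two-component scores per base (hypothetical, not SGBP).
HYPOTHETICAL_SCORES = {
    "A": (0.5, -1.0),
    "C": (-0.3, 0.8),
    "G": (1.2, 0.1),
    "T": (-1.4, 0.1),
}


def encode(sequence, max_lag=3):
    """Translate a base sequence into a fixed-length ACC feature vector."""
    return acc_transform([HYPOTHETICAL_SCORES[b] for b in sequence], max_lag)


print(len(encode("ACGTAC")), len(encode("ACGTACGTGA")))
```

Because the output length depends only on the number of scores per base and the maximum lag, oligonucleotides of different lengths can share one feature matrix for variable selection and model fitting.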
A newly developed descriptor, the three-dimensional holographic vector of atomic interaction field (3D-HoVAIF), was used to describe the chemical structures of purine bases. After variable screening by the stepwise multiple regression (SMR) technique, a partial least squares (PLS) regression model was built with 3D-HoVAIF. The model compared satisfactorily with the reference: the cumulative correlation coefficient of modeling (R^2_cum), the cumulative cross-validation coefficient (Q^2_cum), and the standard deviation of estimation (SD) were 0.966, 0.860, and 0.112, respectively, showing that the model had favorable estimation and prediction capabilities. The results illustrate that information related to the retention data of purine bases can be well expressed by 3D-HoVAIF, which has definite physicochemical meaning and allows easy structural interpretation. 3D-HoVAIF is therefore a useful structural expression technique for quantitative structure-activity (or property or retention) relationship (QSAR/QSPR/QSRR) studies, such as structural characterization and chromatographic retention prediction.
Treating an identical group as a pseudo atom instead of a typical atom, a novel modified molecular distance-edge (MDE) vector μ was developed in our laboratory to characterize the chemical structure of polychlorinated dibenzofuran (PCDF) congeners and/or isomers. Quantitative structure-retention relationships (QSRRs) between the new VMDE parameters and the gas chromatographic (GC) retention behavior of PCDFs were then generated by the multiple linear regression (MLR) method for non-polar, moderately polar, and polar stationary phases. Four excellent models with high correlation coefficients, R=0.984-0.995, were proposed for non-polar columns (DB-5, SE-54, OV-101). For the moderately polar column (OV-1701), the correlation coefficient of the developed model is somewhat lower, at 0.958. For the polar column (SP-2300), the QSRR model is poor, with R=0.884. Leave-one-out cross validation (CV) then gave high correlations for the non-polar (Rcv=0.992-0.974) and moderately polar (Rcv=0.921) columns and a low correlation (Rcv=0.834) for the polar column. These results show that the new μ vector is suitable for describing the retention behavior of PCDFs on non-polar and moderately polar stationary phases, but not the varied gas chromatographic retention behavior of PCDFs across stationary phases of differing polarity.
A new quantitative structure-retention relationship (QSRR) model is developed for polychlorinated dibenzofurans (PCDFs) based on molecular interaction field (MIF) analysis. The MIFs of all 135 PCDFs are calculated using the DRY, C1= and C3 probes, characterizing the hydrophobic and steric interactions between PCDFs and different groups of the stationary phase. The QSRR model is then constructed by multiblock partial least squares (MBPLS), and the significance of each block is evaluated by the block importance in the prediction (BIP) method. The model used for prediction is statistically significant, with calibration and cross-validation correlation coefficients of 0.9990 and 0.9980, respectively, and a relative error of less than 1.0%. The MBPLS and BIP results show that steric properties have the dominant influence on the retention behavior of PCDFs, followed by hydrophobic effects.
To study the quantitative relationship between surface sedimentary diatoms and water depth, 67 surface samples were collected for diatom analysis along eight profiles spanning water depths from the muddy intertidal zone to the shallow sea area in North-Central Bohai Bay, China. The results showed that the distribution of diatoms changed significantly in response to the change in water depth. Furthermore, the quantitative relationship between the distribution of dominant diatom species, their assemblages, and the water depth was established. The water depth optima for seven dominant species, such as Cyclotella striata/stylorum, Paralia sulcata, and Coscinodiscus perforatus, and the water depth indication ranges of seven diatom assemblages were obtained for the part of the study area above a water depth (elevation) of -10 m. The quantitative relationship between surface sedimentary diatoms and water depth provides a proxy index for diatom-based paleo-water-depth reconstruction in the strata of Bohai Bay, China.
Direct coal liquefaction (DCL) is an important and effective method of converting coal into high-value-added chemicals and fuel oil. In DCL, heating the direct coal liquefaction solvent (DCLS) from low to high temperature and pre-hydrogenating the DCLS are critical steps. Studying the dissolution of hydrogen in DCLS under liquefaction conditions is therefore important. However, it is difficult to determine hydrogen solubility precisely by experiment alone, especially under actual DCL conditions. To address this issue, we developed a prediction model of hydrogen solubility in a single solvent based on machine-learning quantitative structure-property relationship (ML-QSPR) methods. The model gave a squared correlation coefficient R^2=0.92 and a root mean square error RMSE=0.095, indicating good statistical performance. External validation of the model also revealed excellent accuracy and predictive ability. Molecular polarization (a) is the main factor affecting the dissolution of hydrogen in DCLS. The hydrogen solubility in acyclic alkanes increases with increasing carbon number, whereas in polycyclic aromatics it decreases with increasing ring number, and in hydrogenated aromatics it increases with the degree of hydrogenation. This work provides a new reference for the selection and proportioning of DCLS: a solvent with higher hydrogen solubility can be added to provide active hydrogen for the reaction and thus reduce the hydrogen pressure. It also offers insight into the theoretical significance and practical value of DCL.
Due to the large number of ionic liquids (ILs) and their potential environmental risk, assessing the toxicity of ILs by ecotoxicological experiments alone is insufficient. The quantitative structure-activity relationship (QSAR) approach has been proven to be a quick and effective way to estimate the viscosity, melting points, and even toxicity of ILs. In this work, the LC50 values of 30 imidazolium-based ILs were determined with Caenorhabditis elegans as a model animal. Four suitable molecular descriptors were selected by the genetic function approximation algorithm to construct a QSAR model with an R^2 value of 0.938. The lgLC50 values predicted in this work agree with the experimental values, indicating that the model has good stability and predictive ability. Our study provides a valuable model for predicting the potential toxicity of ILs with different sub-structures to the environment and human health.
The Gated Recurrent Unit (GRU) neural network has great potential in estimating and predicting a variable. In addition to radar reflectivity (Z), radar echo-top height (ET) is also a good indicator of rainfall rate (R). In this study, we propose a new method, GRU_Z-ET, which introduces Z and ET as two independent variables into the GRU neural network to conduct quantitative single-polarization radar precipitation estimation. The performance of GRU_Z-ET is compared with that of three other methods in three heavy rainfall cases in China during 2018, namely, the traditional Z-R relationship (Z=300R^1.4), the optimal Z-R relationship (Z=79R^1.68), and the GRU neural network with only Z as the independent input variable (GRU_Z). The results indicate that GRU_Z-ET performs the best, while the traditional Z-R relationship performs the worst. The performances of the other two methods are similar. To further evaluate the performance of GRU_Z-ET, 200 rainfall events with 21882 total samples during May-July of 2018 are used for statistical analysis. The results demonstrate that the spatial correlation coefficients, threat scores, and probability of detection between the observed and estimated precipitation are the largest for GRU_Z-ET and the smallest for the traditional Z-R relationship, while the opposite holds for the root mean square error. In addition, these statistics for GRU_Z are similar to those of the optimal Z-R relationship. Thus, it can be concluded that GRU_Z-ET performs the best of the four methods for quantitative precipitation estimation.
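The two baseline Z-R relationships quoted above can be inverted to give a rainfall rate from a reflectivity measurement. The sketch below assumes the conventional units (Z in mm^6 m^-3, reflectivity reported in dBZ, R in mm/h), which the abstract does not state explicitly; the function name is illustrative.

```python
import math


def rain_rate_from_dbz(dbz, a, b):
    """Invert Z = a * R**b for R, given reflectivity in dBZ.

    Assumes Z is in mm^6 m^-3 and R in mm/h (conventional units).
    """
    z_linear = 10.0 ** (dbz / 10.0)  # convert dBZ to linear Z
    return (z_linear / a) ** (1.0 / b)


# Compare the two relationships from the abstract at the same reflectivity.
for dbz in (20.0, 35.0, 50.0):
    traditional = rain_rate_from_dbz(dbz, a=300.0, b=1.4)
    optimal = rain_rate_from_dbz(dbz, a=79.0, b=1.68)
    print(f"{dbz:.0f} dBZ: traditional {traditional:.2f} mm/h, "
          f"optimal {optimal:.2f} mm/h")
```

Both relationships map one reflectivity value to one rainfall rate, which is the limitation the GRU_Z-ET method addresses by adding echo-top height as a second input.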
Quantitative structure-biodegradability relationships (QSBRs) were established to develop predictive models and mechanistic explanations for acid dyestuffs and their biological activities. With a total of four descriptors calculated by a quantum chemical semi-empirical methodology, namely molecular weight (MW) and the energies of the highest occupied molecular orbital (EHOMO), the lowest unoccupied molecular orbital (ELUMO), and the excited state (EES), a series of models relating dye biodegradability to each descriptor was analyzed. The results showed that EHOMO and MW were the dominant parameters controlling the biodegradability of acid dyes. A statistically robust QSBR model was developed for all studied dyes by combining EHOMO and MW. The calculated biodegradation values fitted well with the experimental data monitored in a facultative-aerobic process, indicating the reliable predictive and mechanistic character of the developed model.
An investigation was carried out in the Huanghai Sea and the East China Sea in the summer of 2001 to study the quantitative relationship between the abundance of flagellates and the density of suspended particles. The results show that the abundance of flagellates varies from 44 to 12 600 cells/cm^3, and flagellates sometimes constitute a significant part of the suspended particles. The size spectra of suspended particles can be divided into four categories: flat spectrum, humped spectrum, plankton spectrum, and mixed spectrum. In general, the abundance of flagellates varies in proportion to the density of suspended particles. However, their quantitative relations show different characteristics in seawater samples with different types of particle-size spectrum. This is only a preliminary study of the quantitative relationship between flagellates and suspended particles, which might lead to a convenient approach for estimating flagellate abundance in the sea.
The molecular electronegativity interaction vector (MEIV) was used to describe the molecular structure of 30 selected esters. Two excellent QSTR models were built using multiple linear regression (MLR) and partial least-squares regression (PLS). The correlation coefficients (R) of the two models were 0.945 and 0.941, respectively. The models were evaluated by cross validation with the leave-one-out (LOO) procedure. The cross-validation correlation coefficients (RCV) of the two models were 0.921 and 0.919, respectively. The results showed that the models constructed in this work provide stable estimation and favorable predictive ability.
A new set of descriptors, HSEHPCSV (component score vector of hydrophobic, steric, and electronic properties together with hydrogen bonding contributions), was derived from principal component analyses of 95 physicochemical variables of the 20 natural amino acids, carried out separately for each kind of property described, namely hydrophobic, steric, and electronic properties as well as hydrogen bonding contributions. HSEHPCSV scales were then employed to express the structures of angiotensin-converting enzyme inhibitors, bitter-tasting thresholds, and bactericidal 18-peptides, and to construct QSAR models based on partial least squares (PLS). The results obtained are as follows: multiple correlation coefficients (R^2_cum) of 0.846, 0.917, and 0.993; leave-one-out cross-validated Q^2_cum of 0.835, 0.865, and 0.899; and root-mean-square errors of estimation (RMSEE) of 0.396, 0.187, and 0.22, respectively. These satisfactory results show that, as new amino acid scales, HSEHPCSV may be a useful structural expression methodology for studies of peptide QSAR (quantitative structure-activity relationship) owing to its plentiful structural information, definite physical and chemical meaning, and easy interpretation.
Carotenoids are a family of effective active oxygen scavengers, which can reduce the risk of chronic diseases such as cardiovascular disease, cataract, and cancer. A quantitative structure-activity relationship (QSAR) equation between carotenoids and antioxidant activity was established by quantum chemistry (AM1), molecular mechanics (MM+), and stepwise regression analysis methods, and the model was evaluated by the leave-one-out approach. The results showed that the significant molecular descriptors related to the antioxidant activity of carotenoids were the energy difference (E_HL) between the lowest unoccupied molecular orbital (LUMO) and the highest occupied molecular orbital (HOMO) and the ionization energy (Eiso). The model showed good predictive ability (Q^2 > 0.5).
Using the artificial neural network (ANN) method combined with multiple linear regression (MLR), and based on a series of quantum chemical descriptors and molecular connectivity indexes, quantitative structure-activity relationship (QSAR) models were established to predict the acute toxicity (-lgEC50) of substituted aromatic compounds to Photobacterium phosphoreum. The four molecular descriptors that appear in the MLR model, namely the second-order valence molecular connectivity index (^2X^v), the energy of the highest occupied molecular orbital (EHOMO), the logarithm of the n-octanol/water partition coefficient (logKow), and the Connolly molecular area (MA), were used as inputs of the ANN model. The root-mean-square errors (RMSE) of the training and validation sets of the ANN model are 0.1359 and 0.2523, and the correlation coefficients (R) are 0.9810 and 0.8681, respectively. The leave-one-out (LOO) cross-validated correlation coefficients (Q^2_LOO) of the MLR and ANN models are 0.6954 and 0.6708, respectively. The results showed that the two methods are complementary: the regression method supports the neural network with a physical explanation, and the neural network method gives a more accurate QSAR model. In addition, some insights into the structural factors affecting the acute toxicity and the toxicity mechanism of substituted aromatic compounds are discussed.
Many structure-property/activity studies use graph theoretical indices, which are based on the topological properties of a molecule viewed as a graph. Since topological indices can be derived directly from the molecular structure without any experimental effort, they provide a simple and straightforward method for property prediction. In this work, the flash point of alkanes was modeled by a set of molecular connectivity indices (χ), modified molecular connectivity indices (^mχ^v), and valence molecular connectivity indices (^mχ^v), with ^mχ^v calculated using the hydrogen perturbation. A stepwise multiple linear regression (MLR) method was used to select the best indices. The predicted flash points are in good agreement with the experimental data, with an average absolute deviation of 4.3 K.
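To show how a topological index is derived from structure alone, the following sketch computes the classical first-order molecular connectivity (Randić) index, the sum over bonds of 1/sqrt(d_i * d_j), on the hydrogen-suppressed carbon skeleton of an alkane. It is the plain index only, not the modified or hydrogen-perturbation indices of the abstract above, and the bond lists are hand-written illustrations.

```python
import math


def first_order_connectivity(bonds):
    """First-order connectivity index of a hydrogen-suppressed graph.

    bonds: list of (atom_i, atom_j) pairs between carbon atoms.
    Each bond contributes 1 / sqrt(degree_i * degree_j).
    """
    degree = {}
    for u, v in bonds:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sum(1.0 / math.sqrt(degree[u] * degree[v]) for u, v in bonds)


# Carbon skeletons of two C4 isomers (atoms numbered arbitrarily).
n_butane = [(1, 2), (2, 3), (3, 4)]
isobutane = [(1, 2), (2, 3), (2, 4)]

print("n-butane:", first_order_connectivity(n_butane))
print("isobutane:", first_order_connectivity(isobutane))
```

The branched isomer gets a smaller index than the linear one, which is why such indices can separate isomers in a regression on a property like flash point.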
The genotoxicity of 22 substituted nitrobenzenes was evaluated by the chromosome aberration test in human peripheral lymphocytes in vitro. Eighteen of the 22 compounds exhibited genotoxic activity. A quantitative structure-activity relationship model was established to correlate the genotoxicity of substituted nitrobenzenes with the characteristics of the substituents on the benzene ring.
A novel quantitative structure-property relationship (QSPR) model for estimating the solution surface tension of 92 organic compounds at 20℃ was developed based on newly introduced atom-type topological indices. The data set contained non-polar and polar liquids, and saturated and unsaturated compounds. The regression analysis shows that excellent results are obtained with multiple linear regression. The predictive power of the proposed model was assessed using the leave-one-out (LOO) cross-validation (CV) method. The correlation coefficient (R) and the leave-one-out cross-validation correlation coefficient (Rcv) of the multiple linear regression model are 0.9914 and 0.9913, respectively. The new model gives an average absolute relative deviation of 1.81% for the 92 substances. The results demonstrate that the novel topological indices, based on the equilibrium electronegativity of atoms and the relative bond length, are useful model parameters for QSPR analysis of compounds.
AIM: To study the quantitative structure-pharmacokinetics relationship (QSPkR) of fluoroquinolone antibacterials. METHODS: The pharmacokinetic (PK) parameters of oral fluoroquinolones were collected from the literature. These pharmacokinetic data were averaged; 19 compounds were used as the training set, and 3 served as the test set. The genetic function approximation (GFA) module of Cerius2 software was used in the QSPkR analysis. RESULTS: A small volume and a large polarizability and surface area of the substituents at C-7 contribute to a large area under the curve (AUC) for fluoroquinolones. A large polarizability and a small volume of the substituents at N-1 contribute to a long elimination half-life. CONCLUSION: QSPkR models can contribute to the development of fluoroquinolone antibacterials with excellent pharmacokinetic properties.
Funding (Rosa banksiae QSRR study): Supported by the Shanghai Education Committee Project (No. 11YZ224) and the Shanghai Leading Academic Discipline Project (No. J51503).
Funding (PCDT QSRR study): Supported by the Foundation of Returned Scholars (Main Program) of Shanxi Province (200902).
Funding (oligonucleotide QSRR study): Supported by the National Natural Science Foundation of China (10901169); the National 111 Programme of Introducing Talents of Discipline to Universities (0507111106); the Innovation Ability Training Foundation of Chongqing University (CDCX008); and the Innovative Group Program for Graduates of Chongqing University, Science Innovation Fund (200711C1A0010260).
Funding (3D-HoVAIF purine base study): Supported by the Industry Innovation Foundation of Shanxi Province (Grant No. 2006031204) and the Chongqing Applied Fundamental Science Foundation (Grant No. 01-3-6).
Funding: This work was supported by the Chunhui Project Fund of the Ministry of Education (Grant No. SCPF99-4-4+37), the Fok Ying-Tung Educational Foundation (Grant No. FYTF98-7-6), the Chongqing Applied Science Fund (Grant No. CASF01-3-6), and the Chongqing University ZYXT Innovation Fund (Grant No. CUIF03-5-6+04-10-10).
Abstract: Treating identical groups as pseudo atoms instead of typical atoms, a novel modified molecular distance-edge (MDE) vector μ was developed in our laboratory to characterize the chemical structure of polychlorinated dibenzofuran (PCDF) congeners and/or isomers. Quantitative structure-retention relationships (QSRRs) between the new VMDE parameters and the gas chromatographic (GC) retention behavior of PCDFs were then generated by the multiple linear regression (MLR) method for non-polar, moderately polar, and polar stationary phases. Four excellent models with high correlation coefficients, R=0.984-0.995, were proposed for non-polar columns (DB-5, SE-54, OV-101). For the moderately polar column (OV-1701), the correlation coefficient of the model is only 0.958. For the polar column (SP-2300), the QSRR model is poor, with R=0.884. Leave-one-out cross validation (CV) gave high correlations for the non-polar (Rcv=0.992-0.974) and weakly polar (Rcv=0.921) columns and a low correlation (Rcv=0.834) for the polar column. These results show that the new μ vector is suitable for describing the retention behavior of PCDFs on non-polar and moderately polar stationary phases, but not for describing GC retention behavior across stationary phases of widely varying polarity.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 40601085) and the National Key Technology R&D Program (Grant No. 2008BADA7B01).
Abstract: A new quantitative structure-retention relationship (QSRR) model is developed for polychlorinated dibenzofurans (PCDFs) based on molecular interaction field (MIF) analysis. The MIFs of all 135 PCDFs are calculated using the DRY, C1=, and C3 probes, which characterize the hydrophobic and steric interactions between PCDFs and different groups of the stationary phase. The QSRR model is then constructed by multiblock partial least squares (MBPLS), and the significance of each block is evaluated by the block importance in the prediction (BIP) method. The model used for prediction is statistically significant, with calibration and cross-validation correlation coefficients of 0.9990 and 0.9980, respectively, and a relative error of less than 1.0%. The MBPLS and BIP results show that steric properties have the dominant influence on the retention behavior of PCDFs, followed by hydrophobic effects.
Funding: Supported by the National Natural Science Foundation of China Youth Fund (41806109) and the project of the China Geological Survey (DD20189506).
Abstract: To study the quantitative relationship between surface sedimentary diatoms and water depth, 67 surface samples were collected for diatom analysis on eight profiles, with water depth varying from the muddy intertidal zone to the shallow sea area in North-Central Bohai Bay, China. The results showed that the distribution of diatoms changed significantly in response to changes in water depth. Furthermore, the quantitative relationship between the distribution of dominant diatom species, their assemblages, and the water depth was established. The water depth optima for seven dominant species, such as Cyclotella striata/stylorum, Paralia sulcata, and Coscinodiscus perforatus, and the water depth indication ranges of seven diatom assemblages were obtained for the study area above a water depth (elevation) of -10 m. The quantitative relationship between surface sedimentary diatoms and water depth provides a proxy index for diatom-based paleo-water depth reconstruction in the strata of Bohai Bay, China.
Funding: Financial support from the National Key Research and Development Program of China (2022YFB4101302-01), the National Natural Science Foundation of China (22178243), and the science and technology innovation project of China Shenhua Coal to Liquid and Chemical Company Limited (MZYHG-22-02).
Abstract: Direct coal liquefaction (DCL) is an important and effective method of converting coal into high-value-added chemicals and fuel oil. In DCL, heating the direct coal liquefaction solvent (DCLS) from low to high temperature and pre-hydrogenating the DCLS are critical steps. Studying the dissolution of hydrogen in DCLS under liquefaction conditions is therefore important. However, it is difficult to determine hydrogen solubility precisely by experiment alone, especially under actual DCL conditions. To address this issue, we developed a prediction model of hydrogen solubility in a single solvent based on machine-learning quantitative structure–property relationship (ML-QSPR) methods. The model achieved a squared correlation coefficient R^(2)=0.92 and a root mean square error RMSE=0.095, indicating good statistical performance. External validation of the model also showed excellent accuracy and predictive ability. Molecular polarization (a) is the main factor affecting the dissolution of hydrogen in DCLS. Hydrogen solubility in acyclic alkanes increases with increasing carbon number, whereas in polycyclic aromatics it decreases with increasing ring number, and in hydrogenated aromatics it increases with the degree of hydrogenation. This work provides a new reference for the selection and proportioning of DCLS, i.e., a solvent with higher hydrogen solubility can be added to provide active hydrogen for the reaction and thus reduce the hydrogen pressure. It also offers insight of both theoretical and practical value for DCL.
Funding: This work was supported by the National Natural Science Foundation of China (No. 21477121) and the Fundamental Research Funds for the Central Universities. The numerical calculations were performed on the supercomputing system in the Supercomputing Center at the University of Science and Technology of China.
Abstract: Because of the large number of ionic liquids (ILs) and their potential environmental risk, assessing the toxicity of ILs by ecotoxicological experiments alone is insufficient. Quantitative structure-activity relationship (QSAR) modeling has been proven to be a quick and effective method for estimating the viscosity, melting points, and even toxicity of ILs. In this work, the LC50 values of 30 imidazolium-based ILs were determined with Caenorhabditis elegans as a model animal. Four suitable molecular descriptors were selected by the genetic function approximation algorithm to construct a QSAR model with an R^2 value of 0.938. The lgLC50 values predicted in this work agree with the experimental values, indicating that the model has good stability and predictive ability. Our study provides a valuable model for predicting the potential toxicity of ILs with different sub-structures to the environment and human health.
Funding: Jointly supported by the National Science Foundation of China (Grant Nos. 42275007 and 41865003) and the Jiangxi Provincial Department of Science and Technology project (Grant No. 20171BBG70004).
Abstract: The Gated Recurrent Unit (GRU) neural network has great potential for estimating and predicting a variable. In addition to radar reflectivity (Z), radar echo-top height (ET) is also a good indicator of rainfall rate (R). In this study, we propose a new method, GRU_Z-ET, which introduces Z and ET as two independent variables into the GRU neural network to perform quantitative single-polarization radar precipitation estimation. The performance of GRU_Z-ET is compared with that of three other methods in three heavy rainfall cases in China during 2018: the traditional Z-R relationship (Z=300R^1.4), the optimal Z-R relationship (Z=79R^1.68), and the GRU neural network with only Z as the independent input variable (GRU_Z). The results indicate that GRU_Z-ET performs the best, while the traditional Z-R relationship performs the worst. The performances of the other two methods are similar. To further evaluate the performance of GRU_Z-ET, 200 rainfall events with 21882 total samples during May–July of 2018 are used for statistical analysis. The results demonstrate that the spatial correlation coefficients, threat scores, and probability of detection between the observed and estimated precipitation are the largest for GRU_Z-ET and the smallest for the traditional Z-R relationship, while the root mean square error shows the opposite ordering. In addition, the statistics of GRU_Z are similar to those of the optimal Z-R relationship. It can thus be concluded that GRU_Z-ET performs the best of the four methods for quantitative precipitation estimation.
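The two Z-R baselines quoted above are power laws of the form Z = a·R^b, so a rain rate follows by inverting them: R = (Z/a)^(1/b). The sketch below assumes the usual radar-meteorology conventions, which the abstract does not state explicitly: reflectivity is supplied in dBZ, linear Z is in mm^6 m^-3, and R is in mm/h. The function name is hypothetical.

```python
import math

def rain_rate_from_dbz(dbz, a, b):
    """Invert Z = a * R**b for R, given reflectivity in dBZ (assumed units)."""
    z_linear = 10.0 ** (dbz / 10.0)       # dBZ -> linear Z (mm^6 m^-3)
    return (z_linear / a) ** (1.0 / b)    # rain rate R (mm/h)

# Coefficients taken from the abstract: traditional and optimal relationships.
TRADITIONAL = (300.0, 1.4)
OPTIMAL = (79.0, 1.68)

for dbz in (20.0, 35.0, 50.0):
    r_trad = rain_rate_from_dbz(dbz, *TRADITIONAL)
    r_opt = rain_rate_from_dbz(dbz, *OPTIMAL)
    print(f"{dbz:4.0f} dBZ  traditional: {r_trad:7.2f} mm/h  optimal: {r_opt:7.2f} mm/h")
```

Both baselines depend on Z alone, which is the limitation the GRU_Z-ET method addresses by adding echo-top height as a second input.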
Funding: Project supported by the Natural Science Foundation of Shanghai, China (No. 06ZR14002).
Abstract: Quantitative structure-biodegradability relationships (QSBRs) were established to develop predictive models and mechanistic explanations for acid dyestuffs as well as biological activities. Four descriptors were calculated using a quantum chemical semi-empirical methodology: molecular weight (MW) and the energies of the highest occupied molecular orbital (EHOMO), the lowest unoccupied molecular orbital (ELUMO), and the excited state (EES). A series of models relating dye biodegradability to each descriptor was analyzed. The results showed that EHOMO and MW were the dominant parameters controlling the biodegradability of acid dyes. A statistically robust QSBR model was developed for all the dyes studied by combining EHOMO and MW. The calculated biodegradation values fitted well with the experimental data monitored in a facultative-aerobic process, indicating the reliable predictive and mechanistic character of the developed model.
Abstract: An investigation was carried out in the Huanghai Sea and the East China Sea to study the quantitative relationship between the abundance of flagellates and the density of suspended particles in the summer of 2001. The results show that the abundance of flagellates varies from 44 to 12 600 cell/cm^3, and that flagellates sometimes constitute a significant part of the suspended particles. The size spectra of suspended particles can be divided into four categories: flat spectrum, humped spectrum, plankton spectrum, and mixed spectrum. In general, the abundance of flagellates varies in proportion to the density of suspended particles. However, the quantitative relationship shows different characteristics in seawater samples with different types of particle-size spectrum. This is only a preliminary study of the quantitative relationship between flagellates and suspended particles, which might lead to a convenient approach for estimating flagellate abundance in the sea.
Funding: Supported by the Youth Foundation of the Education Bureau, Sichuan Province (09ZB036) and the Technology Bureau, Sichuan Province (2006j13-141).
Abstract: The molecular electronegativity interaction vector (MEIV) was used to describe the molecular structure of 30 selected esters. Two excellent QSTR models were built using multiple linear regression (MLR) and partial least-squares regression (PLS). The correlation coefficients (R) of the two models were 0.945 and 0.941, respectively. The models were evaluated by cross validation with the leave-one-out (LOO) procedure. The cross-validation correlation coefficients (RCV) of the two models were 0.921 and 0.919, respectively. The results showed that the models constructed in this work offer stable estimation and favorable predictive ability.
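Leave-one-out cross validation of an MLR model, as used in this and several other abstracts in the list, can be sketched as below. This is a generic illustration rather than the authors' procedure: the descriptor matrix and responses are synthetic, the function name is hypothetical, and both the cross-validated correlation coefficient (RCV) and the PRESS-based Q^2 are returned because different abstracts report one or the other.

```python
import numpy as np

def loo_cross_validate(X, y):
    """Refit an ordinary least-squares MLR n times, each time predicting the held-out sample."""
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])  # add intercept
    preds = np.empty_like(y)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        preds[i] = A[i] @ coef
    press = np.sum((y - preds) ** 2)                      # predictive residual sum of squares
    q2 = 1.0 - press / np.sum((y - y.mean()) ** 2)
    r_cv = np.corrcoef(y, preds)[0, 1]
    return r_cv, q2, preds

# Synthetic data: 30 samples, 3 descriptors, linear response plus noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 3))
y_demo = 1.5 + X_demo @ np.array([0.8, -0.4, 0.2]) + rng.normal(scale=0.1, size=30)

r_cv, q2, _ = loo_cross_validate(X_demo, y_demo)
print(f"R_cv = {r_cv:.3f}, Q^2 = {q2:.3f}")
```

A cross-validated coefficient close to the fitted one, as with 0.921 against 0.945 above, suggests the model is not dominated by a few influential samples.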
Funding: Supported by the National High Technology Research and Development Program of China (863 Program, No. 2006AA02Z312).
Abstract: A new set of descriptors, HSEHPCSV (component score vector of hydrophobic, steric, and electronic properties together with hydrogen bonding contributions), was derived from principal component analyses of 95 physicochemical variables of the 20 natural amino acids, performed separately for each kind of property described, namely hydrophobic, steric, and electronic properties as well as hydrogen bonding contributions. The HSEHPCSV scales were then employed to express the structures of angiotensin-converting enzyme inhibitors, bitter-tasting thresholds, and bactericidal 18-peptides, and to construct QSAR models based on partial least squares (PLS). The results obtained were as follows: multiple correlation coefficients (R^2cum) of 0.846, 0.917, and 0.993; leave-one-out cross-validated Q^2cum of 0.835, 0.865, and 0.899; and root-mean-square errors of estimation (RMSEE) of 0.396, 0.187, and 0.22, respectively. These satisfactory results showed that, as new amino acid scales, HSEHPCSV may be a useful structural expression methodology for studies on peptide QSAR (quantitative structure-activity relationship), owing to advantages such as rich structural information, definite physical and chemical meaning, and easy interpretation.
Funding: Supported by the Chinese National Key Technologies R&D Program of the 11th Five-year Plan (2006BAD27B06) and the Education Foundation of Innovative Engineering Key Project of the Education Department (707034).
Abstract: Carotenoids are a family of effective active oxygen scavengers, which can reduce the risk of chronic diseases such as cardiovascular disease, cataract, and cancer. The quantitative structure-activity relationship (QSAR) equation between carotenoids and antioxidant activity was established by the quantum chemistry AM1 method, molecular mechanics (MM+), and stepwise regression analysis, and the model was evaluated by the leave-one-out approach. The results showed that the significant molecular descriptors related to the antioxidant activity of carotenoids were the energy difference (E_HL) between the lowest unoccupied molecular orbital (LUMO) and the highest occupied molecular orbital (HOMO) and the ionization energy (Eiso). The model showed good predictive ability (Q^2 > 0.5).
Funding: Supported by the Natural Science Foundation of Fujian Province (D0710019) and the Natural Science Foundation of the Overseas Chinese Affairs Office of the State Council (06QZR09).
Abstract: Using the artificial neural network (ANN) method combined with multiple linear regression (MLR), and based on a series of quantum chemical descriptors and molecular connectivity indexes, quantitative structure-activity relationship (QSAR) models were established to predict the acute toxicity (-lgEC50) of substituted aromatic compounds to Photobacterium phosphoreum. The four molecular descriptors that appear in the MLR model, namely the second-order valence molecular connectivity index (2XV), the energy of the highest occupied molecular orbital (EHOMO), the logarithm of the n-octanol/water partition coefficient (logKow), and the Connolly molecular area (MA), were the inputs of the ANN model. The root-mean-square errors (RMSE) of the training and validation sets of the ANN model are 0.1359 and 0.2523, and the correlation coefficients (R) are 0.9810 and 0.8681, respectively. The leave-one-out (LOO) cross-validated correlation coefficients (Q^2_LOO) of the MLR and ANN models are 0.6954 and 0.6708, respectively. The results showed that the two methods are complementary: the regression method supported the neural network with a physical explanation, and the neural network gave a more accurate QSAR model. In addition, some insights into the structural factors affecting the acute toxicity and the toxicity mechanism of substituted aromatic compounds are discussed.
Abstract: Many structure-property/activity studies use graph theoretical indices, which are based on the topological properties of a molecule viewed as a graph. Since topological indices can be derived directly from the molecular structure without any experimental effort, they provide a simple and straightforward method for property prediction. In this work, the flash point of alkanes was modeled by a set of molecular connectivity indices (χ), modified molecular connectivity indices (^mχ^v), and valence molecular connectivity indices (^mχ^v), with ^mχ^v calculated using the hydrogen perturbation. A stepwise multiple linear regression (MLR) method was used to select the best indices. The predicted flash points are in good agreement with the experimental data, with an average absolute deviation of 4.3 K.
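To illustrate how a topological index is derived directly from the molecular graph, the sketch below computes the first-order molecular connectivity (Randić) index, defined as the sum over bonds of (δ_i·δ_j)^(-1/2), where δ is the number of non-hydrogen neighbours of each atom. It covers only this simplest index on hydrogen-suppressed carbon skeletons; the modified and valence indices with hydrogen perturbation used in the paper are not reproduced here, and the bond lists are hand-written inputs.

```python
import math

def first_order_connectivity(bonds):
    """First-order connectivity index of a hydrogen-suppressed molecular graph.

    bonds: iterable of (atom_i, atom_j) pairs between heavy atoms.
    """
    bonds = list(bonds)
    degree = {}
    for a, b in bonds:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return sum(1.0 / math.sqrt(degree[a] * degree[b]) for a, b in bonds)

# Carbon skeletons of two C4H10 isomers.
n_butane = [(0, 1), (1, 2), (2, 3)]    # C-C-C-C chain
isobutane = [(0, 1), (0, 2), (0, 3)]   # central carbon bonded to three methyls

print(f"n-butane:  {first_order_connectivity(n_butane):.3f}")
print(f"isobutane: {first_order_connectivity(isobutane):.3f}")
```

The branched isomer gets the lower value, which is why such indices can distinguish isomers with the same formula and serve as regression inputs for properties like flash point.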
Abstract: The genotoxicity of 22 substituted nitrobenzenes was evaluated by the in vitro chromosome aberration test in human peripheral lymphocytes. Eighteen of the 22 compounds exhibited genotoxic activity. A quantitative structure-activity relationship model was established to correlate the genotoxicity of substituted nitrobenzenes with the characteristics of the substituents on the benzene ring.
Funding: Projects (20775010, 21075011) supported by the National Natural Science Foundation of China; Project (2008AA05Z405) supported by the National High Technology Research and Development Program of China; Project (09JJ3016) supported by the Hunan Provincial Natural Science Foundation, China; Project (09C066) supported by the Scientific Research Fund of Hunan Provincial Education Department, China; Project (2010CL01) supported by the Foundation of Hunan Provincial Key Laboratory of Materials Protection for Electric Power and Transportation, China.
Abstract: A novel quantitative structure-property relationship (QSPR) model for estimating the solution surface tension of 92 organic compounds at 20℃ was developed based on newly introduced atom-type topological indices. The data set contained non-polar and polar liquids, and saturated and unsaturated compounds. The regression analysis shows that excellent results are obtained with multiple linear regression. The predictive power of the proposed model was assessed using the leave-one-out (LOO) cross-validation (CV) method. The correlation coefficient (R) and the leave-one-out cross-validation correlation coefficient (Rcv) of the multiple linear regression model are 0.9914 and 0.9913, respectively. The new model gives an average absolute relative deviation of 1.81% for the 92 substances. The results demonstrate that the novel topological indices, based on the equilibrium electronegativity of atoms and the relative bond length, are useful model parameters for QSPR analysis of compounds.
Funding: Supported by the National Basic Research Program of China, No. 2004BC518902.
Abstract: AIM: To study the quantitative structure-pharmacokinetics relationship (QSPkR) of fluoroquinolone antibacterials. METHODS: The pharmacokinetic (PK) parameters of oral fluoroquinolones were collected from the literature. These pharmacokinetic data were averaged; 19 compounds were used as the training set, and 3 served as the test set. The genetic function approximation (GFA) module of the Cerius2 software was used in the QSPkR analysis. RESULTS: A small volume and a large polarizability and surface area of the substituents at C-7 contribute to a large area under the curve (AUC) for fluoroquinolones. A large polarizability and a small volume of the substituents at N-1 contribute to a long elimination half-life. CONCLUSION: QSPkR models can contribute to the design of fluoroquinolone antibacterials with excellent pharmacokinetic properties.