Abstract: Support vector machine (SVM), support vector regression (SVR) and molecular docking are widely used computational approaches for predicting active compounds. In this work, to improve the accuracy and reliability of prediction, the three approaches were combined to predict potential cytochrome P450 1A2 (CYP1A2) inhibitors. The accuracy of the optimal SVM qualitative model was 99.432%, 97.727%, and 91.667% for the training set, internal test set and external test set, respectively, showing that the model had high discrimination ability. The R² and mean square error of the optimal SVR quantitative model were 0.763 and 0.013 for the training set, and 0.753 and 0.056 for the test set, respectively, indicating that the SVR model has high predictive ability for the biological activities of compounds. According to the results of the SVM and SVR models, several types of descriptors were identified as essential to bioactivity prediction, including connectivity indices, constitutional descriptors and functional group counts. Moreover, molecular docking studies were used to reveal the binding poses and binding affinity of potential inhibitors interacting with CYP1A2. The amino acids THR124 and ASP320 could form key hydrogen bond interactions with active compounds, and ALA317 and GLY316 could form strong hydrophobic interactions with active compounds. The models were applied to discover potential CYP1A2 inhibitors from natural products, which could help predict CYP-mediated drug-drug interactions and provide useful guidance for rational drug combination therapy. A set of 20 potential CYP1A2 inhibitors was obtained. Part of the results was consistent with the literature, which further indicates the accuracy of these models and the reliability of this combined computational strategy.
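The three figures of merit quoted in this abstract (classification accuracy, R² and mean square error) can be computed as in the minimal sketch below. It assumes R² is the coefficient of determination, 1 - SS_res/SS_tot; the abstract does not say which definition the authors used, and the function names and sample values are illustrative only.

```python
def accuracy(y_true, y_pred):
    """Fraction of class labels predicted correctly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up labels (1 = inhibitor, 0 = non-inhibitor) and activities.
labels_true = [1, 0, 1, 1]
labels_pred = [1, 0, 0, 1]
activity_true = [1.0, 2.0, 3.0]
activity_pred = [1.0, 2.0, 4.0]

acc = accuracy(labels_true, labels_pred)
mse = mean_squared_error(activity_true, activity_pred)
r2 = r_squared(activity_true, activity_pred)
```

Reporting these values separately for the training, internal test and external test sets, as the abstract does, is what allows overfitting to be spotted: a large gap between training and test figures would be a warning sign.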
Funding: Supported by the Youth Foundation of the Education Bureau, Sichuan Province (09ZB036) and the Technology Bureau, Sichuan Province (2006j13-141)
Abstract: Atoms in most organic molecules are carbon, oxygen, nitrogen, sulfur, halogens, etc. Based on the three-dimensional structure of a molecule, a molecular structural characterization (MSC) method called the improved molecular electronegativity-distance vector (I-MEDV) was developed. It was used to describe the structures of 37 compounds from Styrax japonicus Sieb. flowers. A quantitative structure-retention relationship (QSRR) model was built by multiple linear regression (MLR); the correlation coefficient (R1) of the model was 0.980. Then 4 vectors were selected by stepwise multiple regression (SMR) to build a second model, whose correlation coefficient (R2) was 0.975. Both models were evaluated by leave-one-out (LOO) cross-validation, and the correlation coefficients (Rcv) were 0.948 and 0.968, respectively. The results show that I-MEDV can successfully describe the structures of organic compounds, and that the models have good stability and predictive ability.
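Leave-one-out cross-validation, used here and in several of the abstracts that follow, refits the model once per compound with that compound held out and then compares the held-out predictions with the observed values. The sketch below illustrates the procedure with a single descriptor for brevity (the studies use multi-descriptor MLR). It assumes Rcv is the Pearson correlation between LOO predictions and observations, which the abstract does not state, and the data are made up.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    sv = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (su * sv)


def loo_predictions(xs, ys):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        preds.append(a + b * xs[i])
    return preds


# Hypothetical descriptor values and retention data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

a, b = fit_line(xs, ys)
r_fit = pearson_r(ys, [a + b * x for x in xs])   # fitting correlation
r_cv = pearson_r(ys, loo_predictions(xs, ys))    # cross-validated correlation
```

A cross-validated coefficient close to the fitting coefficient, as reported for both models above, suggests that no single compound dominates the fit.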
Funding: Supported by the Foundation of Returned Scholars (Main Program) of Shanxi Province (200902)
Abstract: Polychlorinated dibenzothiophenes (PCDTs) are classified as persistent organic pollutants in the environment, so the analysis of PCDTs by their gas chromatographic behavior is of great significance. Quantitative structure-retention relationship (QSRR) analysis is a useful technique for relating chromatographic retention time to molecular structure. In this paper, a QSRR study of 37 PCDTs was carried out using molecular electronegativity distance vector (MEDV) descriptors with multiple linear regression (MLR) and partial least-squares regression (PLS). The correlation coefficients for model fitting (R), leave-one-out (LOO) cross-validation (CV), and external validation (Q²ext) were 0.9951, 0.9942, and 0.9839 for the MLR model and 0.9925, 0.9915, and 0.9833 for the PLS model, respectively. The results showed that the models exhibited excellent estimation capability for the internal sample set and good predictive capability for the external sample set. With MEDV descriptors, the QSRR model provides a simple and rapid way to predict the gas chromatographic retention indices of polychlorinated dibenzothiophenes when standard samples are lacking or experimental conditions are poor.
Funding: Funded by the Chongqing Medical University Scientific Research Foundation
Abstract: A set of novel structural descriptors (molecular hybridization electronegativity-distance vector, VMEDh) was put forward, and the quantitative structure-activity relationship (QSAR) of a series of 17α-acetoxyprogesterones (APs) was investigated. Taking into account the effect of different hybridized orbitals on atomic electronegativities, we developed the structural descriptors with amended electronegativities to build a QSAR model. The 10-parameter model based on VMEDh yields a correlation coefficient R=0.972 and standard deviation SD=0.262, which are better than those of the previous molecular electronegativity-distance vector (MEDV-4) (R=0.969, SD=0.275). Several parameters were then selected by stepwise multiple linear regression to construct optimal models. The 7-parameter model based on VMEDh has R=0.960 and SD=0.276; its correlation coefficient and standard deviation for leave-one-out cross-validation are RCV=0.890 and SDCV=0.445, respectively. The 6-parameter MEDV-4 model has R=0.946, SD=0.304, RCV=0.903 and SDCV=0.406. These results demonstrate that VMEDh has desirable estimation performance and good predictive capability for this series of compounds.
Funding: Supported by the National Natural Science Foundation of China (No. 81373335)
Abstract: A cationic gene delivery vector, guanidinylated disulfide-containing poly(amido amine) (CAR-CBA), was synthesized by the Michael addition reaction between N,N′-cystaminebisacrylamide (CBA) and guanidine hydrochloride (CAR). Gel permeation chromatography (GPC) was used to evaluate the molecular weight of the synthesized CAR-CBA. Polyethyleneimine (PEI) with a molecular weight of 25 kDa was adopted as a reference, and polyethylene glycols (PEG) with different molecular weights were used to establish a standard curve for determining the molecular weight of CAR-CBA. The effects of two critical factors, namely columns and eluents, on the molecular weight measurement of CAR-CBA were investigated to optimize the GPC quantitative method. The results showed that Ultrahydrogel columns (120, 250) and HAc-NaAc buffer solution (0.5 M, pH 4.5) were the optimal columns and GPC eluent, respectively. The molecular weight of the synthesized CAR-CBA was analyzed by the optimized GPC method and determined to be 24.66 kDa.
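A standard curve of the kind described above is conventionally built by fitting log10(molecular weight) of the standards against their elution volume and reading the sample off that line. The sketch below assumes such a linear calibration. The abstract does not give the functional form or the data, so the PEG elution volumes and molecular weights here are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def calibrate(standards):
    """Fit log10(MW) against elution volume.

    standards: list of (elution_volume_mL, molecular_weight_Da) pairs.
    Returns (intercept, slope) of the calibration line.
    """
    volumes = [v for v, _ in standards]
    log_mw = [math.log10(m) for _, m in standards]
    return fit_line(volumes, log_mw)


def estimate_mw(volume, intercept, slope):
    """Read a molecular weight (Da) off the calibration line."""
    return 10 ** (intercept + slope * volume)


# Hypothetical PEG standards: (elution volume in mL, molecular weight in Da).
peg_standards = [(14.0, 100000.0), (16.0, 10000.0), (18.0, 1000.0)]

intercept, slope = calibrate(peg_standards)
sample_mw = estimate_mw(15.0, intercept, slope)  # sample eluting at 15.0 mL
```

Because the result depends on how the standards and the sample interact with the column and eluent, the choice of both affects the measured value, which is why the study optimizes them before reporting a molecular weight.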
Abstract: Molecular frame photoemission is a very sensitive probe of the photoionization (PI) dynamics of molecules. This paper reports a comparative study of non-resonant and resonant photoionization of D2 induced by VUV circularly polarized synchrotron radiation at SOLEIL, at the level of the molecular frame photoelectron angular distributions (MFPADs). We use the vector correlation method, which combines imaging and time-of-flight resolved electron-ion coincidence techniques, and a generalized formalism for the expression of the I(χ, θe, φe) MFPADs, where χ is the orientation of the molecular axis with respect to the light quantization axis and (θe, φe) is the electron emission direction in the molecular frame. Selected MFPADs for a molecule aligned parallel or perpendicular to linearly polarized light, or perpendicular to the propagation axis of circularly polarized light, are presented for dissociative photoionization (DPI) of D2 at two photon excitation energies: hν=19 eV, where direct PI is the only open channel, and hν=32.5 eV, i.e. in the region involving resonant excitation of the Q1 and Q2 doubly excited state series. We discuss in particular the properties of the circular dichroism characterizing photoemission in the molecular frame for direct and resonant PI. In the latter case, a remarkable behavior is observed, which may be attributed to interference between indistinguishable autoionization decay channels.
Abstract: A molecular vector-type descriptor containing 6 variables is used to describe the structure of aromatic hydrocarbons (AHs) and relate it to their normal boiling points (bp). The correlation coefficient (R) between the estimated and experimental bp is 0.9988, and the root mean square error (RMS) is 7.907 °C for 66 AHs. The RMS obtained by cross-validation is 9.131 °C, which implies that the model has good predictive ability.
Abstract: We present multi-threading and SIMD optimizations of the short-range potential calculation kernel in molecular dynamics (MD). For the multi-threading optimization, we design a partition-and-two-steps (PTS) method to avoid the write conflicts caused by using Newton's third law. Our method eliminates the serialization bottleneck without extra memory. We implement the PTS method using OpenMP. We then discuss the influence of the cutoff if statement on the performance of vectorization in MD simulations. We propose a neighbor pre-searching method, which makes about 70% of atoms meet the cutoff check, eliminating a large amount of redundant calculation. The experimental results show that our PTS method is scalable and efficient. In double precision, our 256-bit SIMD implementation is about 3× faster than the scalar version.
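The write conflict mentioned in this abstract comes from the standard half-loop kernel: each pair (i, j) is evaluated once and the result is written to both atoms, so two threads handling different i can update the same j. The serial sketch below shows that kernel next to a conflict-free full-loop variant that does twice the pair work. It is not the PTS method, whose details the abstract does not give; the Lennard-Jones potential, the function names and the coordinates are assumptions made for illustration.

```python
def lj_force_over_r(r2, epsilon=1.0, sigma=1.0):
    """Lennard-Jones force magnitude divided by r, from squared distance r2."""
    s6 = (sigma * sigma / r2) ** 3
    return 24.0 * epsilon * s6 * (2.0 * s6 - 1.0) / r2


def forces_half_loop(positions, cutoff):
    """Visit each pair once; Newton's third law fills in the partner force."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    cutoff2 = cutoff * cutoff
    for i in range(n - 1):
        for j in range(i + 1, n):
            dr = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = sum(d * d for d in dr)
            if r2 >= cutoff2:  # the cutoff check that hinders vectorization
                continue
            f_over_r = lj_force_over_r(r2)
            for k in range(3):
                fk = f_over_r * dr[k]
                forces[i][k] += fk
                # This write to atom j is the one that would conflict if
                # different threads owned different values of i.
                forces[j][k] -= fk
    return forces


def forces_full_loop(positions, cutoff):
    """Visit each pair twice; iteration i writes only to forces[i]."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    cutoff2 = cutoff * cutoff
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dr = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = sum(d * d for d in dr)
            if r2 >= cutoff2:
                continue
            f_over_r = lj_force_over_r(r2)
            for k in range(3):
                forces[i][k] += f_over_r * dr[k]
    return forces


# Made-up coordinates in reduced units; the last atom lies beyond the cutoff.
atoms = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 1.2, 0.0), (3.0, 3.0, 3.0)]
half = forces_half_loop(atoms, cutoff=2.5)
full = forces_full_loop(atoms, cutoff=2.5)
```

The half-loop version halves the arithmetic but needs some scheme to keep concurrent writes apart, which is the problem the PTS method addresses; the full-loop version is trivially parallel over i at the cost of redundant pair evaluations.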
Abstract: Based on the location of bromine substituents and the conjugation matrix, a new substituent position index 0X was defined, and the molecular shape indexes Km and electronegativity distance vectors Mm of diphenylamine and 209 polybrominated diphenylamine (PBDPA) molecules were calculated. The quantitative structure-property relationships (QSPR) between the thermodynamic properties of these 210 organic pollutants and 0X, K3, M29 and M36 were then established by Leaps-and-Bounds regression. Using the four structural parameters as input neurons of an artificial neural network (ANN), three satisfactory QSPR models with network structures of 4:21:1, 4:24:1, and 4:24:1, respectively, were obtained by the back-propagation algorithm. The total correlation coefficients R were 0.9999, 0.9997, and 0.9995, and the standard errors S were 1.036, 1.469, and 1.510, respectively. The relative mean deviations between the predicted and experimental values of Sθ, ΔfHθ and ΔfGθ were 0.11%, 0.34% and 0.24%, respectively, which indicated that the QSPR models had good stability and superior predictive ability. The results showed good nonlinear correlations between the thermodynamic properties of PBDPAs and the four structural parameters. It was therefore concluded that the ANN models established with the new substituent position index are fully applicable to predicting the properties of PBDPAs.