Based on two-dimensional topological structures, a novel molecular electronegativity interaction vector with hybridization (MEHIV) was developed to describe the atomic hybridization state in different molecular environments. Using MEHIV characterization and multiple linear regression, five quantitative models were established to predict the reduced ion mobility constants (K0) of alkanes, aromatic hydrocarbons, fatty alcohols, fatty aldehydes and ketones, and carboxylic esters. The leave-one-out cross-validation correlation coefficients (Rcv) are 0.792, 0.787, 0.949, 0.972 and 0.981, respectively, and the corresponding standard deviations (SDcv) are 0.067, 0.086, 0.064, 0.043 and 0.042. These results suggest that MEHIV is an excellent topological index descriptor, with advantages such as straightforward physicochemical meaning, high characterization competence, convenient expansibility and easy manipulation.
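The leave-one-out statistics reported for the MEHIV models can be illustrated with a minimal sketch. It assumes ordinary least squares with an intercept and takes the cross-validated standard deviation to be the root-mean-square of the leave-one-out residuals; the abstract does not give the exact formula, so that definition is an assumption. The function names and the toy descriptor matrix are hypothetical and are not data from the paper.

```python
from math import sqrt

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: descriptors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_mlr(X, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    A = [[1.0] + list(row) for row in X]
    p = len(A[0])
    ata = [[sum(r[i] * r[j] for r in A) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * yi for r, yi in zip(A, y)) for i in range(p)]
    return solve(ata, aty)

def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def loo_cv(X, y):
    """Return (R_cv, SD_cv): each sample is predicted by a model fitted without it."""
    n = len(y)
    preds = []
    for i in range(n):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(coef, X[i]))
    mean_y, mean_p = sum(y) / n, sum(preds) / n
    cov = sum((a - mean_y) * (b - mean_p) for a, b in zip(y, preds))
    var_y = sum((a - mean_y) ** 2 for a in y)
    var_p = sum((b - mean_p) ** 2 for b in preds)
    r_cv = cov / sqrt(var_y * var_p)
    # Assumed definition: RMS of the leave-one-out residuals.
    sd_cv = sqrt(sum((a - b) ** 2 for a, b in zip(y, preds)) / n)
    return r_cv, sd_cv

# Toy data only: two made-up descriptors and an exactly linear response.
X_demo = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 3]]
y_demo = [2 + 3 * x1 - x2 for x1, x2 in X_demo]
r_cv, sd_cv = loo_cv(X_demo, y_demo)
print(r_cv, sd_cv)
```

Because every prediction comes from a model that never saw the sample, these statistics are usually less optimistic than the fitted correlation coefficient.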
Based on the definitions of 36 atomic fragment types in organic compounds, a multi-order atom-pair frequency matrix was constructed from the pairs of atomic fragments occurring at different bond distances. This forms a new molecular coding technique, the characteristic atom-pair hologram code (CAHC), proposed in this paper. A large-scale ion mobility spectrometry collision cross section database comprising 819 samples was compiled from the literature, and quantitative structure-spectrometry relationship (QSSR) studies were performed with the CAHC. Tests of model stability and generalization ability by both internal and external validation confirmed that the CAHC has a clear linear relationship with peptide collision cross sections, although partially nonlinear factors are involved for a few polypeptides. The model is expected to assist quantitative computer-aided prediction of peptide collision cross sections.
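The idea of counting fragment pairs at different bond distances can be sketched as follows. This is only an illustration of the general atom-pair counting scheme, not the CAHC itself: the 36 fragment types are not listed in the abstract, so the labels used here (`CH3`, `CH2`, `OH`) and the function names are hypothetical. Bond distance is taken to be the shortest path length in the molecular graph.

```python
from collections import Counter, deque

def bond_distances(adjacency, start):
    """Shortest bond-count distance from `start` to every reachable atom (BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def atom_pair_counts(types, bonds, max_distance):
    """Count unordered fragment-type pairs at each bond distance up to max_distance.

    types: fragment label of each atom, indexed by atom number
    bonds: list of (i, j) index pairs for bonded atoms
    Returns a Counter keyed by (type_a, type_b, distance) with type_a <= type_b.
    """
    adjacency = {i: [] for i in range(len(types))}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    counts = Counter()
    for i in range(len(types)):
        for j, d in bond_distances(adjacency, i).items():
            if i < j and d <= max_distance:  # count each pair once
                first, second = sorted((types[i], types[j]))
                counts[(first, second, d)] += 1
    return counts

# Heavy-atom skeleton of ethanol with hypothetical fragment labels.
ethanol_types = ["CH3", "CH2", "OH"]
ethanol_bonds = [(0, 1), (1, 2)]
ethanol_counts = atom_pair_counts(ethanol_types, ethanol_bonds, max_distance=2)
print(sorted(ethanol_counts.items()))
```

Arranging such counts by pair type and distance gives one row per molecule, which is the kind of fixed-length descriptor a regression model can use.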
In this work, two chemometric methods are applied to the modeling and prediction of the electrophoretic mobilities of some organic and inorganic compounds. The successive projections algorithm (SPA), a feature selection strategy, is used for descriptor selection and model development. Support vector machine (SVM) and multiple linear regression (MLR) models are then used to construct the nonlinear and linear quantitative structure-property relationship (QSPR) models, respectively. A comparison of the results shows that the SVM model has much better predictive value than the MLR model: the root-mean-square errors for the training set and the test set were 0.1911 and 0.2569 for the SVM model, versus 0.4908 and 0.6494 for the MLR model. The results show that the SVM model markedly improves predictive ability in QSPR studies and is superior to the MLR model.
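For reference, the root-mean-square error used to compare the two models is the square root of the mean squared difference between observed and predicted values, computed separately for the training and test sets. The sketch below shows that calculation only; the numbers are illustrative placeholders, not mobilities from the paper, and the function name is hypothetical.

```python
from math import sqrt

def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))

# Illustrative values only.
observed_demo = [1.0, 2.0, 3.0]
predicted_demo = [1.0, 2.0, 5.0]
print(rmse(observed_demo, predicted_demo))
```

A model whose test-set error is much larger than its training-set error is likely overfitting, which is why both values are reported.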
Funding: the State Key Laboratory of Chemo/Biosensing and Chemometrics Foundation (No. 05-12-1)