Funding: the National Key Research and Development Program of China (2022YFB4101302-01), the National Natural Science Foundation of China (22178243), and the science and technology innovation project of China Shenhua Coal to Liquid and Chemical Company Limited (MZYHG-22-02).
Abstract: Direct coal liquefaction (DCL) is an important and effective method of converting coal into high-value-added chemicals and fuel oil. In DCL, heating the direct coal liquefaction solvent (DCLS) from low to high temperature and pre-hydrogenating the DCLS are critical steps, so studying the dissolution of hydrogen in DCLS under liquefaction conditions is important. However, it is difficult to determine hydrogen solubility precisely by experiment alone, especially under actual DCL conditions. To address this issue, we developed a prediction model of hydrogen solubility in a single solvent based on machine-learning quantitative structure-property relationship (ML-QSPR) methods. The model achieved a squared correlation coefficient R^(2) = 0.92 and a root mean square error RMSE = 0.095, indicating good statistical performance. External validation of the model also showed excellent accuracy and predictive ability. Molecular polarization (a) is the main factor affecting the dissolution of hydrogen in DCLS. Hydrogen solubility in acyclic alkanes increases with increasing carbon number, whereas in polycyclic aromatics it decreases with increasing ring number, and in hydrogenated aromatics it increases with hydrogenation degree. This work provides a new reference for the selection and proportioning of DCLS: a solvent with higher hydrogen solubility can be added to provide active hydrogen for the reaction and thus reduce the hydrogen pressure. It also offers insight of theoretical and practical value for DCL.
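The two statistics quoted in the abstract above can be computed as in the following minimal sketch. The function name and the toy arrays are illustrative assumptions; they are not data or code from the study, and R^2 is taken here to be the squared Pearson correlation between observed and predicted values, as the abstract's wording suggests.

```python
import math

def r_squared_and_rmse(observed, predicted):
    """Return (squared Pearson correlation, root mean square error)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Covariance and variance sums (unnormalised; the factors cancel in r^2).
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = cov ** 2 / (var_o * var_p)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return r2, rmse

# Made-up values for illustration only (not data from the study).
observed = [0.10, 0.25, 0.40, 0.55, 0.70]
predicted = [0.12, 0.22, 0.43, 0.50, 0.72]
r2, rmse = r_squared_and_rmse(observed, predicted)
print(f"R^2 = {r2:.3f}, RMSE = {rmse:.3f}")
```

RMSE carries the units of the predicted property, so it is only comparable between models that predict the same quantity on the same scale.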
Funding: Projects (20775010, 21075011) supported by the National Natural Science Foundation of China; Project (2008AA05Z405) supported by the National High Technology Research and Development Program of China; Project (09JJ3016) supported by Hunan Provincial Natural Science Foundation, China; Project (09C066) supported by Scientific Research Fund of Hunan Provincial Education Department, China; Project (2010CL01) supported by the Foundation of Hunan Provincial Key Laboratory of Materials Protection for Electric Power and Transportation, China.
Abstract: A novel quantitative structure-property relationship (QSPR) model for estimating the solution surface tension of 92 organic compounds at 20℃ was developed based on newly introduced atom-type topological indices. The data set contained non-polar and polar liquids, and saturated and unsaturated compounds. Regression analysis shows that multiple linear regression gives excellent results. The predictive power of the proposed model was assessed using leave-one-out (LOO) cross-validation (CV). The correlation coefficient (R) and the leave-one-out cross-validation correlation coefficient (Rcv) of the multiple linear regression model are 0.9914 and 0.9913, respectively. The new model gives an average absolute relative deviation of 1.81% for the 92 substances. The results demonstrate that the novel topological indices, based on the equilibrium electronegativity of atoms and the relative bond length, are useful model parameters for QSPR analysis of compounds.
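Leave-one-out cross-validation of a multiple linear regression, as described in the abstract above, can be sketched as follows. This is a minimal illustration under stated assumptions: the two descriptor columns, the response values, and all function names are hypothetical, and the regression is solved through the normal equations in plain Python rather than with whatever software the authors used.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_mlr(X, y):
    """Least-squares coefficients (intercept first) from the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def pearson_r(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / math.sqrt(sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b))

def loo_predictions(X, y):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(y)):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(coef, X[i]))
    return preds

# Hypothetical descriptors and responses, for illustration only.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5], [7, 9], [8, 7]]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05]
y = [2 + 3 * a - b + e for (a, b), e in zip(X, noise)]

full_model = fit_mlr(X, y)
r_fit = pearson_r(y, [predict(full_model, x) for x in X])
r_cv = pearson_r(y, loo_predictions(X, y))
print(f"R = {r_fit:.4f}, Rcv = {r_cv:.4f}")
```

A cross-validated coefficient close to the fitted one, as reported in the abstract, suggests that no single compound dominates the regression.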
Funding: the National Natural Science Foundation of China (grant Nos. 29903006 and 29973023) and the Visiting Scholar Foundation of Key Laboratory in University of China.
Abstract: A quantitative structure-property relationship (QSPR) study has been made for the prediction of the surface tension of nonionic surfactants in aqueous solution. The regressed model includes a topological descriptor, the Kier & Hall index of zero order (KH0) of the hydrophobic segment of the surfactant, and a quantum chemical one, the heat of formation (fHD) of the surfactant molecules. The established general QSPR between the surface tension and the descriptors gives a coefficient of multiple determination of r^2 = 0.9877 for the 30 nonionic surfactants studied.
Funding: This work was financially supported by the National Basic Research Program of China (2003CB415002), the China Postdoctoral Science Foundation (No. 2003033486) and the Natural Science Research Fund of University in Jiangsu (04KJB150149).
Abstract: Twenty-eight alkyl(1-phenylsulfonyl) cycloalkane carboxylates were computed at the B3LYP/6-31G* level. Based on linear solvation energy theory, two quantitative correlation equations were developed that relate the molecular structures of alkyl(1-phenylsulfonyl) cycloalkane carboxylate compounds to their chromatographic retention (capacity factor lgKW) and their toxicity to Photobacterium phosphoreum (-lgEC50), using the molecular structural parameters as theoretical descriptors (r^2 = 0.9501 and 0.9488). The two equations were then cross-validated by the leave-one-out (LOO) method, with q^2 of 0.9113 and 0.9281, respectively. The results show that both equations obtained in this work with B3LYP/6-31G* perform better than those from AM1, and can be used to predict the lgKW and -lgEC50 of congeneric organics.
Funding: supported by the UK Physical Sciences Research Council under Grant No. EP/X019551/1.
Abstract: The physicochemical properties of liquid alternative fuels are important but difficult to measure or predict, especially when complex surrogate fuels are concerned. In the present work, machine learning is used to develop quantitative structure-property relationship models. The fuel chemical structure is represented by molecular descriptors, which link important features of the fuel composition to key properties of fuel utilization. Feature selection is employed to select the most relevant features that describe the chemical structure of the fuel, and several machine learning algorithms are tested to construct interpretable models. The effectiveness of the methodology is demonstrated through the development of accurate and interpretable predictive models for cetane numbers, with a focus on understanding the link between molecular structure and fuel properties. In this context, matrix-based descriptors and descriptors related to the number of atoms in the molecule are directly linked with the cetane number of hydrocarbons. Furthermore, the results show that molecular connectivity indices play a role in the cetane number of aromatic molecules. The methodology is also extended to predict the cetane number of ester and ether molecules, supporting the design of alternative fuels towards fully sustainable fuel utilization.
Funding: supported by the Natural Science Foundation of Fujian Province (D0710019) and the Natural Science Foundation of Overseas Chinese Affairs Office of the State Council (06QZR09).
Abstract: Based on quantum chemical descriptors, quantitative structure-property relationship (QSPR) models have been developed to estimate and predict the photodegradation rate constant (logK) of polycyclic aromatic hydrocarbons (PAHs) using a linear method (multiple linear regression, MLR) and a non-linear method (back propagation artificial neural network, BP-ANN). A BP-ANN with 3-3-1 architecture was generated using the three quantum chemical descriptors that appear in the MLR model. The standard heat of formation (HOF), the gap of frontier molecular orbital energies (ΔELH) and the total energy (TE) were the inputs, and logK was the output. The leave-one-out (LOO) cross-validated correlation coefficients (R^2_CV) of the established MLR and BP-ANN models were 0.6383 and 0.7843, respectively. The non-linear BP-ANN model has better predictive ability than the linear MLR model, with root mean square errors (RMSE) of 0.1071 and 0.1514 and squared correlation coefficients (R^2) of 0.9791 and 0.9897 for the training and validation sets, respectively. In addition, some insights into the molecular structural features affecting the photodegradation of PAHs are discussed.
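A 3-3-1 architecture means three inputs, one hidden layer of three units, and a single output. The forward pass of such a network can be sketched as below. The abstract does not state the activation functions or the trained weights, so the sigmoid hidden layer, the linear output, the function names, and every numeric value here are assumptions made purely for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward_3_3_1(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: 3 inputs -> 3 sigmoid hidden units -> 1 linear output.

    x        : list of 3 (scaled) descriptor values, e.g. HOF, ΔELH, TE
    w_hidden : 3x3 weight matrix, one row per hidden unit
    b_hidden : list of 3 hidden biases
    w_out    : list of 3 output weights
    b_out    : scalar output bias
    """
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Hypothetical weights and inputs (not the trained model from the study).
w_hidden = [[0.2, -0.1, 0.4], [-0.3, 0.5, 0.1], [0.05, 0.05, -0.2]]
b_hidden = [0.0, 0.1, -0.1]
w_out = [0.7, -0.4, 0.3]
b_out = 0.2
log_k = forward_3_3_1([0.5, -1.0, 0.25], w_hidden, b_hidden, w_out, b_out)
print(f"predicted logK (illustrative): {log_k:.4f}")
```

Back propagation then adjusts the 16 parameters of this network (9 hidden weights, 3 hidden biases, 3 output weights, 1 output bias) to reduce the squared error between the output and the measured logK.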
Funding: the State Key Laboratory of Chemo/Biosensing and Chemometrics Foundation (No. 05-12-1).
Abstract: Based on two-dimensional topological structures, a novel molecular electronegativity interaction vector with hybridization (MEHIV) was developed to describe the atomic hybridization state in different molecular environments. Five quantitative models were established by MEHIV characterization and multiple linear regression modeling to predict the reduced ion mobility constants (K_0) of alkanes, aromatic hydrocarbons, fatty alcohols, fatty aldehydes and ketones, and carboxylic esters. The leave-one-out cross-validation correlation coefficients R_cv are 0.792, 0.787, 0.949, 0.972 and 0.981, respectively, and the standard deviations SDcv are 0.067, 0.086, 0.064, 0.043 and 0.042, respectively. These results suggest that MEHIV is an excellent topological index descriptor with advantages such as straightforward physicochemical meaning, high characterization competence, convenient expansibility and easy manipulation.
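Abstract: Based on the electronic, steric and other structural properties of the compounds, eight descriptors were calculated with the Gaussian98 and Cerius2 packages: dipole moment (Dipole), highest occupied molecular orbital energy (EHOMO), lowest unoccupied molecular orbital energy (ELUMO), total molecular energy (E), rotatable bonds (Rotlbonds), length of the weakest R-NO2 bond (R-NO2 bond length, where R is C or N), hydrogen-bond donors (Hbond donor) and midpoint potential (Vmid). The QSPR method in the Cerius2 package was used to establish a structure-property relationship between the density of aromatic explosives and the eight descriptors, with a correlation coefficient R of 0.909. The average errors between predicted and measured densities were 3.33% for the training set of 30 compounds and 2.94% for the prediction set of 15 compounds.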
Abstract: In this paper, based on the nuclear magnetic resonance peak number (t_i) and the Randić branching degree (δ_i) of carbon atom i, the environment valence g_i of the carbon atom is defined as g_i = (t_i + δ_i)/2. The g_i reflects the character of each carbon atom as well as the details of its connection with other carbon atoms, so it distinguishes the chemical environment of each carbon atom in the molecule better than δ_i does. A connectivity index of environment valence (ᵐS) and its athwart index (ᵐS′) are proposed based on the adjacency matrix and g_i. Among them, ⁰S and ⁰S′ capture the character and connectivity of each carbon atom, while ¹S and ¹S′ reflect the second-level connection between carbon atoms. Based on ⁰S′ and N (the number of carbon atoms), a new structural parameter, the symmetry degree N_ec, is defined as N_ec = [(⁰S′_S/⁰S′_C)·N]^(2/3); it reflects both the size and the symmetry of the molecule. The N_ec, ⁰S and R_n (the number of edges of the largest ring in cycloalkanes) of 474 saturated hydrocarbons (216 paraffins and 258 cycloalkanes) were calculated and correlated with their boiling points. The best regression equation obtained was ln(1056 - T_b) = 6.9480 - 0.1040·N_ec - 0.008689·⁰S - 0.009614·R_n + 0.01998·R_n^(0.5), with n = 474, R = 0.9989, F = 52627 and S = 5.63 K. The model was checked by the jackknife method. It shows good overall stability and can be used to predict the boiling points of saturated hydrocarbons.
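The regression equation in the abstract above gives ln(1056 - T_b), so the boiling point follows by inverting the logarithm: T_b = 1056 - exp(...). The sketch below evaluates it, reading the flattened source equation as having the terms 0.008689·⁰S and 0.01998·R_n^0.5. The function and argument names are hypothetical, and the descriptor values must be supplied by the caller because the abstract does not give enough detail to compute N_ec or ⁰S from a structure.

```python
import math

def boiling_point_K(n_ec, s0, r_n):
    """Evaluate T_b (in K) from the regression quoted in the abstract.

    n_ec : symmetry degree N_ec
    s0   : zero-order connectivity index of environment valence (written 0S in the source)
    r_n  : number of edges of the largest ring (assumed here to be 0 for acyclic paraffins)
    """
    ln_term = (
        6.9480
        - 0.1040 * n_ec
        - 0.008689 * s0
        - 0.009614 * r_n
        + 0.01998 * math.sqrt(r_n)
    )
    return 1056.0 - math.exp(ln_term)

# Hypothetical descriptor values, for illustration only.
print(f"T_b = {boiling_point_K(n_ec=6.0, s0=4.0, r_n=6):.1f} K")
```

Because N_ec and ⁰S enter with negative coefficients inside the exponential, larger values of either descriptor give a higher predicted boiling point, with 1056 K acting as an upper asymptote.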