Funding: Open Research Fund Program of Science and Technology on Aerospace Chemical Power Laboratory, China (No. 120201B01); National Natural Science Foundation of China (Nos. 21875061, 21975066).
Abstract: The enthalpies of formation of solid organic compounds containing carbon, nitrogen, oxygen, and hydrogen were estimated by machine learning methods using two proposed descriptor sets, each applied separately. Both descriptor sets are composed of Benson's groups and corrected groups. The main difference between them is that one is generated from atoms and the other from bonds. An in-house program was written in Java to extract all the descriptors, with a function ensuring that each atom (or bond) of a molecule is represented by Benson's groups exactly once in the atom-based (or bond-based) descriptor set. Multiple linear regression and partial least squares were each used to build models predicting the enthalpy of formation from the two descriptor sets. Combining the models built on the atom-based and bond-based descriptor sets gave the best prediction results in this paper, and the corresponding test-set results are better than those reported in the literature from which the original data were retrieved. Further, a small data set of fluorinated molecules was collected, and satisfactory results were also obtained for these fluorine-containing molecules with the assistance of the former data set.
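The modelling idea in the abstract can be illustrated with a minimal sketch. It assumes each molecule is described by a vector of group counts and that the enthalpy of formation is approximated as a linear combination of those counts, fitted by ordinary least squares. The group counts and target values below are synthetic, the function names are hypothetical, and averaging the two model predictions is only one possible way to combine them; the abstract does not state how the authors' combination was done.

```python
from typing import List


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: descriptors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(counts: List[List[float]], targets: List[float]) -> List[float]:
    """Least-squares fit of target ~ sum_j counts[i][j] * coef[j] (normal equations)."""
    p = len(counts[0])
    xtx = [[sum(row[i] * row[j] for row in counts) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(counts, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefs: List[float], row: List[float]) -> float:
    """Group-additivity prediction: sum of count times group contribution."""
    return sum(c * n for c, n in zip(coefs, row))


def combine_predictions(atom_pred: float, bond_pred: float) -> float:
    """Assumed combination rule: simple average of the two model predictions."""
    return 0.5 * (atom_pred + bond_pred)


# Synthetic data: 4 molecules, 2 atom-based groups and 2 bond-based groups each.
atom_counts = [[1, 0], [0, 1], [1, 1], [2, 1]]
bond_counts = [[1, 0], [1, 1], [2, 1], [3, 2]]
enthalpies = [-10.0, 5.0, -5.0, -15.0]  # made-up values, arbitrary units

atom_coefs = fit_mlr(atom_counts, enthalpies)
bond_coefs = fit_mlr(bond_counts, enthalpies)

# Predict for a new (hypothetical) molecule described in both descriptor sets.
combined = combine_predictions(predict(atom_coefs, [1, 2]), predict(bond_coefs, [2, 1]))
print(atom_coefs, bond_coefs, combined)
```

In practice a library routine for multiple linear regression or partial least squares would replace the hand-written solver; it is written out here only to keep the sketch dependency-free.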