3 articles found
Accurate machine learning models based on small dataset of energetic materials through spatial matrix featurization methods (cited by: 6)
1
Authors: Chao Chen, Danyang Liu, Siyan Deng, Lixiang Zhong, Serene Hay Yee Chan, Shuzhou Li, Huey Hoon Hng. Journal of Energy Chemistry (SCIE, EI, CAS, CSCD), 2021, Issue 12, pp. 364-375, I0009 (13 pages)
A large database is desirable for machine learning (ML) to make accurate predictions of materials' physicochemical properties from their molecular structure. When a large database is not available, developing a featurization method grounded in the physicochemical nature of the target properties can improve the predictive power of ML models trained on a smaller database. In this work, we show that two new featurization methods, the volume occupation spatial matrix and the heat contribution spatial matrix, improve the accuracy of predicting energetic materials' crystal density (ρ_crystal) and solid-phase enthalpy of formation (H_f,solid) using a database of 451 energetic molecules. Their mean absolute errors are reduced from 0.048 g/cm³ and 24.67 kcal/mol to 0.035 g/cm³ and 9.66 kcal/mol, respectively. Leave-one-out cross-validation shows that the newly developed ML models can be used to determine the performance of most kinds of energetic materials except cubanes. Our ML models are applied to predict ρ_crystal and H_f,solid for the CHON-based molecules in the 150-million-entry PubChem database, and the screening identifies 56 candidates with competitive detonation performance and reasonable chemical structures. With further improvement, spatial matrices have the potential to become multifunctional ML simulation tools that could provide even better predictions in wider fields of materials science.
Keywords: Small database machine learning; Energetic materials screening; Spatial matrix featurization method; Crystal density; Formation enthalpy; n-Body interactions
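The abstract above reports mean absolute errors and uses leave-one-out cross-validation. As a minimal sketch of that evaluation scheme only, the following uses a one-feature least-squares line in place of the paper's models. The helper names (`fit_line`, `loocv_mae`) and the toy numbers are assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def loocv_mae(xs, ys):
    """Leave-one-out CV: hold out each sample once, fit on the rest,
    and average the absolute prediction errors (MAE)."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        errors.append(abs(a * xs[i] + b - ys[i]))
    return mean(errors)

# Toy, made-up data (hypothetical descriptor vs. target property).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
print(loocv_mae(xs, ys))
```

With a small database, leave-one-out makes every sample serve as a test point exactly once, which is why it suits datasets of a few hundred molecules.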
Self-Organizing Genetic Algorithm Based Method for Constructing Bayesian Networks from Databases
2
Authors: 郑建军, 刘玉树, 陈立潮. Journal of Beijing Institute of Technology (EI, CAS), 2003, Issue 1, pp. 23-27 (5 pages)
The typical characteristic of the topology of Bayesian networks (BNs) is the interdependence among different nodes (variables), which makes it impossible to optimize one variable independently of the others, so learning BN structures with general genetic algorithms is liable to converge to a local extremum. To resolve this problem efficiently, a self-organizing genetic algorithm (SGA) based method for constructing BNs from databases is presented. The method uses a self-organizing mechanism to develop a genetic algorithm that extends the crossover operator from one to two, provides mutual competition between them, and even adjusts the number of parents in recombination (crossover/recomposition) schemes. With the K2 algorithm, the method also optimizes the genetic operators and makes adequate use of domain knowledge. As a result, the method is able to find a global optimum of the BN topology, avoiding premature convergence to a local extremum. Experimental results are presented, and the convergence of the SGA is discussed.
Keywords: Bayesian networks; structure learning from databases; self-organizing genetic algorithm
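The idea of two crossover operators competing inside one genetic algorithm can be sketched as follows. This is not the authors' SGA: the choice of operators (one-point and uniform), the credit rule, and the toy bit-counting fitness are all assumptions made for illustration, and neither the K2 scoring nor the adjustment of the number of parents is modelled.

```python
import random

def one_point(a, b, rng):
    """One-point crossover: head of a, tail of b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def uniform(a, b, rng):
    """Uniform crossover: each gene taken from either parent."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def evolve(fitness, n_bits=20, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    operators = [one_point, uniform]
    credit = [1.0, 1.0]  # hypothetical credit per operator; drives the competition
    history = [max(map(fitness, pop))]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = [pop[0][:]]  # elitism: carry the current best over unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Operators with more credit are chosen more often.
            k = rng.choices(range(len(operators)), weights=credit)[0]
            child = operators[k](a, b, rng)
            if rng.random() < 0.1:  # occasional single-bit mutation
                j = rng.randrange(n_bits)
                child[j] = 1 - child[j]
            # Reward the operator when the child beats both parents.
            if fitness(child) > max(fitness(a), fitness(b)):
                credit[k] += 1.0
            children.append(child)
        pop = children
        history.append(max(map(fitness, pop)))
    return max(pop, key=fitness), history, credit

# Toy fitness: number of 1-bits (a stand-in for a network score).
best, history, credit = evolve(sum)
print(history[0], history[-1], credit)
```

Because the best individual is copied into each new generation, the best fitness never decreases. Which operator earns more credit depends on the problem and the random seed.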
Efficient Model Store and Reuse in an OLML Database System
3
Authors: Jian-Wei Cui, Wei Lu, Xin Zhao, Xiao-Yong Du. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2021, Issue 4, pp. 792-805 (14 pages)
Deep learning has shown significant improvements on various machine learning tasks by introducing a wide spectrum of neural network models. Yet these neural network models require labeling a tremendous amount of training data, which is prohibitively expensive in practice. In this paper, we propose the OnLine Machine Learning (OLML) database, which stores trained models and reuses them in a new training task to achieve a better training effect with a small amount of training data. An efficient model reuse algorithm, AdaReuse, is developed in the OLML database. Specifically, AdaReuse first estimates the reuse potential of trained models from domain relatedness and model quality, through which a group of trained models with high reuse potential for the training task can be selected efficiently. Then, the multiple selected models are trained iteratively to encourage diversity among them, so that a better training effect can be achieved by ensembling. We evaluate AdaReuse on two types of natural language processing (NLP) tasks, and the results show that AdaReuse significantly improves the training effect compared with models trained from scratch when the training data is limited. Based on AdaReuse, we implement an OLML database prototype system which accepts a training task as an SQL-like query and automatically generates a training plan by selecting and reusing trained models. Usability studies are conducted to illustrate that the OLML database can properly store the trained models and reuse them efficiently in new training tasks.
Keywords: model selection; model reuse; OnLine Machine Learning (OLML) database
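The selection step described in the abstract (score stored models by domain relatedness and model quality, keep the most promising group, then ensemble) can be illustrated with a small sketch. This is not AdaReuse itself: the `StoredModel` type, the product used as the reuse-potential score, and the averaging ensemble are assumptions, since the abstract does not give the formulas, and the iterative diversity-encouraging training is not modelled.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class StoredModel:
    """Hypothetical record for a trained model kept in the model store."""
    name: str
    relatedness: float  # assumed in [0, 1]: closeness to the new task's domain
    quality: float      # assumed in [0, 1]: e.g. validation accuracy
    predict: Callable[[float], float]

def reuse_potential(m: StoredModel) -> float:
    # Assumption: a plain product of the two signals named in the abstract.
    return m.relatedness * m.quality

def select_models(store: List[StoredModel], k: int) -> List[StoredModel]:
    """Keep the k stored models with the highest reuse potential."""
    return sorted(store, key=reuse_potential, reverse=True)[:k]

def ensemble_predict(models: List[StoredModel], x: float) -> float:
    """Average the selected models' predictions."""
    return sum(m.predict(x) for m in models) / len(models)

# Made-up store: B has high quality but low relatedness, so it is skipped.
store = [
    StoredModel("A", 0.9, 0.80, lambda x: x + 1),
    StoredModel("B", 0.2, 0.95, lambda x: 0.0),
    StoredModel("C", 0.7, 0.90, lambda x: x - 1),
]
chosen = select_models(store, k=2)
print([m.name for m in chosen], ensemble_predict(chosen, 2.0))
```

Scoring on both signals means a high-quality model from an unrelated domain does not crowd out a slightly weaker model that matches the new task.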