3 articles found
Accurate machine learning models based on small dataset of energetic materials through spatial matrix featurization methods (Cited by: 6)
1
Authors: Chao Chen, Danyang Liu, Siyan Deng, Lixiang Zhong, Serene Hay Yee Chan, Shuzhou Li, Huey Hoon Hng. Journal of Energy Chemistry (SCIE, EI, CAS, CSCD), 2021, Issue 12, pp. 364-375, I0009 (13 pages)
A large database is desired for machine learning (ML) technology to make accurate predictions of materials' physicochemical properties based on their molecular structure. When a large database is not available, the development of a proper featurization method based on the physicochemical nature of the target properties can improve the predictive power of ML models with a smaller database. In this work, we show that two new featurization methods, volume occupation spatial matrix and heat contribution spatial matrix, can improve the accuracy in predicting energetic materials' crystal density (ρ_(crystal)) and solid phase enthalpy of formation (H_(f,solid)) using a database containing 451 energetic molecules. Their mean absolute errors are reduced from 0.048 g/cm³ and 24.67 kcal/mol to 0.035 g/cm³ and 9.66 kcal/mol, respectively. By leave-one-out cross-validation, the newly developed ML models can be used to determine the performance of most kinds of energetic materials except cubanes. Our ML models are applied to predict ρ_(crystal) and H_(f,solid) of CHON-based molecules in the 150-million-entry PubChem database, which yields 56 candidates with competitive detonation performance and reasonable chemical structures. With further improvement in the future, spatial matrices have the potential to become multifunctional ML simulation tools that could provide even better predictions in wider fields of materials science.
Keywords: Small database machine learning; Energetic materials screening; Spatial matrix featurization method; Crystal density; Formation enthalpy; n-Body interactions
POTENTIAL: A Highly Adaptive Core of Parallel Database System
2
Authors: Wen Jirong (文继荣), Chen Hong (陈红), Wang Shan (王珊). Journal of Computer Science & Technology (SCIE, EI, CSCD), 2000, Issue 6, pp. 527-541 (15 pages)
POTENTIAL is a virtual database machine based on general computing platforms, especially parallel computing platforms. It provides a complete solution to high-performance database systems through a 'virtual processor + virtual data bus + virtual memory' architecture. Virtual processors manage all CPU resources in the system, and the various operations run on them. The virtual data bus is responsible for managing data transmission between associated operations and forms the hinge of the entire system. Virtual memory provides efficient data storage and buffering mechanisms that conform to data reference behaviors in database systems. The architecture of POTENTIAL is very clear and has many good features, including high efficiency, high scalability, high extensibility, and high portability.
Keywords: virtual database machine; virtual data bus; virtual processor; virtual memory; parallel database
Efficient Model Store and Reuse in an OLML Database System
3
Authors: Jian-Wei Cui, Wei Lu, Xin Zhao, Xiao-Yong Du. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2021, Issue 4, pp. 792-805 (14 pages)
Deep learning has shown significant improvements on various machine learning tasks by introducing a wide spectrum of neural network models. Yet, for these neural network models, it is necessary to label a tremendous amount of training data, which is prohibitively expensive in reality. In this paper, we propose the OnLine Machine Learning (OLML) database, which stores trained models and reuses these models in a new training task to achieve a better training effect with a small amount of training data. An efficient model reuse algorithm, AdaReuse, is developed in the OLML database. Specifically, AdaReuse first estimates the reuse potential of trained models from domain relatedness and model quality, through which a group of trained models with high reuse potential for the training task can be selected efficiently. Then, the multiple selected models are trained iteratively to encourage diverse models, with which a better training effect can be achieved by ensemble. We evaluate AdaReuse on two types of natural language processing (NLP) tasks, and the results show that AdaReuse can improve the training effect significantly compared with models trained from scratch when the training data is limited. Based on AdaReuse, we implement an OLML database prototype system which accepts a training task as an SQL-like query and automatically generates a training plan by selecting and reusing trained models. Usability studies are conducted to illustrate that the OLML database can properly store the trained models and reuse them efficiently in new training tasks.
Keywords: model selection; model reuse; OnLine Machine Learning (OLML) database