Funding: This research was supported by the Nano·Material Technology Development Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (NRF-2016M3A7B4025408 and NRF-2017M3A7B4049366).
Abstract: Recent advances in computing power have enabled the generation of large materials datasets, opening the way to data-driven approaches to problem-solving in materials science, including materials discovery. Machine learning is a primary tool for working with such large datasets, predicting unknown material properties, and uncovering relationships between structure and property. Among state-of-the-art machine learning algorithms, gradient-boosted regression trees (GBRT) are known to provide highly accurate predictions as well as interpretable analysis based on feature importance. Here, in a search for lead-free perovskites for use in solar cells, we applied the GBRT algorithm to a dataset of electronic structures for candidate halide double perovskites to predict heat of formation and bandgap. Statistical analysis of the selected features identifies design guidelines for the discovery of new lead-free perovskites.
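To make the GBRT idea concrete, the following is a minimal sketch of gradient boosting with one-split regression trees (stumps) under squared-error loss, with a gain-based feature importance. It is not the authors' code: the three features, the synthetic target, and the names `fit_stump`, `fit_gbrt`, and `predict` are assumptions chosen for illustration, and a real study would typically use a library implementation with deeper trees.

```python
import random

def fit_stump(X, residuals):
    """Return the best single split (sse, feature, threshold, left_mean, right_mean)."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for t in values[:-1]:  # split rule: x[j] <= t goes left, so both sides are non-empty
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best

def fit_gbrt(X, y, n_rounds=30, learning_rate=0.1):
    """Boost stumps on the residuals; accumulate squared-error gain per feature."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps, gain = [], [0.0] * len(X[0])
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        mean_r = sum(residuals) / len(residuals)
        sse_parent = sum((r - mean_r) ** 2 for r in residuals)
        sse, j, t, lm, rm = fit_stump(X, residuals)
        gain[j] += sse_parent - sse  # error reduction credited to feature j
        stumps.append((j, t, lm, rm))
        pred = [p + learning_rate * (lm if row[j] <= t else rm)
                for row, p in zip(X, pred)]
    total = sum(gain) or 1.0
    return {"base": base, "stumps": stumps, "learning_rate": learning_rate,
            "importance": [g / total for g in gain]}

def predict(model, row):
    """Sum the base value and the shrunken contribution of every stump."""
    out = model["base"]
    for j, t, lm, rm in model["stumps"]:
        out += model["learning_rate"] * (lm if row[j] <= t else rm)
    return out

# Synthetic data (an assumption): feature 0 matters most, feature 2 is pure noise.
random.seed(0)
X = [[random.random() for _ in range(3)] for _ in range(80)]
y = [3.0 * x0 + 1.0 * x1 + random.gauss(0.0, 0.1) for x0, x1, _ in X]

model = fit_gbrt(X, y)
print([round(v, 3) for v in model["importance"]])
```

In this toy setup the first feature should receive most of the importance, which mirrors how feature importances from a fitted GBRT model can be read as hints about which descriptors drive a predicted property.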
Funding: This study was supported by a project from the Korea Research Institute of Chemical Technology (KRICT) [grant number: SI2151-10].
Abstract: Thermoelectric materials have received much attention for use in energy harvesting devices and power generators. However, discovering novel high-performance thermoelectric materials is challenging because of the structural diversity and complexity of thermoelectric materials containing alloys and dopants. For the efficient data-driven discovery of novel thermoelectric materials, we constructed a public dataset that contains experimentally synthesized thermoelectric materials and their experimental thermoelectric properties. On the collected dataset, we were able to construct prediction models that achieved R^2 scores greater than 0.9 in regression problems that predict the experimentally measured thermoelectric properties from the chemical compositions of the materials. Furthermore, we devised a material descriptor for the chemical compositions of the materials to improve the extrapolation capabilities of machine learning methods. Based on transfer learning with the proposed material descriptor, we significantly improved the R^2 score from 0.13 to 0.71 in predicting experimental ZTs of materials from completely unexplored material groups.
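As a reminder of what the reported R^2 scores measure, and of how an extrapolation test on "unexplored material groups" might be organized, here is a minimal sketch. The group labels and the numeric values are toy assumptions, not entries from the dataset, and `leave_one_group_out` is a hypothetical helper rather than the authors' evaluation protocol.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each group.

    Holding out a whole material group forces the model to extrapolate:
    no member of the test group is seen during training.
    """
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test

# Toy values (assumptions for illustration only).
y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r2_score(y_true, y_true)                 # every prediction exact
baseline = r2_score(y_true, [2.5, 2.5, 2.5, 2.5])  # always predict the mean

groups = ["A", "A", "B", "B", "C", "C"]  # hypothetical material-group labels
splits = list(leave_one_group_out(groups))
for held_out, train, test in splits:
    print(held_out, train, test)
```

A perfect model scores 1 and a model that always predicts the mean scores 0, so a rise from 0.13 to 0.71 on held-out groups means going from barely better than the mean to explaining most of the variance.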
Funding: This study was supported by a project from the Korea Research Institute of Chemical Technology (KRICT) [grant number: SI2151-10].
Abstract: Dopants play an important role in materials synthesis, where they are used to improve target materials properties or to stabilize the materials. In particular, dopants are essential for improving the thermoelectric performance of materials. However, existing machine learning methods cannot accurately predict the properties of doped materials because of the severely nonlinear relationship between doping and materials properties. Here, we propose a unified neural network architecture, called DopNet, to accurately predict the materials properties of doped materials. DopNet identifies the effects of the dopants by explicitly and independently embedding the host materials and the dopants. In our evaluations, DopNet outperformed existing machine learning methods in predicting experimentally measured thermoelectric properties, achieving a mean absolute error of 0.06 in predicting the figure of merit (ZT). In particular, DopNet was significantly effective in an extrapolation problem that predicts the ZTs of unknown materials, which is a key task in discovering novel thermoelectric materials.
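The idea of representing host and dopants separately can be illustrated with a small sketch. This is not the DopNet architecture: the abstract does not say how dopants are identified or embedded, so the amount threshold of 0.1, the three-element vocabulary, and the helper names below are all assumptions, and plain amount vectors stand in for learned embeddings.

```python
DOPANT_THRESHOLD = 0.1  # hypothetical cutoff: elements at or below this amount count as dopants

def split_host_and_dopants(composition, threshold=DOPANT_THRESHOLD):
    """Split a composition {element: amount} into host and dopant parts."""
    host = {el: amt for el, amt in composition.items() if amt > threshold}
    dopants = {el: amt for el, amt in composition.items() if amt <= threshold}
    return host, dopants

def encode(part, vocabulary):
    """Encode one part as a vector of element amounts over a fixed vocabulary."""
    return [float(part.get(el, 0.0)) for el in vocabulary]

def featurize(composition, vocabulary):
    """Concatenate independently encoded host and dopant vectors.

    A downstream model then sees the small dopant amounts in their own
    input slots instead of having them swamped by the host amounts.
    """
    host, dopants = split_host_and_dopants(composition)
    return encode(host, vocabulary) + encode(dopants, vocabulary)

# Toy example (an assumption): a lightly Na-doped PbTe-like composition.
vocabulary = ["Na", "Pb", "Te"]
composition = {"Pb": 1.0, "Te": 1.0, "Na": 0.02}

host, dopants = split_host_and_dopants(composition)
features = featurize(composition, vocabulary)
print(host, dopants, features)
```

Keeping the two parts in separate slots is the design point: a single composition vector would place 0.02 next to 1.0 on the same scale, which makes the dopant's effect hard for a model to pick up.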