The Materials Genome Initiative calls for integrating materials calculations, machine learning, and experiments to accelerate the materials development process. In recent years, data-based methods have been applied to the thermoelectric field, mostly to transport properties. In this work, we combined data-driven machine learning and automated first-principles calculations into an active learning loop in order to predict the p-type power factors (PFs) of diamond-like pnictides and chalcogenides. Our active learning loop contains two procedures: (1) based on a high-throughput theoretical database, machine learning methods are employed to select potential candidates, and (2) computational verification of the transport properties is applied to these candidates. The verification data are then added to the database to improve the extrapolation ability of the machine learning models. Different strategies for selecting candidates were tested; the Gradient Boosting Regression model with the Query by Committee strategy has the highest extrapolation accuracy (Pearson R = 0.95 on untrained systems). Based on the predictions of the machine learning models, binary pnictides and vacancy- and small-atom-containing chalcogenides are predicted to have large PFs. Bonding analysis reveals that the alterations of the anionic bonding networks due to small atoms are beneficial to the PFs of these compounds.
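The two-step loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn's GradientBoostingRegressor, a committee built from bootstrap resamples, and the textbook Query by Committee criterion (largest spread among committee predictions). The function `compute_power_factor` is a hypothetical stand-in for the first-principles transport calculation, and the descriptors are synthetic.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor


def compute_power_factor(X):
    """Hypothetical stand-in for the first-principles verification step."""
    return np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2


def train_committee(X, y, n_members=5, seed=0):
    """Fit one GBR model per bootstrap resample of the labeled data (assumed scheme)."""
    rng = np.random.default_rng(seed)
    committee = []
    for m in range(n_members):
        idx = rng.integers(0, len(X), size=len(X))  # sample with replacement
        model = GradientBoostingRegressor(n_estimators=50, random_state=m)
        committee.append(model.fit(X[idx], y[idx]))
    return committee


def query_by_committee(committee, X_pool, batch_size):
    """Return the pool indices where the committee predictions disagree most."""
    preds = np.stack([m.predict(X_pool) for m in committee])  # (members, pool)
    disagreement = preds.std(axis=0)
    return np.argsort(disagreement)[-batch_size:]


def active_learning_loop(X_lab, y_lab, X_pool, n_rounds=3, batch_size=5):
    for r in range(n_rounds):
        # Step 1: machine learning selects candidates from the unlabeled pool.
        committee = train_committee(X_lab, y_lab, seed=r)
        picked = query_by_committee(committee, X_pool, batch_size)
        # Step 2: computational verification of the selected candidates.
        y_new = compute_power_factor(X_pool[picked])
        # Feed the verified data back into the labeled database.
        X_lab = np.vstack([X_lab, X_pool[picked]])
        y_lab = np.concatenate([y_lab, y_new])
        X_pool = np.delete(X_pool, picked, axis=0)
    return X_lab, y_lab, X_pool


# Synthetic descriptors: 20 labeled systems and 100 unlabeled candidates.
rng = np.random.default_rng(42)
X_all = rng.uniform(-2.0, 2.0, size=(120, 2))
X_lab, X_pool = X_all[:20], X_all[20:]
y_lab = compute_power_factor(X_lab)

X_lab, y_lab, X_pool = active_learning_loop(X_lab, y_lab, X_pool)
print(X_lab.shape, y_lab.shape, X_pool.shape)  # (35, 2) (35,) (85, 2)
```

Each round moves `batch_size` candidates from the pool into the labeled set, so the database grows where the models are least certain. A real workflow would likely combine this disagreement measure with the predicted PF when ranking candidates.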
Funding: This work was supported by the National Key Research and Development Program of China (Nos. 2018YFB0703600 and 2017YFB0701600), the Natural Science Foundation of China (Grant Nos. 11674211, 51632005, and 51761135127), and the 111 Project D16002. W.Z. also acknowledges support from the Guangdong Innovation Research Team Project (No. 2017ZT07C062), the Guangdong Provincial Key-Lab program (No. 2019B030301001), the Shenzhen Municipal Key-Lab program (ZDSYS20190902092905285), and the Shenzhen Pengcheng-Scholarship Program. Part of the calculations were supported by the Center for Computational Science and Engineering at Southern University of Science and Technology.