Funding: Financial support came from several Office of Naval Research grants that allowed the authors to explore many applications of machine learning within materials science, including N00014-14-1-0098, N00014-16-1-2580, and N00014-10-1-0944.
Abstract: Propelled partly by the Materials Genome Initiative, and partly by algorithmic developments and the resounding successes of data-driven efforts in other domains, informatics strategies are beginning to take shape within materials science. These approaches lead to surrogate machine learning models that enable rapid predictions based purely on past data rather than on direct experimentation or on computations/simulations in which fundamental equations are explicitly solved. Data-centric informatics methods are becoming useful for determining material properties that are hard to measure or compute using traditional methods (because of the cost, time, or effort involved) but for which reliable data either already exist or can be generated for at least a subset of the critical cases. Predictions are typically interpolative: a material is first fingerprinted numerically, and a mapping (established via a learning algorithm) between the fingerprint and the property of interest is then followed. Fingerprints, also referred to as "descriptors", may be of many types and scales, as dictated by the application domain and needs. Predictions may also be extrapolative, extending into new materials spaces, provided prediction uncertainties are properly taken into account. This article attempts to provide an overview of some of the successful data-driven "materials informatics" strategies undertaken in the last decade, with particular emphasis on the fingerprint or descriptor choices. The review also identifies some challenges the community is facing and those that should be overcome in the near future.
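The fingerprint-then-mapping workflow described in this abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions: the element labels `A`, `B`, `C`, the composition-fraction fingerprint, the choice of kernel ridge regression as the learning algorithm, and the synthetic target values are all hypothetical and are not taken from the article.

```python
import numpy as np

ELEMENTS = ["A", "B", "C"]  # hypothetical element labels, not real species


def fingerprint(composition):
    """Turn a composition dict (label -> fraction) into a fixed-length vector."""
    return np.array([composition.get(el, 0.0) for el in ELEMENTS])


def rbf_kernel(X1, X2, gamma):
    """Gaussian similarity between every pair of fingerprints."""
    sq_dist = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)


def fit(X, y, gamma=5.0, alpha=1e-8):
    """Kernel ridge regression: solve (K + alpha * I) w = y for the weights w."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + alpha * np.eye(len(X)), y)


def predict(X_train, weights, X_new, gamma=5.0):
    """Map new fingerprints to property estimates via the learned weights."""
    return rbf_kernel(X_new, X_train, gamma) @ weights


# Step 1: fingerprint the "known" materials (toy compositions).
train_compositions = [
    {"A": 1.0}, {"B": 1.0}, {"C": 1.0},
    {"A": 0.5, "B": 0.5}, {"A": 0.5, "C": 0.5}, {"B": 0.5, "C": 0.5},
]
X_train = np.array([fingerprint(c) for c in train_compositions])

# Synthetic "property": an arbitrary linear function of the fingerprint,
# used only so the sketch runs; it is not measured or computed data.
y_train = X_train @ np.array([1.0, 2.0, 3.0])

# Step 2: learn the fingerprint -> property mapping.
weights = fit(X_train, y_train)

# Step 3: interpolate to a composition inside the sampled space.
X_new = np.array([fingerprint({"A": 1 / 3, "B": 1 / 3, "C": 1 / 3})])
y_new = predict(X_train, weights, X_new)
print(y_new)
```

The new composition lies inside the region spanned by the training points, which matches the interpolative use the abstract describes. A prediction far outside that region would need an uncertainty estimate, which this sketch does not provide.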
Abstract: The authors became aware of a mistake in the original version of this Article. Specifically, some of the band gap values plotted and reported in Fig. 1c and Table SI-1 were incorrect. The error arose because two different types of k-point meshes were used in the DFT computations performed on CdTe, CdSe, and CdS: one that is gamma-centered and one that is not.
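The difference between the two mesh conventions can be illustrated along a single reciprocal-lattice axis. The sketch below assumes the standard Monkhorst-Pack formula u_r = (2r - q - 1) / (2q) for the non-gamma-centered case; the function names and the mesh size q = 4 are hypothetical and do not reflect the settings used in the Article.

```python
from fractions import Fraction


def monkhorst_pack_1d(q):
    """Original Monkhorst-Pack fractional coordinates: u_r = (2r - q - 1) / (2q)."""
    return [Fraction(2 * r - q - 1, 2 * q) for r in range(1, q + 1)]


def gamma_centered_1d(q):
    """Gamma-centered coordinates u_r = r / q, folded into (-1/2, 1/2]."""
    points = []
    for r in range(q):
        u = Fraction(r, q)
        if u > Fraction(1, 2):
            u -= 1  # fold back by one reciprocal lattice vector
        points.append(u)
    return sorted(points)


def contains_gamma(points):
    """True if the mesh samples the zone center (fractional coordinate 0)."""
    return Fraction(0) in points


for q in (3, 4):
    mp = monkhorst_pack_1d(q)
    gc = gamma_centered_1d(q)
    print(q, [str(u) for u in mp], contains_gamma(mp))
    print(q, [str(u) for u in gc], contains_gamma(gc))
```

With an even number of subdivisions, the Monkhorst-Pack mesh straddles the zone center without sampling it, whereas the gamma-centered mesh always includes it. Quantities evaluated from the sampled points can therefore differ between the two conventions, which is one way mixed meshes can produce inconsistent band gap values.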
Funding: We acknowledge funding from the US Department of Energy SunShot program under contract DOE DEEE005956. Use of the Center for Nanoscale Materials, an Office of Science user facility, was supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under Contract No. DE-AC02-06CH11357. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. M.Y.T. would like to acknowledge support from the U.S. Department of Energy, Office of Science, Office of Workforce Development for Teachers and Scientists (WDTS) under the Science Undergraduate Laboratory Internship (SULI) program. M.J.D. was supported by the U.S. Department of Energy, Office of Basic Energy Sciences, Division of Chemical Sciences, Geosciences, and Biosciences, under Contract No. DE-AC02-06CH11357.
Abstract: The ability to predict the likelihood of impurity incorporation in semiconductors, and the electronic energy levels those impurities introduce, is crucial for controlling a semiconductor's conductivity and thus its performance in solar cells, photodiodes, and optoelectronics. The difficulty and expense of determining impurity levels experimentally and computationally make a data-driven machine learning approach appropriate. In this work, we show that a density functional theory-generated dataset of impurities in the Cd-based chalcogenides CdTe, CdSe, and CdS can lead to accurate and generalizable predictive models of defect properties.