Propelled partly by the Materials Genome Initiative, and partly by algorithmic developments and the resounding successes of data-driven efforts in other domains, informatics strategies are beginning to take shape within materials science. These approaches lead to surrogate machine learning models that enable rapid predictions based purely on past data rather than on direct experimentation or on computations/simulations in which fundamental equations are explicitly solved. Data-centric informatics methods are becoming useful for determining material properties that are hard to measure or compute using traditional methods (due to the cost, time, or effort involved) but for which reliable data either already exist or can be generated for at least a subset of the critical cases. Predictions are typically interpolative: a material is first fingerprinted numerically, and a mapping (established via a learning algorithm) between the fingerprint and the property of interest is then followed. Fingerprints, also referred to as "descriptors", may be of many types and scales, as dictated by the application domain and needs. Predictions may also be extrapolative, extending into new materials spaces, provided prediction uncertainties are properly taken into account. This article attempts to provide an overview of some of the recent successful data-driven "materials informatics" strategies undertaken in the last decade, with particular emphasis on the fingerprint or descriptor choices. The review also identifies some challenges the community is facing and those that should be overcome in the near future.
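The fingerprint-then-map workflow described in this abstract can be illustrated with a minimal sketch. The abstract does not name a specific learning algorithm, so a kernel-weighted average (the Nadaraya-Watson estimator) is used here purely as a stand-in. The function names, the two-component fingerprints, and the toy property values are all hypothetical.

```python
import math

def gaussian_kernel(a, b, length_scale=1.0):
    """Similarity between two fingerprint vectors (1.0 when identical)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))

def predict(fingerprint, train_fps, train_props, length_scale=1.0):
    """Kernel-weighted average of known property values.

    Returns (prediction, max_similarity). A low max_similarity means the
    query lies far from every training fingerprint, so the prediction is
    extrapolative and should be treated with caution.
    """
    weights = [gaussian_kernel(fingerprint, fp, length_scale) for fp in train_fps]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all training fingerprints")
    prediction = sum(w * p for w, p in zip(weights, train_props)) / total
    return prediction, max(weights)

# Hypothetical fingerprints and a toy property (the sum of the components);
# these are placeholders, not materials data.
train_fps = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
train_props = [0.0, 1.0, 1.0, 2.0]

inside, sim_in = predict((0.5, 0.5), train_fps, train_props)    # interpolative
outside, sim_out = predict((4.0, 4.0), train_fps, train_props)  # extrapolative
```

The similarity returned alongside each prediction is only a crude proxy for prediction uncertainty. It serves to show why a query far outside the training data deserves less trust than one surrounded by known cases.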
Simulations based on solving the Kohn-Sham (KS) equation of density functional theory (DFT) have become a vital component of modern materials and chemical sciences research and development portfolios. Despite their versatility, routine DFT calculations are usually limited to a few hundred atoms due to the computational bottleneck posed by the KS equation. Here we introduce a machine-learning-based scheme to efficiently assimilate the function of the KS equation and bypass it to directly, rapidly, and accurately predict the electronic structure of a material or a molecule, given just its atomic configuration. A new rotationally invariant representation is utilized to map the atomic environment around a grid-point to the electron density and local density of states at that grid-point. This mapping is learned using a neural network trained on previously generated reference DFT results at millions of grid-points. The proposed paradigm allows for the high-fidelity emulation of KS DFT, but orders of magnitude faster than the direct solution. Moreover, the machine learning prediction scheme is strictly linear-scaling with system size.
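To make the idea of a rotationally invariant grid-point representation concrete, the sketch below builds a simple scalar fingerprint from sums of Gaussians of the atom-to-grid-point distances. This is an assumption chosen for illustration, not the representation used in the work itself, which the abstract does not specify. The Gaussian widths, the atomic coordinates, and the function names are hypothetical.

```python
import math

def grid_point_fingerprint(grid_point, atoms, widths=(0.5, 1.0, 2.0)):
    """For each Gaussian width w, sum exp(-r^2 / (2 w^2)) over all atoms.

    The result depends only on atom-to-grid-point distances, so it is
    unchanged when the whole system is rotated.
    """
    fingerprint = []
    for w in widths:
        total = 0.0
        for atom in atoms:
            r2 = sum((a - g) ** 2 for a, g in zip(atom, grid_point))
            total += math.exp(-r2 / (2.0 * w * w))
        fingerprint.append(total)
    return fingerprint

def rotate_z(point, angle):
    """Rotate a 3D point about the z axis by the given angle (radians)."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

# Hypothetical atomic positions and grid point (arbitrary length units).
atoms = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 1.3, 0.4)]
grid_point = (0.4, 0.2, 0.1)
angle = 0.7

fp_original = grid_point_fingerprint(grid_point, atoms)
fp_rotated = grid_point_fingerprint(
    rotate_z(grid_point, angle), [rotate_z(a, angle) for a in atoms]
)
```

Rotating the atoms and the grid point together leaves the fingerprint unchanged up to floating-point rounding, which is the property a learned map from environment to electron density relies on.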
Emerging machine learning (ML)-based approaches provide powerful and novel tools to study a variety of physical and chemical problems. In this contribution, we outline a universal strategy to create ML-based atomistic force fields, which can be used to perform high-fidelity molecular dynamics simulations. This scheme involves (1) preparing a large reference dataset of atomic environments and forces with sufficiently low noise, e.g., using density functional theory or higher-level methods, (2) utilizing a generalizable class of structural fingerprints for representing atomic environments, (3) optimally selecting diverse and nonredundant training datasets from the reference data, and (4) proposing various learning approaches to predict atomic forces directly (and rapidly) from atomic configurations. From the atomistic forces, accurate potential energies can then be obtained by appropriate integration along a reaction coordinate or along a molecular dynamics trajectory. Based on this strategy, we have created model ML force fields for six elemental bulk solids (Al, Cu, Ti, W, Si, and C) and show that all of them can reach chemical accuracy. The proposed procedure is general and universal, in that it can potentially be used to generate ML force fields for any material using the same unified workflow with little human intervention. Moreover, the force fields can be systematically improved by adding new training data progressively to represent atomic environments not encountered previously.
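The statement that potential energies can be recovered from forces by integration along a coordinate can be sketched as follows. The example assumes a one-dimensional coordinate and uses the trapezoidal rule on E(x) = E(x0) - ∫ F dx. The harmonic force, the spring constant, and the function name are hypothetical and are chosen only because the exact answer is known.

```python
def energy_from_forces(positions, forces):
    """Integrate E(x) = E(x0) - integral of F dx with the trapezoidal rule.

    The energy at the first position is taken as the zero of energy.
    """
    energies = [0.0]
    for i in range(1, len(positions)):
        dx = positions[i] - positions[i - 1]
        mean_force = 0.5 * (forces[i] + forces[i - 1])
        energies.append(energies[-1] - mean_force * dx)
    return energies

# Hypothetical test case: a harmonic force F = -k x, for which the exact
# energy is E = 0.5 * k * x^2.
k = 2.0
positions = [0.1 * i for i in range(11)]   # 0.0, 0.1, ..., 1.0
forces = [-k * x for x in positions]
energies = energy_from_forces(positions, forces)
```

Because the force is linear in the coordinate here, the trapezoidal rule reproduces the analytic energy up to rounding. For forces predicted by an ML model along a real trajectory, the accuracy would depend on the step size and on the force errors.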
The dielectric constant (ϵ) is a critical parameter utilized in the design of polymeric dielectrics for energy storage capacitors, microelectronic devices, and high-voltage insulation. However, agile discovery of polymer dielectrics with desirable ϵ remains a challenge, especially for high-energy, high-temperature applications. To aid accelerated polymer dielectrics discovery, we have developed a machine-learning (ML)-based model to instantly and accurately predict the frequency-dependent ϵ of polymers, with the frequency range spanning 15 orders of magnitude. Our model is trained using a dataset of 1210 experimentally measured ϵ values at different frequencies, an advanced polymer fingerprinting scheme, and the Gaussian process regression algorithm.
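A minimal sketch of Gaussian process regression, the algorithm named in this abstract, is given below. It assumes a single input (log10 of frequency) in place of the polymer fingerprint, a squared-exponential kernel, and fixed hyperparameters. All of these are illustrative assumptions, and the training values are synthetic placeholders rather than measurements.

```python
import math

def rbf(x1, x2, length_scale, signal_var):
    """Squared-exponential covariance between two scalar inputs."""
    return signal_var * math.exp(-((x1 - x2) ** 2) / (2.0 * length_scale ** 2))

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def gpr_predict(x_train, y_train, x_query,
                length_scale=2.0, signal_var=1.0, noise_var=1e-6):
    """Posterior mean and variance of a GP with a constant prior mean."""
    n = len(x_train)
    y_mean = sum(y_train) / n
    residuals = [y - y_mean for y in y_train]
    K = [[rbf(x_train[i], x_train[j], length_scale, signal_var)
          + (noise_var if i == j else 0.0) for j in range(n)] for i in range(n)]
    k_star = [rbf(x, x_query, length_scale, signal_var) for x in x_train]
    alpha = solve(K, residuals)                 # (K + noise I)^-1 (y - mean)
    mean = y_mean + sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)                        # (K + noise I)^-1 k_star
    variance = signal_var - sum(k * vi for k, vi in zip(k_star, v))
    return mean, max(variance, 0.0)

# Synthetic placeholder values (NOT measurements): dielectric constant
# versus log10(frequency / Hz).
log_freq = [2.0, 4.0, 6.0, 8.0, 10.0]
epsilon = [3.6, 3.4, 3.3, 3.1, 3.0]

mean_near, var_near = gpr_predict(log_freq, epsilon, 6.0)   # at a training point
mean_far, var_far = gpr_predict(log_freq, epsilon, 14.0)    # outside the data
```

The predictive variance grows for queries far from the training frequencies, which is the kind of uncertainty estimate that makes Gaussian process regression attractive when predictions extend beyond the measured range.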
Funding (materials informatics review): Financial support from several grants from the Office of Naval Research, including N00014-14-1-0098, N00014-16-1-2580, and N00014-10-1-0944, which allowed the authors to explore many applications of machine learning within materials science.
Funding (Kohn-Sham DFT emulation): The authors would like to thank XSEDE for the utilization of the Stampede2 cluster via project ID "DMR080058N". This work is supported by the Office of Naval Research through N0014-17-1-2656, a Multi-University Research Initiative (MURI) grant.
Funding (ML force fields): Supported financially by the Office of Naval Research (Grant No. N00014-14-1-0098) and by the National Science Foundation (Grant No. 1600218).
Funding (polymer dielectric constant model): This work is supported by the Office of Naval Research through N0014-17-1-2656, a Multi-University Research Initiative (MURI) grant.