Abstract
There is growing interest in applying machine learning techniques in the field of materials science. However, the interpretation of machine learning models, and of the knowledge extracted from them, remains a major concern, particularly when the goal of learning is to formulate an explicit model that provides physical insight. In the present study, we propose a framework that combines the filtering ability of feature engineering with symbolic regression to extract explicit, quantitative expressions for the band gap energy from materials data. We propose enhancements to genetic programming, namely dimensional consistency and artificial constraints, to improve the search efficiency of symbolic regression. We show how two descriptors, selected from 32 possible candidates and attributed to volumetric and electronic factors, explicitly express the band gap energy of NaCl-type compounds. Our approach provides a basis for capturing the underlying physical relationships between materials descriptors and target properties.
Funding
This work was financially supported by the National Key Research and Development Program of China (No. 2016YFB0700500) and the Guangdong Province Key Area R&D Program (No. 2019B010940001).