Accelerated materials development with machine learning (ML)-assisted screening and high-throughput experimentation for new photovoltaic materials holds the key to addressing our grand energy challenges. Data-driven ML is envisaged as a decisive enabler for new perovskite materials discovery. However, its full potential can be severely curtailed by poorly represented molecular descriptors (or fingerprints). Optimal descriptors are essential for establishing effective mathematical representations of quantitative structure-property relationships. Here we reveal that our persistent functions (PFs)-based learning models offer significant accuracy advantages over traditional descriptor-based models in organic-inorganic halide perovskite (OIHP) materials design and perform comparably to deep learning models. Our multiscale simplicial complex approach not only provides a more precise representation of OIHP structures and underlying interactions, but also has better transferability to ML models. Our results demonstrate that advanced geometrical and topological invariants are highly efficient feature engineering approaches that can markedly improve the performance of learning models for molecular data analysis. Further, new structure-property relationships can be established between our invariants and bandgaps. We anticipate that our molecular representations and featurization models will transcend the limitations of conventional approaches and lead to breakthroughs in perovskite materials design and discovery.
Funding: This work was supported in part by Nanyang Technological University Startup Grant M4081842.110; Singapore Ministry of Education Academic Research Fund Tier 1 grant RG109/19 and Tier 2 grants MOE-T2EP50120-0004 and MOE-T2EP20120-0013; and the National Research Foundation (NRF), Singapore, under its NRF Investigatorship (NRF-NRFI2018-04).