Data provides a foundation for machine learning, which has accelerated data-driven materials design. The scientific literature contains a large amount of high-quality, reliable data, but automatically extracting data from the literature remains a challenge. We propose a natural language processing pipeline that captures both chemical composition and property data, enabling analysis and prediction of superalloys. Within 3 h, 2531 records with both composition and property are extracted from 14,425 articles, covering γ′ solvus temperature, density, solidus temperature, and liquidus temperature. A data-driven model for γ′ solvus temperature is built to predict unexplored Co-based superalloys with high γ′ solvus temperatures within a relative error of 0.81%. We test the predictions via synthesis and characterization of three alloys. A web-based toolkit is provided as an online open-source platform and is expected to serve as the basis for a general method to search for targeted materials using data extracted from the literature.
Funding: This work is financially supported by the National Key Research and Development Program of China (2020YFB0704503, 2016YFB0700500), the Guangdong Province Key Area R&D Program (2019B010940001), the 111 Project (B170003), and USTB MatCom of Beijing Advanced Innovation Center for Materials Genome Engineering.