A general-purpose material property data extraction pipeline from large polymer corpora using natural language processing

Abstract: The ever-increasing number of materials science articles makes it hard to infer chemistry-structure-property relations from literature. We used natural language processing methods to automatically extract material property data from the abstracts of polymer literature. As a component of our pipeline, we trained MaterialsBERT, a language model, using 2.4 million materials science abstracts, which outperforms other baseline models in three out of five named entity recognition datasets. Using this pipeline, we obtained ~300,000 material property records from ~130,000 abstracts in 60 hours. The extracted data was analyzed for a diverse range of applications such as fuel cells, supercapacitors, and polymer solar cells to recover non-trivial insights. The data extracted through our pipeline is made available at polymerscholar.org, which can be used to locate material property data recorded in abstracts. This work demonstrates the feasibility of an automatic pipeline that starts from published literature and ends with extracted material property information.
Source: npj Computational Materials (SCIE, EI, CSCD), 2023, Issue 1, pp. 1826-1837 (12 pages). Chinese journal title: 计算材料学(英文)
Funding: This work was supported by the Office of Naval Research through grants N00014-19-1-2103 and N00014-20-1-2175. Helpful discussions and feedback from Dr. Lihua Chen are acknowledged. Pranav Shetty was partially funded by a fellowship by JPMorgan Chase & Co. that helped to support this research. Any views or opinions expressed herein are solely those of the authors listed, and may differ from the views and opinions expressed by JPMorgan Chase & Co. or its affiliates.