Annotation and Joint Extraction of Scientific Entities and Relationships in NSFC Project Texts

Abstract: Chinese academic text currently lacks both an agreed classification scheme and a high-quality standard corpus for joint entity and relation extraction. To address this gap, this paper builds a management-science dataset suitable for joint entity and relation extraction and establishes a deep learning model to extract entity and relation information from scientific texts. After defining entity and relation categories, we build a Chinese scientific text corpus from the abstracts of projects funded by the National Natural Science Foundation of China (NSFC) in 2018–2019. By combining word2vec features with clue-word features, a stylistic characteristic of scientific documents, we establish a joint entity and relation extraction model based on BiLSTM-CNN-CRF for scientific information extraction. The constructed dataset contains 13,060 unique entities and 9,728 entity relation labels. For entity prediction, the model reaches a precision of 69.15%, a recall of 61.03%, and an F1 score of 64.83%. For relation prediction, precision is higher than for entity prediction, which reflects the effectiveness of the mixed input features and of integrating local features through the CNN layer.
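As a quick consistency check on the entity-level scores in the abstract, the sketch below recomputes F1 as the harmonic mean of the two other reported figures. It assumes the first of the three reported entity scores is precision; the function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Entity-level figures reported in the abstract, in percent.
entity_precision = 69.15
entity_recall = 61.03

entity_f1 = f1_score(entity_precision, entity_recall)
print(f"Entity F1: {entity_f1:.2f}%")
```

The recomputed value agrees with the reported 64.83% to within rounding, which supports reading the first figure as precision rather than accuracy.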

Authors: Zhiyuan GE, Xiaoxi QI, Fei WANG, Tingli LIU, Jun GUAN, Xiaohong HUANG, Yong SHAO, Yingmin WU

Affiliations: School of Economics and Management; Office of Academic Affairs; Faculty of Information Technology; School of Management

Source: Journal of Systems Science and Information (系统科学与信息学报（英文）), CSCD, 2023, Issue 4, pp. 466-487 (22 pages)

Funding: Supported by the National Natural Science Foundation of China (71804017), the R&D Program of Beijing Municipal Education Commission (KZ202210005013), and the Sichuan Social Science Planning Project (SC22B151).

Keywords: joint extraction of entities and relations; deep learning; Chinese scientific information extraction

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]
