Abstract
Joint extraction of entities and relations is an important task in information extraction. Traditional joint extraction methods model the relation between entities as a discrete type and therefore handle overlapping triples poorly. To address this difficulty, this paper proposes BERT-FGM, a joint entity and relation extraction model that combines FGM with pointer annotation. The model treats the relation between entities as a function, and improves robustness by incorporating FGM into the process by which BERT trains word vectors. It first extracts head entities (subjects) with a pointer annotation strategy, then fuses each head entity with the sentence vector to form a new vector, and finally uses that vector to extract the corresponding tail entities (objects) under each predefined relation. Experiments on the public WebNLG dataset show that the model reaches an F1 score of 90.7% and effectively addresses the overlapping-triple problem.
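The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `fgm_perturbation` and `decode_spans`, the 0.5 threshold, and the nearest-end pairing rule are assumptions made for illustration. The first function computes the standard FGM perturbation, the gradient rescaled to L2 norm `epsilon`, which would be added to the word embeddings during adversarial training. The second turns per-token start and end pointer scores into entity spans.

```python
import math


def fgm_perturbation(grad, epsilon=1.0):
    """Return epsilon * grad / ||grad||_2 (the FGM perturbation).

    `grad` is the loss gradient with respect to an embedding vector,
    given here as a plain list of floats. A zero gradient yields a
    zero perturbation.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0 for _ in grad]
    return [epsilon * g / norm for g in grad]


def decode_spans(start_probs, end_probs, threshold=0.5):
    """Decode entity spans from pointer (start/end) tag scores.

    Assumption: each start position is paired with the nearest end
    position at or after it. Spans are inclusive (start, end) token
    index pairs.
    """
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    spans = []
    for s in starts:
        following = [e for e in ends if e >= s]
        if following:
            spans.append((s, following[0]))
    return spans


if __name__ == "__main__":
    # Hypothetical scores for a five-token sentence.
    print(fgm_perturbation([3.0, 4.0], epsilon=1.0))
    print(decode_spans([0.9, 0.1, 0.2, 0.8, 0.1],
                       [0.1, 0.9, 0.1, 0.1, 0.7]))
```

In a model of the kind described, the same span decoding would run once for head entities and then once per predefined relation for tail entities, so a single head entity can take part in several triples.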
Authors
LIU Yu-peng (刘玉鹏)
GE Yan (葛艳)
DU Jun-wei (杜军威)
CHEN Zhuo (陈卓)
School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China
Source
Computer and Modernization (《计算机与现代化》)
2023, No. 11, pp. 1-5, 12 (6 pages in total)
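Funding
National Natural Science Foundation of China (61973180, 61273180)
Shandong Province Key Research Program (2018GGX101052)
Natural Science Foundation of Shandong Province (ZR2019MF033, ZR2021MF092)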
Keywords
joint extraction of entities and relations
overlapping triples
BERT
FGM
pointer annotation