Abstract
The quality of entity-relation triple extraction directly affects the quality of downstream knowledge graph construction. Traditional pipeline and joint extraction models do not effectively model semantic features at the sentence level and the relation level, which limits their performance. To address this, the paper proposes RSIAN, a joint entity and relation extraction model that fuses sentence-level and relation-level features through an interactive attention network. The network learns higher-order semantic associations between the sentence level and the relation level, strengthening the interaction between sentences and relations and assisting the model's extraction decisions. On the Chinese tourism dataset (TDDS) constructed in this work, the model reaches a precision, recall, and F1 of 0.872, 0.760, and 0.812, respectively, outperforming the other models compared. To further validate the model on English joint extraction, experiments are conducted on the public English datasets NYT and WebNLG, where its F1 improves on the baseline RSAN model by 0.014 and 0.013, respectively. In the analysis experiments on overlapping triples, the model also outperforms the baseline and is more stable.
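As a quick consistency check on the TDDS figures, the snippet below recomputes F1 from the reported precision and recall. It assumes the paper uses the standard definition of F1 as the harmonic mean of precision and recall, which the abstract does not state explicitly; the function name `f1_score` is illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1; assumed, not stated in the abstract)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported in the abstract for the TDDS dataset.
tdds_f1 = f1_score(0.872, 0.760)
print(round(tdds_f1, 3))  # 0.812
```

Under that assumption, the recomputed value agrees with the reported F1 of 0.812 to three decimal places.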
Authors
HAO Xiaofang (郝小芳)
ZHANG Chaoqun (张超群)
LI Xiaoxiang (李晓翔)
WANG Darui (王大睿)
Affiliations: College of Electronic Information, Guangxi Minzu University, Nanning 530006, China; College of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, China; Guangxi Key Laboratory of Hybrid Computation and IC Design Analysis, Nanning 530006, China
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
PKU Core Journal (北大核心)
2024, No. 8, pp. 156-164 (9 pages)
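Funding
National Natural Science Foundation of China (62062011)
Guangxi Natural Science Foundation (2019GXNSFAA185017)
Guangxi Minzu University Graduate Research Innovation Project (gxun-chxs2021066)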
Keywords
interactive attention network
sentence level
relation level
joint entity and relation extraction
attention mechanism
overlapping triples