Entity and Relation Joint Extraction Method Based on RoBERTa-Effg-Adv
Abstract: Entity and relation extraction is a key step in constructing a knowledge graph; its purpose is to extract the relation triples in a text. To address the inability of existing Chinese joint entity and relation extraction models to extract overlapping relation triples effectively, as well as their insufficient extraction performance, this paper proposes RoBERTa-Effg-Adv, a joint entity and relation extraction model. The encoder uses the RoBERTa-wwm-ext pre-trained model to encode the input, and the Efficient GlobalPointer model handles both nested and non-nested named entity recognition. Each entity-relation triple is split into a five-tuple for joint extraction, and adversarial training is added to improve the model's robustness. To obtain a machine-readable corpus, relevant books were scanned and processed with optical character recognition, and the data were then manually annotated to form REDQTTM, the relation extraction dataset required by this study, which contains 18 entity types and 11 relation types. Experimental results verify the effectiveness of the method on Chinese joint entity and relation extraction in the domain of Qutan Temple (瞿昙寺) murals: on the REDQTTM test set, precision reaches 94.0%, recall reaches 90.7%, and F1 reaches 92.3%. Compared with the GPLinker model, these are improvements of 2.4, 0.9, and 1.6 percentage points in precision, recall, and F1, respectively.
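The abstract states that each relation triple is split into a five-tuple but does not spell out the layout. The sketch below assumes the GPLinker-style decomposition, in which a triple (subject, predicate, object) becomes (sh, st, p, oh, ot): the head and tail character positions of the subject, the predicate, and the head and tail positions of the object. The class and function names, the `located_in` label, and the example sentence are illustrative assumptions and are not taken from the paper or from REDQTTM.

```python
from typing import NamedTuple, Tuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class FiveTuple(NamedTuple):
    sh: int          # subject head: index of the subject's first character
    st: int          # subject tail: index of the subject's last character (inclusive)
    predicate: str
    oh: int          # object head
    ot: int          # object tail (inclusive)


def find_span(text: str, mention: str) -> Tuple[int, int]:
    """Return inclusive (head, tail) character indices of the first occurrence."""
    start = text.find(mention)
    if start < 0:
        raise ValueError(f"{mention!r} not found in text")
    return start, start + len(mention) - 1


def to_five_tuple(text: str, triple: Triple) -> FiveTuple:
    """Assumed decomposition: replace each entity string by its span boundaries."""
    sh, st = find_span(text, triple.subject)
    oh, ot = find_span(text, triple.obj)
    return FiveTuple(sh, st, triple.predicate, oh, ot)


def to_triple(text: str, five: FiveTuple) -> Triple:
    """Inverse mapping: recover the entity strings from the span boundaries."""
    return Triple(
        text[five.sh:five.st + 1],
        five.predicate,
        text[five.oh:five.ot + 1],
    )


# Hypothetical example sentence and label, for illustration only.
text = "瞿昙寺位于青海省"
triple = Triple("瞿昙寺", "located_in", "青海省")
five = to_five_tuple(text, triple)
print(five)
print(to_triple(text, five) == triple)
```

Under this assumed layout, two triples that share an entity still map to distinct five-tuples, because either the predicate or the other entity's span differs. That is one way a span-pair scorer such as Efficient GlobalPointer could represent overlapping triples.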
Authors: YAO Fei-yang (姚飞杨); LIU Xiao-jing (刘晓静) (Department of Computer Technology and Application, Qinghai University, Xining 810016, China)
Source: Computer Technology and Development (《计算机技术与发展》), 2024, No. 3, pp. 147-154 (8 pages)
Funding: Qinghai Province 2021 Applied Basic Research Program project (2021-ZJ-717).
Keywords: RoBERTa-wwm-ext; adversarial training; relation extraction; Efficient GlobalPointer; Chinese entity