Abstract
To address problems in entity-relation extraction from military text such as "one sentence corresponds to multiple triples" and "one subject corresponds to multiple objects", this paper proposes an ERNIE-based triple extraction model for military text. The encoding layer uses ERNIE to obtain an encoded sequence for each token. Following the modeling approach of a seq-to-seq decoder and BIO sequence labeling, the model first predicts the subject, then feeds the subject's label sequence back in to predict the object and the relation between the two, which yields the triple. The prediction layer uses sigmoid outputs so that multiple subjects, multiple objects, and even multiple relations can be extracted. Experiments on a manually annotated military news dataset show that the model clearly outperforms both a pipeline extraction model based on recurrent neural networks and a BERT-based joint entity-relation extraction model, reaching an F1 score of 80.04%.
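The subject-first decoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`decode_spans`, `extract_triples`, `toy_object_logits`), the separate B/I logit layout, the 0.5 threshold, and the hand-written logits are all assumptions made for illustration. In the paper the scores would come from an ERNIE encoder; here they are fixed toy values so the example runs on its own.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_spans(b_logits, i_logits, threshold=0.5):
    """Turn independent per-token B and I logits into (start, end) spans.

    Each tag gets its own sigmoid, so any number of spans can fire
    in one sentence. `end` is exclusive.
    """
    spans, n, t = [], len(b_logits), 0
    while t < n:
        if sigmoid(b_logits[t]) > threshold:
            end = t + 1
            # Extend while the I tag fires and no new B tag starts.
            while (end < n and sigmoid(i_logits[end]) > threshold
                   and sigmoid(b_logits[end]) <= threshold):
                end += 1
            spans.append((t, end))
            t = end
        else:
            t += 1
    return spans

def extract_triples(tokens, subj_logits, obj_logits_fn, relations, threshold=0.5):
    """Predict subjects first, then objects per relation for each subject."""
    triples = []
    for s_start, s_end in decode_spans(*subj_logits, threshold):
        subject = " ".join(tokens[s_start:s_end])
        # Hypothetical second stage: conditioned on the subject span.
        per_relation = obj_logits_fn((s_start, s_end))
        for rel in relations:
            b, i = per_relation[rel]
            for o_start, o_end in decode_spans(b, i, threshold):
                triples.append((subject, rel, " ".join(tokens[o_start:o_end])))
    return triples

# Toy data (assumed, not from the paper): one subject, two objects.
HI, LO = 5.0, -5.0
tokens = ["Fleet", "A", "deployed", "ship", "X", "and", "ship", "Y"]
subj_logits = ([HI, LO, LO, LO, LO, LO, LO, LO],   # B-subject
               [LO, HI, LO, LO, LO, LO, LO, LO])   # I-subject

def toy_object_logits(subject_span):
    none = [LO] * len(tokens)
    return {
        "deploys": ([LO, LO, LO, HI, LO, LO, HI, LO],   # B-object
                    [LO, LO, LO, LO, HI, LO, LO, HI]),  # I-object
        "located_in": (none, none),
    }

triples = extract_triples(tokens, subj_logits, toy_object_logits,
                          ["deploys", "located_in"])
print(triples)
```

Because every tag is scored with its own sigmoid rather than a softmax over tags, one subject can yield several objects under the same relation, which is the "one subject, multiple objects" case the abstract targets.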
Authors
郑杜福
黄蔚
任祥辉
ZHENG Du-fu; HUANG Wei; REN Xiang-hui (North China Institute of Computing Technology, Beijing 100083, China)
Source
《信息技术》
2021, No. 2, pp. 38-43 (6 pages)
Information Technology
Funding
Public Safety Risk Prevention and Control and Emergency Technical Equipment (2018YFC0831-200).