Abstract
To address the problems that traditional entity relation extraction suffers from low accuracy, relies on manual annotation, and fails to make full use of the semantics of sentences and target entities, a pre-trained convolutional neural network model (R-BERT-CNN) is proposed. The model incorporates entity-level information into the pre-trained model to capture the semantics of the target entities; a CNN extracts sentence-level semantic information; the sentence vector, label vector, and target entity vectors are concatenated to obtain global information; and entity relations are extracted with a softmax classifier. Experimental results show that the model achieves an F1 score of 89.51% on the SemEval 2010 Task 8 dataset, which is 3.61 and 1.51 percentage points higher than the Attention-CNN and Att-Pooling-CNN models respectively. Compared with the R-BERT and BERT-CNN models, which capture only sentence semantics or only target entity semantics, the F1 score improves by 2.61 and 0.97 percentage points respectively, and training time is shortened by 15 and 19 minutes respectively.
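The pipeline described above (entity-aware BERT encoding, a CNN over the token hidden states for sentence-level features, concatenation of the label, sentence, and entity vectors, and a softmax classifier) can be sketched in a few lines of PyTorch. This is a minimal illustration under assumed details: the checkpoint name, filter count, kernel size, and span-mask inputs are hypothetical choices for exposition, not the authors' implementation.

import torch
import torch.nn as nn
from transformers import BertModel

class RBertCnn(nn.Module):
    """Hypothetical sketch of the R-BERT-CNN idea; sizes are illustrative."""
    def __init__(self, num_relations, hidden=768, filters=128):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")  # assumed checkpoint
        # CNN over token hidden states for sentence-level semantics
        self.conv = nn.Conv1d(hidden, filters, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveMaxPool1d(1)
        # [CLS] label vector + CNN sentence vector + two entity vectors
        self.classifier = nn.Linear(hidden + filters + 2 * hidden, num_relations)

    def forward(self, input_ids, attention_mask, e1_mask, e2_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        seq = out.last_hidden_state              # (batch, seq_len, hidden)
        cls = out.pooler_output                  # (batch, hidden): label vector
        # sentence-level semantic vector via convolution + max pooling
        sent = self.pool(torch.relu(self.conv(seq.transpose(1, 2)))).squeeze(-1)

        def span_avg(mask):
            # average the hidden states inside a marked entity span
            m = mask.unsqueeze(-1).float()
            return (seq * m).sum(dim=1) / m.sum(dim=1).clamp(min=1.0)

        e1, e2 = span_avg(e1_mask), span_avg(e2_mask)  # entity-level semantics
        # concatenate for global information; softmax is applied in the loss
        return self.classifier(torch.cat([cls, sent, e1, e2], dim=-1))

At inference time, applying torch.softmax to the returned logits yields the predicted relation distribution.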
Authors
Cao Weidong (曹卫东); Xu Xiuli (徐秀丽)
School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
Source
Computer Applications and Software (《计算机应用与软件》)
Peking University Core Journal (北大核心)
2023, No. 4, pp. 222-229 (8 pages)
Funding
Civil Aviation Joint Fund of the National Natural Science Foundation of China (U1833114)
Major Project of Civil Aviation Science and Technology Innovation (MHRD20160109)
Civil Aviation Safety Capability Project (TRSA201803)
Keywords
Pre-trained model
BERT
Convolutional neural network
Natural language processing
Relation extraction