
Named entity recognition in the oil and gas domain based on the BERT-BiLSTM-CRF model
Abstract: Traditional methods for named entity recognition in the construction of oil and gas knowledge graphs extract entity feature information inaccurately and recognize entities inefficiently. To address these problems, this paper proposes a named entity recognition method based on the BERT-BiLSTM-CRF model. The method first uses the BERT (bidirectional encoder representations from transformers) pre-trained model to obtain word vectors that capture the semantics of the input sequence. It then feeds these word vectors into a bi-directional long short-term memory (BiLSTM) model to further extract contextual features. Finally, it uses the labeling rules and sequence decoding ability of conditional random fields (CRF) to output the maximum-probability sequence labeling result. Together these stages form a model framework for named entity recognition in the oil and gas domain. The BERT-BiLSTM-CRF model was compared with two other named entity recognition models (BiLSTM-CRF and BiLSTM-Attention-CRF) on a self-built dataset containing more than 30,000 text corpus entries and four entity types. The experimental results show that the precision (P), recall (R), and F1 value of the BERT-BiLSTM-CRF model reached 91.3%, 94.5%, and 92.9%, respectively, and that its entity recognition performance was better than that of the other two models.
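The final stage described in the abstract, in which the CRF outputs the maximum-probability label sequence, is commonly carried out with Viterbi decoding. The following is a minimal sketch of that step under stated assumptions, not the authors' implementation: the function name `viterbi_decode`, the label set `LABELS`, and all scores are hypothetical, and the emission scores stand in for what a BiLSTM layer would produce.

```python
from typing import List, Sequence

# Hypothetical BIO label set; the abstract does not name the four entity types.
LABELS = ["O", "B-ENT", "I-ENT"]


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label index sequence for a linear-chain CRF.

    emissions[t][j]   : score of label j at token t (assumed to come from the BiLSTM).
    transitions[i][j] : score of moving from label i to label j.
    """
    num_labels = len(transitions)
    # best[j] holds the best score of any path that ends in label j so far.
    best = list(emissions[0])
    backpointers: List[List[int]] = []

    for t in range(1, len(emissions)):
        new_best: List[float] = []
        pointers: List[int] = []
        for j in range(num_labels):
            # Pick the previous label that maximizes the path score into j.
            prev = max(range(num_labels),
                       key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
        best = new_best
        backpointers.append(pointers)

    # Trace back from the best final label to recover the full path.
    last = max(range(num_labels), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


if __name__ == "__main__":
    # Illustrative scores only. The large negative entry forbids "O" -> "I-ENT",
    # which is the kind of labeling rule a CRF layer can encode.
    transitions = [[0.0, 0.0, -100.0],
                   [0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]]
    emissions = [[2.0, 0.0, 0.0],
                 [0.0, 0.5, 1.0]]
    print([LABELS[i] for i in viterbi_decode(emissions, transitions)])
```

In this toy input the second token scores highest as `I-ENT` on its own, but the transition penalty makes the decoder choose `B-ENT` after `O`, which illustrates why a CRF layer is placed on top of per-token scores.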
Authors: GAO Guozhong (高国忠); LI Yu (李宇); HUA Yuanpeng (华远鹏); WU Wenkuang (吴文旷) (College of Geophysics and Petroleum Resources, Yangtze University, Wuhan 430100, Hubei; Research Institute of Petroleum Exploration and Development, CNPC, Beijing 100083)
Source: Journal of Yangtze University (Natural Science Edition), 2024, No. 1, pp. 57-65 (9 pages)
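Funding: Ministry of Education China University Industry-University-Research Innovation Fund project "Construction of an Educational Knowledge Graph Platform Based on 5G + Big Data" (2021BCF03006).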
Keywords: oil and gas domain; named entity recognition; bidirectional encoder representations from transformers (BERT); bi-directional long short-term memory (BiLSTM); conditional random fields (CRF); BERT-BiLSTM-CRF model