Intelligent recognition of named entities of coal mine safety hidden danger based on ERNIE-BiGRU-CRF model

(Original title: 基于ERNIE-BiGRU-CRF模型的煤矿安全隐患命名实体智能识别研究)
Abstract To fully mine the key knowledge in coal mine safety hidden danger texts and to help safety managers at coal mine enterprises investigate and remediate hidden dangers more effectively, a named entity recognition method based on a pre-trained language model is proposed. First, the entity categories of coal mine safety hidden danger are defined, and 7 entity categories and 15 entity labels are constructed using the BIO tagging scheme. Next, the collected coal mine hidden danger inspection data are preprocessed and the relevant entities are manually annotated by experts in coal mine safety, yielding a standard named entity dataset of 1,500 coal mine safety hidden danger records. Finally, the ERNIE pre-trained model is used to produce vector representations of the hidden danger text, a BiGRU structure extracts contextual semantic features, and a CRF model decodes the entity labels. The experimental results show that the precision, recall and F1 value of the ERNIE-BiGRU-CRF model on the sequence labeling task are 56.69%, 69.23% and 62.34%, respectively, which are 6.85, 13.74 and 9.83 percentage points higher than those of the BiLSTM-CRF baseline model, and that the extracted entities differ little from the manual annotations. In addition, ablation experiments confirm that the BiGRU layer better captures the contextual semantic dependencies of coal mine safety hidden danger text and that the CRF layer further optimizes the label sequence.
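The label count and the reported F1 value in the abstract can be checked with a few lines of Python. This is a minimal sketch, not the authors' code: the abstract does not name the seven entity categories, so `TYPE_1` to `TYPE_7` are hypothetical placeholders, and the helper names `build_bio_labels` and `f1_score` are assumptions introduced only for illustration. It assumes the usual BIO convention of one shared `O` label plus a `B-` and an `I-` label per category, and the standard definition of F1 as the harmonic mean of precision and recall.

```python
# Hypothetical placeholder names: the abstract does not list the 7 categories.
CATEGORIES = [f"TYPE_{i}" for i in range(1, 8)]


def build_bio_labels(categories):
    """Return a BIO label set: one shared 'O' plus B-/I- labels per category."""
    labels = ["O"]
    for cat in categories:
        labels += [f"B-{cat}", f"I-{cat}"]
    return labels


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


LABELS = build_bio_labels(CATEGORIES)
print(len(LABELS))                       # 15 = 7 * 2 + 1
print(round(f1_score(56.69, 69.23), 2))  # 62.34
```

Under these assumptions, 7 categories give exactly the 15 labels stated in the abstract, and the reported precision and recall reproduce the reported F1 value.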
Authors LIU Feixiang (刘飞翔); LI Zequan (李泽荃); ZHAO Jialiang (赵嘉良); LI Jing (李靖) (School of Mine Safety, North China Institute of Science and Technology, Beijing 065201, China; School of Economics and Management, North China Institute of Science and Technology, Beijing 065201, China; School of Energy and Mining, China University of Mining and Technology-Beijing, Beijing 100083, China)
Source Coal Engineering (《煤炭工程》), PKU Core journal, 2024, No. 2, pp. 206-212 (7 pages)
Funding Fundamental Research Funds for the Central Universities (3142017107); Langfang Science and Technology Program project (2023029061).
Keywords coal mine safety hidden danger; ERNIE-BiGRU-CRF algorithm model; named entity recognition; information extraction