Abstract
To address the shortage of annotated samples from urban gas accident investigation reports, which degrades named entity recognition (NER) performance, an NER method for the gas accident domain based on bidirectional long short-term memory with conditional random fields (BiLSTM-CRF) and reinforcement learning is proposed. First, in the data preprocessing stage, a theme paragraph extraction method based on text structure is used to identify the key paragraphs of accident investigation reports. Second, in the model training stage, the BiLSTM-CRF and reinforcement learning model is used to train the NER model for urban gas accidents. Finally, urban gas accident investigation reports are used as test data for experimental validation. The results show that the comprehensive evaluation index of the entity recognition model improves by 5.76% after noise reduction by the reinforcement learning model, and that the theme paragraph recognition method improves the comprehensive evaluation index of the model by 7.17% compared with the Word2vec feature representation method.
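The improvements above are reported in terms of a comprehensive evaluation index. As a minimal sketch, assuming this index refers to the entity-level F1 score commonly used for NER (the abstract does not define it), the following shows how such a score could be computed from BIO-tagged sequences. The tag names `LOC` and `EQP` and the function names are hypothetical and chosen only for illustration.

```python
def extract_entities(tags):
    """Return the set of (start, end, type) spans in a BIO tag sequence."""
    spans = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def entity_f1(gold_tags, pred_tags):
    """Entity-level F1: a predicted span counts only on an exact match."""
    gold = extract_entities(gold_tags)
    pred = extract_entities(pred_tags)
    true_pos = len(gold & pred)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the second entity is predicted with a wrong boundary.
gold = ["B-LOC", "I-LOC", "O", "B-EQP", "I-EQP"]
pred = ["B-LOC", "I-LOC", "O", "B-EQP", "O"]
score = entity_f1(gold, pred)
```

In this example one of the two predicted spans matches a gold span exactly, so precision and recall are both 0.5. Exact-match scoring is a strict choice; the paper may use a different matching rule.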
Authors
王明达
张榜
吴志生
李云飞
WANG Mingda; ZHANG Bang; WU Zhisheng; LI Yunfei (College of Mechanical and Electrical Engineering, China University of Petroleum (East China), Qingdao, Shandong 266580, China)
Source
《中国安全生产科学技术》
CAS
CSCD
Peking University Core Journals (北大核心)
2023, Issue 3, pp. 39-45 (7 pages)
Journal of Safety Science and Technology
Keywords
urban gas accident
named entity recognition
information extraction
reinforcement learning