Abstract
Accident case databases hold a large amount of accident information, including the time, location, cause, and course of each accident, that offers rich and valuable experience for the design of safety-critical systems. This information plays a vital role in hazard identification, but it is usually scattered across the paragraphs of accident documents, which makes manual extraction inefficient and costly. This paper proposes an accident case text classification method based on the pre-trained BERT (Bidirectional Encoder Representations from Transformers) model, which classifies accident case texts into four categories: ACCIDENT, CAUSE, CONSEQUENCE, and RESPONSE. In addition, a text dataset of accident cases is collected and constructed for training the model. Experimental results show that the method can classify accident case texts automatically, with a classification accuracy of 73.44%, a recall of 69.13%, and an F1 score of 0.71. Multiple groups of experimental parameters are set up, and the effect of the parameter settings on classification performance is explored experimentally to find the best settings. The proposed method helps mine the semantic information in accident case texts and provides technical support for the subsequent construction of an expert knowledge base and an efficient accident retrieval platform.
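As a quick consistency check on the reported figures, the F1 score is the harmonic mean of precision and recall. The sketch below assumes that the 73.44% figure is a precision value. The abstract calls it accuracy, so this reading is an assumption rather than something the source states. Under that assumption, the reported recall of 69.13% reproduces the reported F1 of 0.71. The names `f1_score`, `reported_precision`, and `reported_recall` are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Figures taken from the abstract; treating 73.44% as precision is an assumption.
reported_precision = 0.7344
reported_recall = 0.6913

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # prints 0.71
```

If the 73.44% figure is instead overall accuracy in the strict sense, this check does not apply, because F1 cannot be derived from accuracy and recall alone.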
Authors
TU Yuanlai (涂远来)
ZHOU Jiale (周家乐)
WANG Huifeng (王慧锋)
School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Source
Journal of East China University of Science and Technology (《华东理工大学学报(自然科学版)》, Natural Science Edition)
CAS
CSCD
Peking University Core Journals
2023, No. 4, pp. 576-582 (7 pages)
Funding
Young Scientists Fund (61906068)
National Key Research and Development Program (2018YFC1803306)
Keywords
hazard identification
text classification
BERT
requirements analysis
safety-critical system