Abstract
With the rapid construction of metro systems and the establishment of hazard troubleshooting systems, a large number of hazard troubleshooting records have accumulated. These records are cluttered with information, and analyzing them relies heavily on guidelines and expert experience, which demands substantial labor. To improve the efficiency of hazard troubleshooting and the quality of safety management decisions, and to move the troubleshooting workflow toward full automation, this paper proposes a framework that combines text mining and visualization techniques to analyze hazard troubleshooting texts automatically. The framework comprises four steps. First, the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm provides a quantitative overview of the keywords in the hazard descriptions. Second, keywords with high TF-IDF values are selected, and a Latent Dirichlet Allocation (LDA) model with Gibbs sampling identifies the thematic information and key troubleshooting points hidden in the large-scale hazard description corpus. Third, Word Cloud (WC) visualization is combined with the time dimension to chart how the hazard word clouds evolve. Fourth, a Word Co-occurrence Network (WCN) model is used to mine co-occurrence relations among hazards and to determine the interrelations between hazard categories and sites. The framework was applied to and validated on the construction safety hazard troubleshooting records of the Wuhan metro from 2016 to 2018. The results show that it effectively extracts the key troubleshooting points and visual information for 34 hazard categories.
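As a rough illustration of the first and fourth steps, the sketch below scores keywords with a basic TF-IDF formula and counts word co-occurrences within records. It is not the authors' implementation: the toy records, the function names `tf_idf` and `cooccurrence`, and the specific weighting `tf * log(N / df)` are assumptions made for this example, and the abstract does not state which TF-IDF variant or tokenizer was used.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical, pre-tokenized hazard records (illustrative only, not data from the paper).
RECORDS = [
    ["scaffold", "guardrail", "missing"],
    ["scaffold", "plank", "loose"],
    ["scaffold", "guardrail", "loose"],
]


def tf_idf(records):
    """Return one {term: score} dict per record, using tf * log(N / df)."""
    n = len(records)
    # Document frequency: number of records in which each term appears.
    df = Counter(term for rec in records for term in set(rec))
    scores = []
    for rec in records:
        tf = Counter(rec)
        scores.append(
            {t: (c / len(rec)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return scores


def cooccurrence(records):
    """Count how often each unordered pair of terms shares a record."""
    pairs = Counter()
    for rec in records:
        # Sort so that each pair has one canonical (alphabetical) key.
        for a, b in combinations(sorted(set(rec)), 2):
            pairs[(a, b)] += 1
    return pairs
```

Under this weighting, a term that appears in every record (here `scaffold`) scores zero, which is why TF-IDF tends to surface the more distinctive hazard keywords. The pair counts from `cooccurrence` can serve as edge weights when building a co-occurrence network.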
Authors
潘杏
钟波涛
黑永健
骆汉宾
Pan Xing; Zhong Botao; Hei Yongjian; Luo Hanbin (School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan 430074, China)
Source
《土木建筑工程信息技术》
2021, No. 2, pp. 7-14 (8 pages)
Journal of Information Technology in Civil Engineering and Architecture
Funding
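National Natural Science Foundation of China, "Research on the Theory and Methods of Engineering Project Management under the Digital Construction Mode" (Grant No. 71732001)
National Natural Science Foundation of China, "Research on the Mechanism and Key Technologies of Intelligent Diagnosis and Control of Construction Site Safety Hazards Driven by Both Text and Video Data" (Grant No. 51878311)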