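BiLSTM-CRF Entity Recognition Model for Hydraulic Engineering Inspection Text Based on Character and Word Vectors

Published English title: Text Entity Recognition Model of BiLSTM-CRF Hydraulic Engineering Inspection Based on Word Vector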
Abstract Named entity recognition is a core technology for building water conservancy knowledge graphs. Inspection text is the most common type of data in hydraulic engineering. It is recorded as free text with no fixed format or structure, yet it contains information on potential safety risks and therefore has high value density. To address named entity recognition in hydraulic engineering inspection text, a BiLSTM-CRF model that fuses character and word vectors is proposed. First, the inspection text is vectorized at the character level and at the word level, and the two sets of vectors are merged to obtain character-word vector features. Next, a BiLSTM neural network extracts contextual features from the resulting sequence. Finally, a CRF decodes the sequence and the corresponding entities are extracted. In experiments on inspection text from the Middle Route of the South-to-North Water Transfer Project, combining character and word vectors effectively improves recognition performance and identifies entity boundaries more accurately. The model reaches a precision of 93.79%, a recall of 93.06%, and an F1 score of 93.42%, and its time efficiency is 82.86% higher than that of the BERT-BiLSTM-CRF model. The BiLSTM-CRF model based on character and word vectors can provide technical support for the rapid construction of hydraulic engineering knowledge graphs.
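The first and last stages of the pipeline described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not say how word vectors are aligned to characters, so concatenating each character's vector with the vector of the word that contains it is an assumption. The BiLSTM stage is omitted: the emission scores below are hand-written stand-ins for what a trained BiLSTM would produce, and the tag set, toy vectors, and transition scores are all hypothetical.

```python
from typing import Dict, List, Sequence

def fuse_char_word_vectors(
    words: Sequence[str],
    char_vecs: Dict[str, Sequence[float]],
    word_vecs: Dict[str, Sequence[float]],
) -> List[List[float]]:
    """Assumed fusion scheme: one vector per character, formed by
    concatenating the character vector with its word's vector."""
    fused = []
    for word in words:
        for ch in word:
            fused.append(list(char_vecs[ch]) + list(word_vecs[word]))
    return fused

def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
    tags: Sequence[str],
) -> List[str]:
    """Return the tag sequence maximizing the sum of per-position
    emission scores and tag-to-tag transition scores (CRF decoding)."""
    n_tags = len(tags)
    best = [list(emissions[0])]  # best[t][j]: best path score ending in tag j
    back = []                    # back[t-1][j]: previous tag on that best path
    for t in range(1, len(emissions)):
        scores, pointers = [], []
        for j in range(n_tags):
            cands = [best[-1][i] + transitions[i][j] for i in range(n_tags)]
            i_best = max(range(n_tags), key=cands.__getitem__)
            scores.append(cands[i_best] + emissions[t][j])
            pointers.append(i_best)
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(range(n_tags), key=best[-1].__getitem__)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return [tags[i] for i in path]

# Toy, already-segmented input: "渠道" (channel) and "裂缝" (crack).
words = ["渠道", "裂缝"]
char_vecs = {"渠": [0.1, 0.2], "道": [0.3, 0.4], "裂": [0.5, 0.6], "缝": [0.7, 0.8]}
word_vecs = {"渠道": [1.0, 0.0], "裂缝": [0.0, 1.0]}
fused = fuse_char_word_vectors(words, char_vecs, word_vecs)

TAGS = ["O", "B", "I"]
# transitions[i][j]: score for moving from TAGS[i] to TAGS[j].
# The only constraint encoded here is that "I" may not follow "O".
transitions = [
    [0.0, 0.0, -1e4],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
# Hand-written emission scores, one row per character (columns follow TAGS).
emissions = [
    [1.0, 0.9, 0.0],
    [0.0, 0.0, 2.0],
    [0.0, 1.5, 1.4],
    [0.0, 0.0, 2.0],
]
decoded = viterbi_decode(emissions, transitions, TAGS)
print(fused[0], decoded)
```

In this toy case the first character scores slightly higher for `O` than for `B` on emissions alone, but the transition penalty on `O` followed by `I` makes the decoder choose `B`. This is the kind of boundary correction a CRF layer adds on top of per-character scores.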
Authors LIU Xuemei; CHENG Pengshengnan; LI Hairui; CAO Chuang; GAO Ying; CUI Pei (School of Information Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450046, China; Henan Water & Power Engineering Consulting Co., Ltd., Zhengzhou 450016, China; School of Management and Economics, North China University of Water Resources and Electric Power, Zhengzhou 450046, China; Zhengzhou Yellow River Hydro Power Development General Company, Zhengzhou 450003, China)
Source Journal of North China University of Water Resources and Electric Power (Natural Science Edition), Peking University Core Journal, 2024, No. 3, pp. 9-17 (9 pages)
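Funding National Natural Science Foundation of China (72271091); Science and Technology Open Cooperation Project of the Henan Academy of Sciences (220901008).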
Keywords inspection text; entity recognition; bidirectional long short-term memory (BiLSTM) neural network; Word2Vec; conditional random field (CRF)