Entity Recognition Approach of Equipment Failure Text for Knowledge Graph Construction (面向知识图谱构建的设备故障文本实体识别方法)

Cited by: 11
Abstract: Large volumes of failure texts containing important entity information accumulate during the operation and maintenance of power equipment. However, these texts have fuzzy entity boundaries and many specialized terms, so traditional entity recognition methods train inefficiently and their performance is hard to improve. This paper therefore proposes a new entity recognition method, I-BRC (integrated algorithm of BERT based BiRNN with CRF). The method uses a character embedding model to convert the text, character by character, into a sequence of character vectors, which avoids the error accumulation introduced by word segmentation. A recurrent neural network combined with a probabilistic graphical model extracts sequence features from the text. Multiple single-type entity recognizers are integrated, each independently learning the features of one entity type, and a parallel pre-training mechanism improves training efficiency. Finally, a multi-type recognizer integrates the recognition results. In addition, adjusting the single-type recognizers lets the method adapt flexibly to entity recognition tasks for different power equipment, avoiding repeated training and saving computing resources. Experiments show that I-BRC converges after only 3 iterations, a substantial gain in training efficiency, and that its F1 score, precision, and recall reach 88.0%, 86.8%, and 89.2%, respectively. The Chinese abstract reports a performance improvement of 7.5% to 29.3% over traditional models, whereas the published English abstract gives 19.5% to 28.8%; in both versions the authors conclude that the results verify the effectiveness and feasibility of the proposed model.
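The integration step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract says a learned multi-type recognizer integrates the results, whereas the code below swaps in a simple greedy, score-based rule purely for illustration. The entity type names (`equipment`, `fault`, `part`), the example sentence, and the hard-coded tags and scores are all hypothetical stand-ins for what trained single-type BERT-BiRNN-CRF recognizers would output.

```python
from typing import Dict, List, Tuple

# (start, end_exclusive, entity_type, score); all names here are illustrative.
Span = Tuple[int, int, str, float]


def bio_to_spans(tags: List[str], scores: List[float], entity_type: str) -> List[Span]:
    """Turn one single-type recognizer's per-character B/I/O tags into scored spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" closes an open span
        if tag in ("B", "O"):
            if start is not None:
                mean = sum(scores[start:i]) / (i - start)
                spans.append((start, i, entity_type, mean))
                start = None
            if tag == "B":
                start = i
        elif tag == "I" and start is None:
            start = i  # tolerate an "I" that has no preceding "B"
    return spans


def merge_spans(per_type: Dict[str, List[Span]]) -> List[Span]:
    """Greedy stand-in for the integration step: keep the highest-scoring
    spans first and drop any span that overlaps one already kept."""
    candidates = sorted(
        (s for spans in per_type.values() for s in spans),
        key=lambda s: -s[3],
    )
    kept: List[Span] = []
    for s in candidates:
        if all(s[1] <= k[0] or s[0] >= k[1] for k in kept):
            kept.append(s)
    return sorted(kept)  # order by position in the text


# Hypothetical character-level outputs for a seven-character failure sentence.
text = "变压器油温过高"
outputs = {
    "equipment": (["B", "I", "I", "O", "O", "O", "O"], [0.75] * 7),
    "fault": (["O", "O", "O", "B", "I", "I", "I"], [0.5] * 7),
    "part": (["O", "O", "B", "I", "O", "O", "O"], [0.25] * 7),  # conflicting guess
}
per_type = {t: bio_to_spans(tags, scores, t) for t, (tags, scores) in outputs.items()}
merged = merge_spans(per_type)
for start, end, entity_type, score in merged:
    print(entity_type, text[start:end], score)
```

Because each recognizer handles a single entity type, its tag set reduces to B/I/O and the recognizers can be trained or replaced independently, which is the property the abstract credits for avoiding repeated training. In this toy input the low-scoring `part` span overlaps the `equipment` span and is discarded by the greedy rule.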
Authors: TIAN Jiapeng (田嘉鹏); SONG Hui (宋辉); CHEN Lifan (陈立帆); SHENG Gehao (盛戈皞); JIANG Xiuchen (江秀臣). School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Minhang District, Shanghai 200240, China
Source: Power System Technology (《电网技术》); indexed in EI, CSCD, and the Peking University Core Journals list; 2022, No. 10, pp. 3913-3922 (10 pages)
Funding: National Key R&D Program of China (2020YFB1709701); Science and Technology Project of State Grid Corporation of China (5700-202119174A-0-0-00)
Keywords: power equipment; failure case; Chinese entity recognition; knowledge graph; neural network