Research on Named Entity Recognition Technology for Defect Text of Primary Equipment in Distribution Network (cited by: 2)
Abstract Distribution network systems store a large volume of unused equipment defect text, which can be mined and put to use with named entity recognition. Manual annotation of power equipment defect text is currently inefficient, and entities in this specialized domain are hard to recognize. To address both problems, this paper proposes a new annotation strategy and a named entity recognition model based on Bert-CRF (Bidirectional encoder representation from transformers-Conditional Random Fields). First, BIO (Begin, Internal, Other) annotation based on semi-supervised learning reduces the share of manual labeling and speeds up annotation. Next, the Bert pre-trained model produces dynamic word vectors that carry rich semantic information. Finally, a CRF layer constrains the predicted label sequence. The proposed model was evaluated in comparison experiments on a self-built dataset of distribution network primary equipment defect text, which contains 9186 text records covering 12 major categories and 25 subcategories. The model reached a precision of 97.85%, a recall of 97.36%, and an F1 value of 97.34%, outperforming the five other models it was compared against.
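The BIO labeling scheme and the kind of label constraint a CRF layer is meant to enforce can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the entity labels `EQUIP` and `DEFECT`, the example sentence, and the helper names are hypothetical, since the abstract does not list the paper's 12 categories or its data. A trained CRF learns transition scores from data; the hard-coded rule below only shows which transitions such a layer would be expected to penalize.

```python
from typing import List, Tuple

# A span is (start index, end index exclusive, entity label).
Span = Tuple[int, int, str]


def spans_to_bio(tokens: List[str], spans: List[Span]) -> List[str]:
    """Convert entity spans into one BIO tag per token."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the entity
    return tags


def is_valid_transition(prev: str, curr: str) -> bool:
    """An I-X tag may only follow B-X or I-X of the same entity type."""
    if curr.startswith("I-"):
        label = curr[2:]
        return prev in (f"B-{label}", f"I-{label}")
    return True


def is_valid_sequence(tags: List[str]) -> bool:
    """Check every adjacent tag pair, treating the sequence start as 'O'."""
    prev = "O"
    for tag in tags:
        if not is_valid_transition(prev, tag):
            return False
        prev = tag
    return True


# Hypothetical defect record and entity spans (not from the paper's dataset).
tokens = ["transformer", "bushing", "has", "oil", "leakage"]
spans = [(0, 2, "EQUIP"), (3, 5, "DEFECT")]
tags = spans_to_bio(tokens, spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
print("valid:", is_valid_sequence(tags))
```

A sequence such as `["O", "I-EQUIP"]` fails the check because an inside tag appears without a preceding begin tag; a token classifier that scores each position independently can emit such sequences, which is the motivation for constraining labels jointly.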
Authors LIU Yu-ke; ZHOU Shen-pei; SHI Ying; DU Jia-bao (School of Automation, Wuhan University of Technology, Wuhan 430070, China)
Source Journal of Wuhan University of Technology (CAS), 2022, No. 10, pp. 93-101 (9 pages)
Funding National Natural Science Foundation of China (52105528)
Keywords named entity recognition; defect text; semi-supervised learning; Bert-CRF