Abstract
As the electric power metering business expands, there is an urgent need for an electric power metering knowledge graph, composed of business information, technical knowledge, industry standards, and the relationships among them, to give more comprehensive and effective support to power grid decision-making and development. Named entity recognition (NER) is the foundation for building a knowledge graph. To meet the needs of the electric power metering domain, and drawing on the characteristics of Chinese word segmentation, this paper proposes a joint-learning-based NER method for Chinese electric power metering text. The method jointly trains a CNN-BLSTM-CRF model and a word segmentation model that incorporates dictionary knowledge, so that the two share entity types and confidence scores; it also changes the two models' computation from sequential to parallel, which reduces the accumulation of recognition errors. Experimental results show that, without manually constructed features, the method significantly outperforms previous methods in precision, recall, and F-score.
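The abstract reports precision, recall, and F-score without saying how they are computed. A common convention for NER is exact-match scoring at the entity level, sketched below; it is an assumption that the paper follows it. The function name `prf`, the entity type labels, and the spans are hypothetical illustrations, not data from the paper.

```python
from typing import Iterable, Tuple

# An entity is (start offset, end offset, type label).
# All labels and offsets below are hypothetical examples.
Entity = Tuple[int, int, str]


def prf(gold: Iterable[Entity], pred: Iterable[Entity]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1 under exact span-and-type match."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = [(0, 2, "DEVICE"), (5, 8, "STANDARD"), (10, 12, "ORG")]
pred = [(0, 2, "DEVICE"), (5, 8, "ORG"), (10, 12, "ORG"), (14, 15, "DEVICE")]
p, r, f = prf(gold, pred)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this toy case two of the four predicted entities match the three gold entities exactly, so a prediction with the right span but the wrong type counts as an error for both precision and recall.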
Authors
肖勇
郑楷洪
王鑫
钱斌
孙凌云
XIAO Yong; ZHENG Kaihong; WANG Xin; QIAN Bin; SUN Lingyun (Electric Power Research Institute, China Southern Power Grid, Guangzhou 510663, China; School of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China)
Source
《浙江大学学报(理学版)》
CAS
CSCD
PKU Core
2021, No. 3, pp. 321-330 (10 pages)
Journal of Zhejiang University (Science Edition)
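Funding
Science and Technology Project of China Southern Power Grid (ZBKJXM20180157)
National Natural Science Foundation of China (U1866602, 61772456, 61672451)
Key Research and Development Program of Zhejiang Province (2019C03137)
Science and Technology Innovation 2030 Major Project (2018AAA0100703)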
Keywords
power metering
joint learning
named entity recognition
word segmentation