Abstract
Manufacturing enterprises accumulate a large body of part-machining experience, most of it in text form, and extracting high-quality machining knowledge from that text remains an open problem. The entities to be recognized often have a modifier-head structure, which blurs entity boundaries. To address this, a multi-network collaborative method for Chinese named entity recognition is proposed. When BERT generates character vectors, a domain adaptation method improves their ability to represent process entities. In addition, an attention mechanism and a multi-gate mixture-of-experts network are introduced into the BiLSTM-CRF model to capture contextual features and entity information. Experiments show that, compared with current mainstream named entity recognition models, the proposed method achieves an F1 score of 80.15% on entity recognition for mechanical part machining, the best performance among the models compared.
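The abstract names a multi-gate mixture-of-experts network but does not give its equations. As a rough illustration only, the following is a minimal sketch of a generic multi-gate mixture-of-experts forward pass, in which several expert sub-networks are shared and each gate produces its own softmax-weighted mix of the expert outputs. The class name `MultiGateMoE`, the dimensions, the ReLU experts, and the random weights are all assumptions made for this sketch and are not taken from the paper.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Multiply a (rows x len(x)) weight matrix by the vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class MultiGateMoE:
    """Hypothetical layer: shared experts, one softmax gate per output branch."""

    def __init__(self, in_dim, expert_dim, n_experts, n_gates, seed=0):
        rng = random.Random(seed)
        self.experts = [random_matrix(expert_dim, in_dim, rng)
                        for _ in range(n_experts)]
        self.gates = [random_matrix(n_experts, in_dim, rng)
                      for _ in range(n_gates)]

    def forward(self, x):
        # Every expert sees the same input (linear map followed by ReLU).
        expert_out = [[max(0.0, h) for h in linear(w, x)] for w in self.experts]
        outputs, mixtures = [], []
        for gate in self.gates:
            # Each gate scores the experts and normalizes the scores.
            weights = softmax(linear(gate, x))
            # The branch output is the gate-weighted sum of expert outputs.
            mixed = [sum(w * out[j] for w, out in zip(weights, expert_out))
                     for j in range(len(expert_out[0]))]
            outputs.append(mixed)
            mixtures.append(weights)
        return outputs, mixtures


# Assumed toy sizes: a 4-dimensional input, 2 experts, 2 gates.
layer = MultiGateMoE(in_dim=4, expert_dim=3, n_experts=2, n_gates=2)
outputs, mixtures = layer.forward([0.1, -0.2, 0.3, 0.4])
```

In a setup like the one the abstract describes, the input vector would presumably be a per-character hidden state and the two gated outputs could feed separate branches for contextual features and entity information, but that wiring is a guess rather than something the record states.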
Authors
王素琴
王钰珏
石敏
朱登明
李兆歆
WANG Suqin; WANG Yujue; SHI Min; ZHU Dengming; LI Zhaoxin (School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China; Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China)
Source
《计算机集成制造系统》
EI
CSCD
PKU Core (Peking University core journal list)
2024, No. 3, pp. 958-967 (10 pages)
Computer Integrated Manufacturing Systems
Funding
Supported by the National Key Research and Development Program of China (2020YFB1710400).
Keywords
Chinese named entity recognition
mechanical part machining
multi-gate mixture-of-experts network
domain adaptation