Abstract
To make effective use of specialized domain knowledge, this paper studies knowledge extraction. Traditional named entity recognition is easily affected by the relative distance between context words and performs poorly at recognizing entities as a whole. To address these problems, an entity extraction model combining a multi-head self-attention mechanism with a conditional random field (CRF) is proposed. The model is based on the Transformer architecture and uses multi-head self-attention to capture contextual information effectively. By concatenating multiple self-attention heads, it relaxes the constraint that relative distance in the context imposes on feature extraction, captures global contextual information, and improves the model's generalization ability. Finally, experiments on the public People's Daily dataset compare the model with other machine learning models and verify the effectiveness of the method. The model also achieves good results on entity extraction from underwater robot task operation data.
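The two components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the learned query/key/value projections of a real Transformer layer are omitted (identity projections are assumed), and the function names, the three-tag scheme, and all scores are hypothetical values chosen for illustration. The first part concatenates several self-attention heads. The second part is the Viterbi decoding step a CRF layer uses to pick the best tag sequence from per-token scores and tag-transition scores.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(q, k, v):
    """Scaled dot-product attention for one head; q, k, v are lists of vectors."""
    d = len(q[0])
    out = []
    for qi in q:
        # Every token attends to every other token, regardless of distance.
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out

def multi_head_self_attention(x, num_heads):
    """Split each token vector into num_heads slices, attend per head, concatenate.
    Assumption: identity Q/K/V projections instead of learned weight matrices."""
    d_model = len(x[0])
    assert d_model % num_heads == 0
    d_head = d_model // num_heads
    heads = []
    for h in range(num_heads):
        part = [tok[h * d_head:(h + 1) * d_head] for tok in x]
        heads.append(self_attention(part, part, part))
    # Concatenate the head outputs for each token position.
    return [sum((heads[h][t] for h in range(num_heads)), [])
            for t in range(len(x))]

def viterbi_decode(emissions, transitions):
    """Best tag path under a linear-chain CRF.
    emissions[t][j]: score of tag j at position t.
    transitions[i][j]: score of moving from tag i to tag j."""
    n_tags = len(emissions[0])
    score = list(emissions[0])
    backptr = []
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            ptr.append(best_i)
        score = new_score
        backptr.append(ptr)
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    return path[::-1]

# Hypothetical example: tags O, B, I with the transition O -> I forbidden.
tags = ["O", "B", "I"]
emissions = [[0.1, 2.0, 0.0], [0.0, 0.1, 1.5], [1.0, 0.0, 0.9]]
transitions = [[0.0, 0.0, -10000.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print([tags[i] for i in viterbi_decode(emissions, transitions)])  # ['B', 'I', 'O']
```

In a full model of the kind the abstract describes, the emission scores would presumably come from a learned projection of the attention output rather than from fixed numbers, and the transition scores would be trained jointly with the rest of the network.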
Authors
CHEN Wei (陈伟)
WU Yun-zhi (吴云志)
TU Ling (涂凌)
LIU Hang (刘航)
YU Ke-jian (余克健)
YUE Yi (乐毅)
Affiliations: School of Information and Computer, Anhui Agricultural University, Hefei 230036, Anhui; Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Hefei 230036, Anhui; School of Material Science and Information Technology, Anhui University, Hefei 230601, Anhui
Source
Journal of Bengbu University (《蚌埠学院学报》)
2022, No. 5, pp. 54-60 (7 pages)
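Funding
Natural Science Foundation of Qinghai Province, General Program (2020-ZJ-913)
Open Fund of the Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information (BDSYS2021003)
Special Fund for the Anhui Province Modern Agricultural Industry Technology System (2021-2025)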
Keywords
named entity recognition
multi-head self-attention mechanism
conditional random field
feature extraction