Abstract
As research on Chinese named entity recognition has advanced, most models have focused on enriching feature representations with lexical or glyph information while ignoring label information. This paper therefore proposes a Chinese named entity recognition model that integrates label information. First, character embeddings are obtained from the pre-trained model BERT-wwm, and the labels are represented as vectors. A Transformer decoder structure then lets the character and label representations interact, capturing the dependencies between characters and labels and enriching the character features. To promote the learning of label information, a sentence-level supervision signal is constructed and a multi-label text classification task is added, with the model trained by multi-task learning. The named entity recognition task is decoded with a conditional random field, and the multi-label text classification task is decoded with a biaffine mechanism. The two tasks share all parameters except the decoding layers, so that the different supervision signals are fed back to each subtask. Comparative experiments on the public datasets MSRA, Weibo, and Resume yield F1 scores of 95.75%, 72.17%, and 96.23%, respectively. Compared with several baseline models, the proposed model improves results to some extent, which supports its effectiveness and feasibility.
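The abstract does not spell out how the sentence-level supervision signal is built. One plausible reading, offered here only as an assumption, is a multi-hot vector marking which entity types occur anywhere in the sentence, derived from the existing BIO tags. The function name `sentence_label_vector` and the `ENTITY_TYPES` inventory below are hypothetical and not taken from the paper.

```python
from typing import List


def sentence_label_vector(tags: List[str], entity_types: List[str]) -> List[int]:
    """Return a multi-hot vector: 1 if an entity type occurs in the sentence.

    Assumption: tags follow a BIO-style scheme such as "B-PER" / "I-PER" / "O".
    """
    # Collect the type suffix of every non-"O" tag, e.g. "B-PER" -> "PER".
    present = {tag.split("-", 1)[1] for tag in tags if "-" in tag}
    return [1 if entity_type in present else 0 for entity_type in entity_types]


# Hypothetical type inventory for illustration only.
ENTITY_TYPES = ["PER", "LOC", "ORG"]

tags = ["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC"]
vector = sentence_label_vector(tags, ENTITY_TYPES)
print(vector)  # [1, 1, 0]
```

Under this reading, the multi-label classification target needs no extra annotation, since it is computed from the same tags that supervise the named entity recognition task.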
Authors
廖梦
贾真
李天瑞
LIAO Meng; JIA Zhen; LI Tianrui (School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China; Manufacturing Industry Chains Collaboration and Information Support Technology Key Laboratory of Sichuan Province, Chengdu 611756, China; National Engineering Laboratory of Integrated Transportation Big Data Application Technology, Chengdu 611756, China)
Source
《计算机科学》
CSCD
Peking University Core Journal
2024, Issue 3, pp. 198-204 (7 pages)
Computer Science
Funding
National Natural Science Foundation of China, General Program (62176221).
Keywords
Named entity recognition
Label information
Attention mechanism
Biaffine mechanism
Pre-trained model