Abstract
This paper presents the BCBGAC (BERT-CNN-BiGRU-Attention-CRF) model, which improves the accuracy of Chinese named entity recognition by integrating glyph structure information into Chinese character embeddings. BCBGAC uses the Wubi method to decompose each Chinese character, in writing order, into its basic root components. The root components are encoded with the Skip-Gram method, and the resulting component encoding matrix is fed into a convolutional neural network (CNN), which extracts glyph structure features and produces a glyph structure vector for the character. This glyph structure vector is concatenated with the basic character vector generated by the BERT model to obtain the final character embedding. The character embeddings are then passed through a BiGRU network to capture the contextual dependencies among them, an attention mechanism is introduced to weight the character vectors, and a CRF decoding layer produces the optimal label sequence for the entities. Experimental results on two datasets show that BCBGAC achieves better entity recognition than the baseline models, reaching F1 scores of 96.06% and 95.48% respectively, which verifies the effectiveness of the BCBGAC model for Chinese named entity recognition.
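To make the described pipeline concrete, the following is a minimal PyTorch sketch of the BCBGAC architecture as outlined in the abstract: a CNN over Wubi component embeddings produces a glyph structure vector, which is concatenated with the BERT character vector, then passed through a BiGRU, an attention weighting step, and a CRF decoding layer. All layer sizes, the sigmoid-gated form of the attention, and the use of the third-party pytorch-crf package are illustrative assumptions, not the authors' published configuration.

```python
# Hedged sketch of BCBGAC; hyperparameters and the pytorch-crf dependency
# (pip install pytorch-crf) are assumptions for illustration only.
import torch
import torch.nn as nn
from torchcrf import CRF  # assumption: third-party pytorch-crf package


class BCBGAC(nn.Module):
    def __init__(self, num_components, num_tags,
                 comp_dim=50, bert_dim=768, gru_hidden=128):
        super().__init__()
        # Embeddings for Wubi root components; in the paper these are
        # pre-trained with Skip-Gram and would be loaded here.
        self.comp_embed = nn.Embedding(num_components, comp_dim)
        # CNN over each character's component sequence extracts
        # glyph structure features.
        self.conv = nn.Conv1d(comp_dim, comp_dim, kernel_size=2, padding=1)
        self.gru = nn.GRU(bert_dim + comp_dim, gru_hidden,
                          batch_first=True, bidirectional=True)
        # Per-position attention weight (sigmoid gating is an assumption).
        self.attn = nn.Linear(2 * gru_hidden, 1)
        self.fc = nn.Linear(2 * gru_hidden, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def glyph_vectors(self, comp_ids):
        # comp_ids: (batch, seq_len, max_components) Wubi component indices
        # in writing order.
        b, s, c = comp_ids.shape
        x = self.comp_embed(comp_ids).view(b * s, c, -1).transpose(1, 2)
        x = torch.relu(self.conv(x)).max(dim=2).values  # pool over components
        return x.view(b, s, -1)                          # (batch, seq_len, comp_dim)

    def forward(self, bert_vecs, comp_ids, tags=None):
        # Concatenate BERT character vectors with glyph structure vectors.
        x = torch.cat([bert_vecs, self.glyph_vectors(comp_ids)], dim=-1)
        h, _ = self.gru(x)                                # contextual encoding
        h = h * torch.sigmoid(self.attn(h))               # attention re-weighting
        emissions = self.fc(h)
        if tags is not None:
            return -self.crf(emissions, tags)             # training: NLL loss
        return self.crf.decode(emissions)                 # inference: best tag path
```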
Authors
CHEN Jin-yu (陈金玉), WANG Ming-yang (王名扬), LIU Xu (刘旭)
College of Computer and Control Engineering, Northeast Forestry University, Harbin 150036, China
Source
Journal of Northeast Normal University (Natural Science Edition) (《东北师大学报(自然科学版)》)
Indexed in: CAS; Peking University Core Journals (北大核心)
2024, No. 2, pp. 60-68 (9 pages)
Funding
National Natural Science Foundation of China (71473034)
Natural Science Foundation of Heilongjiang Province (LH2019G001)
Keywords
Chinese named entity recognition; glyph structure embedding; BiGRU; attention mechanism