Chinese named entity recognition method based on Transformer and hidden Markov model (cited by: 8)
Abstract: A character-level Chinese named entity recognition method based on the Transformer and a hidden Markov model (HMM) is proposed. The position-encoding function of the Transformer is modified so that it expresses both the relative position and the direction between characters. The character sequence encoded by the Transformer is used to compute the transition matrix and the emission matrix, from which an HMM is built to generate a set of named-entity soft labels. These soft labels are fed into a Bert-NER model, whose parameters are updated with a divergence loss function, and the model outputs the final strong (hard) named-entity labels that identify the entities. In comparative experiments, the method reaches F1 values of 75.11% on the Chinese CLUENER-2020 dataset and 68% on the Weibo dataset, improving the performance of Chinese named entity recognition.
Authors: LI Jian; XIONG Qi; HU Ya-ting; LIU Kong-yu (College of Information Technology, Jilin Agricultural University, Changchun 130118, China; Jilin Bioinformatics Research Center, Changchun 130118, China)
Source: Journal of Jilin University: Engineering and Technology Edition (indexed in EI, CAS, CSCD, PKU Core), 2023, No. 5, pp. 1427-1434 (8 pages)
Funding: Industrial Technology Research and Development Project of the Jilin Provincial Development and Reform Commission (2020C037-7); Jilin Province Science and Technology Development Plan Project (20230508026RC); Changchun Science and Technology Development Plan Project (21ZGN26).
Keywords: artificial intelligence; Chinese named entity recognition; hidden Markov model (HMM); Transformer encoder; position encoding
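Illustrative sketch: the abstract describes building an HMM from transition and emission matrices to produce soft labels, then training a tagger against them with a divergence loss. The Python below is a minimal sketch of that idea, not the authors' implementation. It assumes the soft labels are the per-character posterior tag marginals from the forward-backward algorithm and that the divergence is the KL divergence; the abstract specifies neither. The tag set and all matrix values are hypothetical stand-ins for quantities the paper derives from Transformer-encoded characters.

```python
import math


def forward_backward(init, trans, emit):
    """Posterior tag marginals (soft labels) for one sentence.

    init[k]     : probability that the first character has tag k
    trans[j][k] : probability of moving from tag j to tag k
    emit[t][k]  : likelihood of the character at position t under tag k
    """
    T, K = len(emit), len(init)

    def normalise(row):
        s = sum(row)
        return [x / s for x in row]

    # Forward pass, rescaled at every step to avoid numerical underflow.
    alpha = [normalise([init[k] * emit[0][k] for k in range(K)])]
    for t in range(1, T):
        alpha.append(normalise([
            emit[t][k] * sum(alpha[t - 1][j] * trans[j][k] for j in range(K))
            for k in range(K)
        ]))

    # Backward pass, rescaled the same way.
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = normalise([
            sum(trans[j][k] * emit[t + 1][k] * beta[t + 1][k] for k in range(K))
            for j in range(K)
        ])

    # The per-position product, renormalised, is the posterior over tags.
    return [normalise([alpha[t][k] * beta[t][k] for k in range(K)])
            for t in range(T)]


def kl_divergence(soft, pred, eps=1e-12):
    """Mean KL(soft || pred) over the characters of one sentence."""
    total = 0.0
    for p_row, q_row in zip(soft, pred):
        total += sum(p * math.log((p + eps) / (q + eps))
                     for p, q in zip(p_row, q_row) if p > 0)
    return total / len(soft)


# Hypothetical three-tag scheme and hand-picked numbers for a 3-character input.
TAGS = ["O", "B", "I"]
init = [0.8, 0.2, 0.0]
trans = [[0.7, 0.3, 0.0],
         [0.2, 0.1, 0.7],
         [0.5, 0.1, 0.4]]
emit = [[0.9, 0.1, 0.1],
        [0.2, 0.7, 0.2],
        [0.1, 0.2, 0.8]]

soft_labels = forward_backward(init, trans, emit)
# Stand-in for the tagger's predicted distribution before any training.
uniform_pred = [[1.0 / len(TAGS)] * len(TAGS) for _ in emit]
loss = kl_divergence(soft_labels, uniform_pred)
```

In a training loop, `uniform_pred` would be replaced by the tagger's softmax output for each character, and `loss` would be minimised with respect to the tagger's parameters.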
