
Chinese named entity recognition based on multi-criteria fusion (多准则融合的中文命名实体识别方法)
Cited by: 4
Abstract: To improve the recognition rate of Chinese named entity recognition (NER), a multi-criteria fusion model is proposed. A character-based BERT (bidirectional encoder representations from transformers) language model serves as the linguistic feature extraction layer and is connected to a multi-criteria shared connection layer and a conditional random field (CRF) layer to form the fusion model. A large-scale mixed Chinese corpus was built, the model parameters were optimized, and the BERT language model was pre-trained on a single GPU (graphics processing unit). The fusion model was trained independently and jointly on the MSRA-NER and RMRB-98-1 entity annotation sets, yielding a single-criterion Chinese NER model for each corpus and a multi-criteria fusion Chinese NER model. The results show that the multi-criteria fusion model can mine information shared between the corpora and improve the recognition rate of Chinese named entities. Its F1 values on the MSRA-NER and RMRB-98-1 entity annotation sets are 94.46% and 94.32%, respectively, which are better than those of other existing models.
Author: Cai Qing (蔡庆) (Jiangsu Institute of Automation, Lianyungang 222061, China)
Source: Journal of Southeast University (Natural Science Edition) (《东南大学学报(自然科学版)》), 2020, No. 5, pp. 929-934 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: Supported by the 13th Five-Year Plan Equipment Pre-research Common Technology and Field Fund (41412030902).
Keywords: named entity recognition; bidirectional encoder representations from transformers (BERT); conditional random field (CRF); multi-criteria learning
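
The abstract reports results as F1 values but does not describe the scoring protocol. As a rough illustration only, the sketch below computes an entity-level F1 under two assumptions that are not taken from the paper: tags follow the BIO scheme, and a predicted entity counts as correct only when its boundaries and type both match the gold annotation exactly (the convention used in CoNLL-style evaluation). The function names `extract_spans` and `f1_score` and the `PER`/`LOC` labels are hypothetical.

```python
from typing import List, Set, Tuple

# (start, end_exclusive, entity_type); an assumed representation, not the paper's
Span = Tuple[int, int, str]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes an entity that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def f1_score(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match, entity-level F1 over a list of sentences."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with its sentence index so spans stay distinct.
        gold_spans |= {(idx,) + s for s in extract_spans(g)}
        pred_spans |= {(idx,) + s for s in extract_spans(p)}
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data: the second entity is predicted with a wrong right boundary.
    gold = [["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]]
    pred = [["B-PER", "I-PER", "O", "B-LOC", "O"]]
    print(f1_score(gold, pred))
```

In the toy example, one of the two predicted entities matches the gold annotation exactly, so precision and recall are both 0.5. A partially overlapping entity earns no credit under exact matching, which is why boundary errors lower the score as much as type errors.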
