
融合迁移学习的中文命名实体识别 (cited by: 23)

Research on Chinese Named Entity Recognition Fusing Transfer Learning
Abstract: Named entity recognition (NER) is a fundamental task in natural language processing and a cornerstone of higher-level tasks such as entity relation extraction and event extraction. Improving NER performance when annotated corpora are unavailable or scarce is an important research direction in the field. To address this problem, the paper proposes an instance-based transfer learning algorithm, TLNER_AdaBoost. The method improves the quality of the training corpus by automatically adjusting the weights of the training instances and by computing the transfer ability of the auxiliary training samples. It is compared experimentally against two baselines: a self-learning method that uses incompletely annotated corpora, and a conditional random field method that uses fully annotated corpora. The experiments show that the proposed method improves the accuracy, recall, and F-measure of named entity recognition while greatly reducing the manual annotation workload.
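The abstract does not spell out how TLNER_AdaBoost adjusts instance weights, so the following is only a minimal sketch of a generic TrAdaBoost-style reweighting round, offered as an illustration of instance-based transfer learning in general. The function name `tradaboost_reweight`, its parameters, and the clamping of the error rate are all assumptions made for this example; the "transfer ability" computation mentioned in the abstract is not modeled here.

```python
import math


def tradaboost_reweight(weights, is_auxiliary, misclassified, n_rounds):
    """One TrAdaBoost-style reweighting round (hypothetical helper).

    weights       -- current positive weight of each training instance
    is_auxiliary  -- True for auxiliary (source-domain) instances,
                     False for target-domain instances
    misclassified -- True where the current weak learner was wrong
    n_rounds      -- total number of boosting rounds planned
    Returns (normalized_new_weights, target_error).
    """
    n_aux = max(sum(is_auxiliary), 1)
    # Fixed decay factor (< 1 when n_aux > 1) for auxiliary instances.
    beta_aux = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_aux) / n_rounds))

    # Weighted error of the weak learner, measured on target instances only.
    target_total = sum(w for w, aux in zip(weights, is_auxiliary) if not aux)
    target_wrong = sum(
        w
        for w, aux, bad in zip(weights, is_auxiliary, misclassified)
        if bad and not aux
    )
    eps = target_wrong / target_total
    # Assumption: clamp the error so the target factor stays finite and < 1.
    eps_clamped = min(max(eps, 1e-10), 0.499)
    beta_target = eps_clamped / (1.0 - eps_clamped)

    new_weights = []
    for w, aux, bad in zip(weights, is_auxiliary, misclassified):
        if not bad:
            new_weights.append(w)                # correct: weight unchanged
        elif aux:
            new_weights.append(w * beta_aux)     # wrong auxiliary: down-weight
        else:
            new_weights.append(w / beta_target)  # wrong target: up-weight

    total = sum(new_weights)
    return [w / total for w in new_weights], eps


# Toy usage: two auxiliary and three target instances, uniform start.
weights, error = tradaboost_reweight(
    weights=[0.2, 0.2, 0.2, 0.2, 0.2],
    is_auxiliary=[True, True, False, False, False],
    misclassified=[True, False, True, False, False],
    n_rounds=10,
)
```

In this scheme, auxiliary instances that the weak learner gets wrong lose influence over successive rounds, while misclassified target instances gain influence, which is one common way to filter out auxiliary data that does not transfer well.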
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2017, No. 2, pp. 346-351 (6 pages)
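Funding: National Natural Science Foundation of China (61462054, 61363044); General Program of the Yunnan Provincial Department of Science and Technology (2015FB135); Key Project of the Scientific Research Fund of the Yunnan Provincial Department of Education (2014Z021); Provincial Talent Training Project of Kunming University of Science and Technology (KKSY201403028)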
Keywords: named entity recognition; transfer learning; machine learning; TLNER_AdaBoost
