
A Rapid Algorithm to Chinese Named Entity Recognition Based on Single Character Hints (cited by: 24)
Abstract: Conditional random field (CRF) models are increasingly widely used in natural language processing, especially for sequence labeling. Standard linear-chain CRFs are usually trained with L-BFGS parameter estimation, which converges slowly. Building on an analysis of model complexity, this paper proposes an improved, faster CRF training algorithm. First, small-scale single-character hint features are introduced to shrink the feature space. Second, task-specific hand-crafted knowledge is applied during inference to prune the Viterbi and Baum-Welch lattice search space, which speeds up training. Experiments on the China 863 Program named entity recognition evaluation corpus and the SIGHAN 2006 corpus show that the algorithm substantially reduces training cost without degrading Chinese named entity recognition accuracy.
Source: Journal of Chinese Information Processing (中文信息学报), indexed in CSCD and the Peking University Core Journals list, 2008, Issue 1, pp. 104-110 (7 pages)
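Funding: National Natural Science Foundation of China (60773027, 60736044); National 863 Program key project (2006AA010108); National 242 Program project (2006A40); State Language Commission project (MZ115-021)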
Keywords: computer application; Chinese information processing; Chinese named entity recognition; conditional random fields; natural language processing; machine learning
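The abstract does not spell out the pruning rule, so the following is only a minimal sketch of the general idea of shrinking the Viterbi lattice with per-position label constraints. The function names `constrained_viterbi` and `allowed_from_hints`, the window-based rule, and the convention that label 0 means "outside any entity" are all illustrative assumptions, not the paper's actual method. The point it illustrates is that when most positions admit a single label, the per-position cost falls well below the usual quadratic cost in the number of labels.

```python
from typing import Dict, List, Sequence, Set


def allowed_from_hints(chars: Sequence[str], hint_chars: Set[str],
                       num_labels: int, outside_label: int = 0,
                       window: int = 1) -> List[List[int]]:
    """Hypothetical pruning rule (an assumption, not taken from the paper):
    a position keeps every label only if a hint character lies within
    `window` positions of it; otherwise it may only take `outside_label`."""
    hint_positions = [i for i, c in enumerate(chars) if c in hint_chars]
    allowed = []
    for i in range(len(chars)):
        near_hint = any(abs(i - h) <= window for h in hint_positions)
        allowed.append(list(range(num_labels)) if near_hint else [outside_label])
    return allowed


def constrained_viterbi(emissions: Sequence[Sequence[float]],
                        transitions: Sequence[Sequence[float]],
                        allowed: Sequence[Sequence[int]]) -> List[int]:
    """Best-scoring label path that visits only the labels in allowed[t].

    emissions[t][y]   : score of label y at position t
    transitions[p][y] : score of moving from label p to label y
    allowed[t]        : non-empty list of labels permitted at position t
    """
    n = len(emissions)
    if n == 0:
        return []
    # best[t][y] is the best score of any permitted path ending in y at t.
    best: List[Dict[int, float]] = [dict() for _ in range(n)]
    back: List[Dict[int, int]] = [dict() for _ in range(n)]
    for y in allowed[0]:
        best[0][y] = emissions[0][y]
    for t in range(1, n):
        for y in allowed[t]:
            # Only labels that survived pruning at t-1 are considered.
            prev, score = max(
                ((p, s + transitions[p][y]) for p, s in best[t - 1].items()),
                key=lambda pair: pair[1],
            )
            best[t][y] = score + emissions[t][y]
            back[t][y] = prev
    # Trace back from the best final label.
    y = max(best[-1], key=best[-1].get)
    path = [y]
    for t in range(n - 1, 0, -1):
        y = back[t][y]
        path.append(y)
    path.reverse()
    return path
```

The same `allowed` sets could in principle restrict the forward-backward sums used during training, which is where most of the training time would be saved; that part is not shown here.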

