Automatic entity identification based on CRF and multilevel algorithm model (cited by: 3)
Abstract: Automatic entity identification is a powerful means of acquiring information and one of the key technologies in natural language processing (NLP). Named entity recognition has been studied extensively and is approaching maturity, whereas other kinds of entity mentions in Chinese text (nominal and pronominal) have received little attention. This paper therefore proposes a unified method for identifying named and nominal entity mentions. The method first introduces character, word segmentation, and part-of-speech (POS) tagging information about the entities into a conditional random field (CRF), and then applies a multilevel algorithm model to refine the entities already identified and to recall entities that were missed. In experiments on the standard ACE corpus, the method reaches a precision (正确率, given as "accuracy" in the original English abstract) of 75.56% and a recall of 72.52%. The results indicate that the method is effective for the entity identification problem.
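As an illustration of how character, segmentation, and POS information might be combined into per-character input features for a CRF, here is a minimal Python sketch. The feature names, the B/M/E/S word-position scheme, the one-character context window, and the sample POS tags are all assumptions made for illustration; the abstract does not specify the paper's actual feature templates, and no CRF training is shown.

```python
from typing import Dict, List, Tuple

# A token is (word, POS tag), as an upstream segmenter and tagger
# might produce it. The tag set used below is hypothetical.
Token = Tuple[str, str]


def char_features(tokens: List[Token]) -> List[Dict[str, str]]:
    """Expand segmented, POS-tagged words into one feature dict per character."""
    rows: List[Dict[str, str]] = []
    for word, pos in tokens:
        for i, ch in enumerate(word):
            # Position of the character inside its word:
            # S = single-character word, B = begin, M = middle, E = end.
            if len(word) == 1:
                seg = "S"
            elif i == 0:
                seg = "B"
            elif i == len(word) - 1:
                seg = "E"
            else:
                seg = "M"
            rows.append({"char": ch, "seg": seg, "pos": pos})

    # Add a one-character context window (an assumed, common choice).
    feats: List[Dict[str, str]] = []
    for i, row in enumerate(rows):
        f = dict(row)
        f["prev_char"] = rows[i - 1]["char"] if i > 0 else "<BOS>"
        f["next_char"] = rows[i + 1]["char"] if i < len(rows) - 1 else "<EOS>"
        feats.append(f)
    return feats


# Illustrative input: three segmented words with made-up POS tags.
sentence = [("北京", "ns"), ("的", "u"), ("大学", "n")]
features = char_features(sentence)
```

Each resulting dict could be passed, one sequence per sentence, to a CRF toolkit that accepts dictionary features, with entity labels (for example a B/I/O scheme) supplied as the target sequence.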
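Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2016, No. 11, pp. 141-147 (7 pages)
Funding: National Natural Science Foundation of China (No. 61271304); Key Project of the Science and Technology Development Program of the Beijing Municipal Education Commission and Category B Key Project of the Beijing Natural Science Foundation (No. KZ201311232037); Innovation Team Construction and Teacher Career Development Program for Beijing Municipal Universities (No. IDHT20130519)
Keywords: entity identification; conditional random field; segmentation; multilevel algorithm model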
