
Named Entity Recognition for the Tourism Domain Based on Cascaded Conditional Random Fields

Cited by: 37
Abstract: This paper proposes a method for named entity recognition in the tourism domain based on a cascaded conditional random field (CRF) model. The low-layer CRF segments text at the character level and draws on feature dictionaries, including a table of characters common in tourist attraction names, a table of common attraction suffixes, and a table of characters common in place names, to recognize simple tourism named entities. Its output is passed to the high-layer model, which segments at the word level and combines more complex features to recognize nested attraction names, local specialties and snacks, and locations. Two groups of experiments were conducted. In open testing, the cascaded CRF model improves the F-score by 8 percentage points over a single-layer model; compared with an HMM model, it improves precision by 8 percentage points, recall by 22 percentage points, and the F-score by 15 percentage points.
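The two-layer flow described in the abstract can be illustrated with a minimal feature-extraction sketch. This is an illustration under stated assumptions, not the authors' implementation: the dictionary contents, the function names `char_features` and `word_features`, the feature names, and the `LOC`/`ATTR` labels are all hypothetical, and the CRF training and decoding steps are omitted. The low-layer labels below are written by hand to stand in for what a trained low-layer model would output.

```python
from typing import Dict, List, Set

# Hypothetical toy dictionaries; the paper's actual tables are not reproduced here.
ATTRACTION_CHARS: Set[str] = {"石", "林", "湖", "山"}     # characters common in attraction names
ATTRACTION_SUFFIXES: Set[str] = {"寺", "湖", "山", "林"}  # common attraction suffixes
PLACE_CHARS: Set[str] = {"昆", "明", "大", "理"}          # characters common in place names


def char_features(sentence: str, i: int) -> Dict[str, object]:
    """Low-layer features for the character at position i (character granularity)."""
    ch = sentence[i]
    return {
        "char": ch,
        "prev_char": sentence[i - 1] if i > 0 else "<BOS>",
        "next_char": sentence[i + 1] if i < len(sentence) - 1 else "<EOS>",
        "in_attraction_chars": ch in ATTRACTION_CHARS,
        "is_attraction_suffix": ch in ATTRACTION_SUFFIXES,
        "in_place_chars": ch in PLACE_CHARS,
    }


def word_features(words: List[str], low_labels: List[str], i: int) -> Dict[str, object]:
    """High-layer features for word i (word granularity), reusing low-layer labels."""
    return {
        "word": words[i],
        "low_label": low_labels[i],
        "prev_low_label": low_labels[i - 1] if i > 0 else "<BOS>",
        "next_low_label": low_labels[i + 1] if i < len(words) - 1 else "<EOS>",
    }


# Low layer: one feature dict per character.
sentence = "昆明石林"
low_input = [char_features(sentence, i) for i in range(len(sentence))]

# Placeholder for the low-layer output (hand-written, not produced by a model).
words = ["昆明", "石林"]
low_labels = ["LOC", "ATTR"]

# High layer: one feature dict per word, carrying the low-layer result forward.
high_input = [word_features(words, low_labels, i) for i in range(len(words))]
```

In a full system, each list of feature dictionaries would be handed to a CRF toolkit for training and decoding. The point of the sketch is only the hand-off: the high layer sees the low layer's labels as additional features.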
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2009, Issue 5, pp. 47-52 (6 pages)
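Funding: National Natural Science Foundation of China (60863011, 60663004); Doctoral Program Foundation of the Ministry of Education (20050007023); Yunnan Province Reserve Talent Fund for Young and Middle-aged Academic Leaders (2007PY01-11); Key Fund of the Yunnan Provincial Department of Education (07Z11139); Kunming University of Science and Technology Doctoral Fund (2006-12)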
Keywords: computer application; Chinese information processing; tourism domain; named entity recognition; cascaded conditional random fields; feature template
