
面向商务信息抽取的产品命名实体识别研究 (cited by: 47)

Study on Product Named Entity Recognition for Business Information Extraction
Abstract: The growth of electronic business has fueled research interest in business information extraction and market intelligence management. As one of the key techniques in this area, product named entity recognition (product NER) has begun to draw attention in natural language processing. This paper defines product named entities for business information extraction, systematically analyzes the characteristics and challenges of recognizing them, and presents an approach based on the hierarchical hidden Markov model (HHMM), implemented as a prototype system that recognizes and labels product named entities in Chinese free text. Experiments in the digital electronics and mobile phone domains show that the approach performs well in both, with overall F-measures of 79.7%, 86.9%, and 75.8% for product name, product model, and product brand entities respectively. A comparison with a maximum entropy model indicates that the HHMM is better able to represent multi-scale nested sequences.
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2006, No. 1, pp. 7-13 (7 pages)
Funding: National Natural Science Foundation of China (60372016); Beijing Natural Science Foundation (4052027)
Keywords: computer application; Chinese information processing; product named entity recognition; business information extraction; hierarchical hidden Markov model (HHMM)
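
Note on the reported metric: the F-measure is the harmonic mean of precision and recall. The following is a minimal sketch of how such a score could be computed for entity recognition output, assuming exact-match scoring of labeled spans (this record does not state the paper's matching criterion). The function name `f_measure`, the labels `BRAND`, `MODEL`, and `PRODUCT`, and the sample spans are all hypothetical and chosen only for illustration.

```python
from typing import Set, Tuple

# An entity is (start offset, end offset, label); all values here are hypothetical.
Entity = Tuple[int, int, str]


def f_measure(gold: Set[Entity], predicted: Set[Entity]) -> float:
    """Balanced F-measure, counting a prediction as correct only if
    its span and label both match a gold entity exactly (an assumption)."""
    correct = len(gold & predicted)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Nested annotations: a brand and a model span inside a product name span.
gold = {(0, 2, "BRAND"), (2, 6, "MODEL"), (0, 8, "PRODUCT"), (12, 14, "BRAND")}
# The predicted product name has a wrong right boundary; one brand is missed.
predicted = {(0, 2, "BRAND"), (2, 6, "MODEL"), (0, 7, "PRODUCT")}

score = f_measure(gold, predicted)
print(f"F-measure: {score:.3f}")
```

In this sample, two of three predictions are correct (precision 2/3) and two of four gold entities are found (recall 1/2), so a boundary error on the outer span lowers the score even though both inner entities match.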

