Mechanism for Title Classification Based on Conceptual and Associated Expansion (基于概念和关联扩充的文本标题分类机制)
Abstract Text classification is an important means of processing machine-readable text. This paper presents a text classification mechanism based on titles. The basic idea is as follows: because text titles are both important and concise, concept expansion is performed using a Chinese semantic classification tree, and association expansion is performed using an association matrix derived from a corpus. These expansions enrich the meaning of a title through synonymy and collocation relationships. The similarity between the expanded feature vector of each class and that of a title is then used to determine the class to which the text belongs, which yields relatively high classification precision. The method does not depend on a Chinese parser or on domain knowledge bases; it is fast and widely applicable.
Source Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD, Peking University Core Journal, 2005, No. 5, pp. 732-734 (3 pages)
Funding Supported by the National Natural Science Foundation of China (60373095)
Keywords text classification; conceptual expansion; associated expansion; vector space model
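
The abstract describes the pipeline only in outline, so the following is a minimal sketch of the general idea, not the authors' implementation. It assumes a toy word-to-concept map standing in for the Chinese semantic classification tree, a toy association table standing in for the corpus-derived association matrix, and cosine similarity as the vector comparison. All names, words, and weights (`CONCEPT_OF`, `ASSOCIATION`, `expand`, `classify`, the 0.5 weights) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy resources. The paper's actual semantic tree and
# corpus-derived association matrix are not given in the abstract.
CONCEPT_OF = {  # word -> concept node in a semantic classification tree
    "stock": "finance", "share": "finance", "market": "finance",
    "match": "sport", "team": "sport", "league": "sport",
}
ASSOCIATION = {  # word -> {associated word: strength}
    "stock": {"price": 0.8, "investor": 0.6},
    "team": {"coach": 0.7, "goal": 0.5},
}


def expand(words, concept_weight=0.5, assoc_weight=0.5):
    """Build a feature vector from words plus concept and association expansion."""
    vec = Counter()
    for w in words:
        vec[w] += 1.0
        # Concept expansion: add the word's concept node as an extra feature.
        concept = CONCEPT_OF.get(w)
        if concept is not None:
            vec["concept:" + concept] += concept_weight
        # Association expansion: add words that co-occur with this word.
        for other, strength in ASSOCIATION.get(w, {}).items():
            vec[other] += assoc_weight * strength
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(title_words, class_words):
    """Return the best-matching class label and all similarity scores."""
    title_vec = expand(title_words)
    scores = {
        label: cosine(title_vec, expand(words))
        for label, words in class_words.items()
    }
    return max(scores, key=scores.get), scores


CLASSES = {
    "finance": ["stock", "market", "investor"],
    "sport": ["team", "league", "goal"],
}
label, scores = classify(["share", "price"], CLASSES)
print(label, scores)
```

In this toy setup the title words "share" and "price" appear in neither class word list, so a plain bag-of-words comparison would score both classes at zero. The title matches the finance class only through the shared concept node and the associated word "price", which is the effect the expansion step is meant to provide.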