Chinese Automatic Text Categorization Based on Article Title Information (基于文章标题信息的汉语自动文本分类)

Cited by: 2
Abstract: Text categorization is an important part of text mining and an important research topic in the field of information retrieval. This paper proposes a method for automatic categorization of Chinese text based on article title information. Working within the domain concept framework of the Hierarchical Network of Concepts (HNC) theory, the method uses the domain-bearing words contained in the title to activate the corresponding HNC domains, and thereby classifies the text automatically. Experiments show that, compared with text categorization using an SVM algorithm, the method markedly improves both testing speed and average classification accuracy.
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the PKU Core Journals list; 2008, No. 20, pp. 13-14, 17 (3 pages)
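Funding: National "973" Program project "Research on an Interactive Engine for Natural Language Understanding" (2004CB318104); Knowledge Innovation Program project of the Institute of Acoustics, Chinese Academy of Sciences, "HNC Language Knowledge Processing Theory and Technology"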
Keywords: text categorization; Hierarchical Network of Concepts (HNC) theory; domain
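
The abstract describes title words that carry domain information "activating" HNC domains. A minimal sketch of that general idea follows, under loud assumptions: the lexicon `DOMAIN_LEXICON`, its domain labels, and the functions `activate_domains` and `classify_title` are all hypothetical illustrations, and plain substring matching with vote counting stands in for the paper's actual HNC domain knowledge and activation rules, which the abstract does not specify.

```python
from collections import Counter

# Hypothetical toy lexicon mapping domain-bearing words to domain labels.
# The real HNC domain knowledge base is not described in the abstract.
DOMAIN_LEXICON = {
    "股票": "economy",
    "银行": "economy",
    "足球": "sports",
    "联赛": "sports",
    "导弹": "military",
}


def activate_domains(title, lexicon=DOMAIN_LEXICON):
    """Return a Counter of domains activated by lexicon words found in the title.

    Substring matching is an assumption made here to avoid needing a
    Chinese word segmenter; it is not the paper's procedure.
    """
    votes = Counter()
    for word, domain in lexicon.items():
        if word in title:
            votes[domain] += 1
    return votes


def classify_title(title, lexicon=DOMAIN_LEXICON, default="unknown"):
    """Pick the domain with the most activations, or `default` if none fire."""
    votes = activate_domains(title, lexicon)
    if not votes:
        return default
    # On a tie, most_common keeps the domain that was activated first.
    return votes.most_common(1)[0][0]
```

Because only the title is scanned, the cost per document in this sketch depends on the lexicon size and title length rather than on the full text, which is consistent with the speed advantage the abstract reports, though the sketch itself makes no claim about accuracy.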
