
Research on the Extracting Rules of Text Categorization Based on the Extended Concept Lattice Model

Cited by: 3
Abstract: Text categorization is an active research topic and a core technology in information retrieval and data mining, and it has received wide attention and developed rapidly in recent years. Automatic text categorization is a foundation of text mining, with text feature selection at its core. Concept lattices are an effective tool for rule extraction and data analysis, but the low efficiency of lattice construction has long been an obstacle to their application. This paper studies the extraction of text categorization rules based on the extended concept lattice model, combining rough sets with the extended concept lattice model to extract classification rules. The method uses a concept tree to remove most redundant concepts, so only a small number of concepts need to be built to extract all classification rules; it is efficient, and the rules it extracts are the same as those obtained from the full concept lattice. Experiments run in MATLAB 7.0 show that recall is slightly lower than that of the KNN and SVM algorithms while precision is higher than both, so the classification rules perform comparably to KNN and SVM when applied to text categorization.
Authors: Zhou Wan (周顽), Zhou Caixue (周才学)
Source: Computer Engineering & Science (《计算机工程与科学》), indexed in CSCD and the Peking University Core Journals list, 2010, No. 8, pp. 98-100, 103 (4 pages)
Keywords: text categorization; data mining; rough set; concept lattice; classification rule

References (7)

  • [1] Liu B, Hsu W, Ma Y M. Integrating Classification and Association Rule Mining[C]//Proc of the 4th Int'l Conf on Knowledge Discovery and Data Mining, 1998: 80-86.
  • [2] Wille R. Restructuring Lattice Theory: An Approach Based on Hierarchies of Concepts[M]//Ordered Sets. Dordrecht-Boston: Reidel, 1982.
  • [3] Zhang Wenxiu, Wu Weizhi, Liang Jiye. Rough Set Theory and Methods (粗糙集理论与方法)[M]. Beijing: Science Press, 2003: 107-112.
  • [4] Hu Xue-Gang, Chen Hui. The Mining of Classification Rules Based on Multiple Extended Concept Lattice[C]//Proc of ICMLC'05, 2005: 18-21.
  • [5] Debole F, Sebastiani F. Supervised Term Weighting for Automated Text Categorization[C]//Proc of the 18th ACM Symp on Applied Computing, 2003: 784-788.
  • [6] Lertnattee V, Theeramunkong T. Effect of Term Distributions on Centroid-Based Text Categorization[J]. Information Sciences, 2004, 158(1): 89-115.
  • [7] Wang Hao, Yang Jing, Hu Xue-gang. A New Classification Algorithm Based on Entropy and Relative Reduced Extended Concept Lattice[C]//Proc of ICMLC'04, 2004: 26-29.

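Documents sharing references with this paper (共引文献): 103

Documents co-cited with this paper (同被引文献): 27

Citing documents: 3

Second-level citing documents: 8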
