
Classifying web document based on association rules (基于关联规则的Web文档分类)

Cited by: 8
Abstract: Among existing Web document classifiers, some produce more accurate classification results and others produce more interpretable models, but none combines both advantages. This paper therefore proposes a Web document classification method based on association rules. The method adopts a transaction model and addresses two problems: ① finding the best term association rules in the training document set; ② using these rules to build a Web document classifier. Experiments show that the classifier performs well and trains quickly, and that the rules it generates are easy to understand, update, and adjust.
Source: Computer Engineering and Design (《计算机工程与设计》), CSCD, Peking University Core Journal, 2005, No. 9, pp. 2515-2518 (4 pages)
Keywords: Web document classification; text classification; association rules; web mining
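
The abstract gives only the two steps of the method (mine term association rules, then classify with them), so the following is a minimal sketch under stated assumptions rather than the paper's algorithm. It treats each document as a transaction of terms and keeps rules of the form "term set → class" that pass support and confidence thresholds. The function names `mine_rules` and `classify`, the thresholds, the rule ranking, and the toy data are all hypothetical, and brute-force enumeration of small term sets stands in for a real frequent-itemset miner.

```python
from collections import Counter
from itertools import combinations


def mine_rules(docs, min_support=0.3, min_confidence=0.7, max_terms=2):
    """Mine rules of the form (term set) -> class label.

    docs: list of (terms, label) pairs; each document is one transaction.
    Returns (term_set, label, support, confidence) tuples, best rule first.
    Thresholds and ranking are illustrative assumptions, not from the paper.
    """
    n = len(docs)
    itemset_counts = Counter()  # documents containing the term set
    rule_counts = Counter()     # documents containing the term set, per label
    for terms, label in docs:
        unique_terms = sorted(set(terms))
        for size in range(1, max_terms + 1):
            for combo in combinations(unique_terms, size):
                itemset_counts[combo] += 1
                rule_counts[(combo, label)] += 1

    rules = []
    for (combo, label), count in rule_counts.items():
        support = count / n
        confidence = count / itemset_counts[combo]
        if support >= min_support and confidence >= min_confidence:
            rules.append((frozenset(combo), label, support, confidence))
    # Rank by confidence, then support, then longer (more specific) term sets.
    rules.sort(key=lambda r: (-r[3], -r[2], -len(r[0]), sorted(r[0])))
    return rules


def classify(rules, terms, default=None):
    """Return the label of the highest-ranked rule whose terms all occur."""
    present = set(terms)
    for term_set, label, _support, _confidence in rules:
        if term_set <= present:
            return label
    return default


# Toy training set (hypothetical data for illustration only).
training = [
    (["stock", "market", "bank"], "finance"),
    (["stock", "price", "bank"], "finance"),
    (["match", "team", "goal"], "sports"),
    (["team", "goal", "league"], "sports"),
]
rules = mine_rules(training)
print(classify(rules, ["bank", "stock", "report"]))       # finance
print(classify(rules, ["goal", "team", "coach"]))         # sports
print(classify(rules, ["weather"], default="unknown"))    # unknown
```

Because each rule is a readable "terms → class" pair, a rule list like this can be inspected or edited by hand, which is the interpretability property the abstract emphasizes.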

References (10)

  • 1 Shan Songwei, Feng Shicong, Li Xiaoming. Comparison of several typical feature selection methods for Chinese web page classification [J]. Computer Engineering and Applications, 2003, 39(22): 146-148. Cited by: 76
  • 2 Ch. Cherif Latiri, Ben Yahia S. Generating implicit association rules from textual data [C]. Proceedings of the ACS/IEEE International Conference on Computer Systems and Applications, 2001. 137-143.
  • 3 Liu B, Hsu W, Ma Y. Integrating classification and association rule mining [C]. Proc of the Int Conf on Knowledge Discovery and Data Mining. New York: AAAI Press, 1998. 80-86.
  • 4 Li Wen-min, Han Jia-wei, Pei Jian. CMAR: Accurate and efficient classification based on multiple class association rules [C]. California: Morgan Kaufmann, 2001. 369-376.
  • 5 Zou Xiaofeng, Lu Jianjiang, Song Zilin. A classification system based on fuzzy classification association rules [J]. Journal of Computer Research and Development, 2003, 40(5): 651-656. Cited by: 19
  • 6 John D Holt, Soon M Chung. Mining association rules in text databases using multipass with inverted hashing and pruning [C]. USA: IEEE Computer Society, 2002. 49-56.
  • 7 Han Jia-wei, Micheline Kamber. Data mining: concepts and techniques [M]. California: Morgan Kaufmann Publishers, 2000.
  • 8 Baralis E, Garza P. A lazy approach to pruning classification rules [C]. USA: IEEE Computer Society, 2002. 35-42.
  • 9 Agrawal R, Srikant R. Fast algorithms for mining association rules [C]. California: Morgan Kaufmann, 1994. 487-499.
  • 10 Zhang Xiaohui, He Yaodong, Wan Jiahua, Zhao Hong. An improved algorithm for discovering association rules [J]. Journal of Northeastern University (Natural Science), 2001, 22(4): 401-404. Cited by: 9


