
A New Method of Online Adaptive Classification of Web Pages (cited by: 1)
Abstract: Automatic classification of Web documents is an important research topic in Web mining. The document vector space model (VSM) is the foundation of automatic document classification, but eliminating redundant attributes and reducing the dimensionality of the vector space remain difficult. This paper applies rough set theory to generalize the data of the information system formed by the sample document set and to compute an optimal reduced attribute set for the documents. This greatly reduces the dimensionality of the document feature space, lessens the interference of redundant attributes with classification, and improves classification efficiency. In addition, by exploiting the adaptive classification and incremental learning properties of the Fuzzy ARTMAP (adaptive resonance theory mapping) neural network, online adaptive classification of Web documents is achieved.
Source: Journal of Chongqing University (Natural Science Edition); indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2003, No. 7, pp. 47-51 (5 pages)
Keywords: web page classification; rough sets; attribute reduction; online adaptive classification; Web documents
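
The abstract describes using rough set theory to find a reduced attribute set before classification. The record does not give the paper's algorithm, so the following is only a minimal sketch of the general idea, using a greedy QuickReduct-style heuristic swapped in for illustration. The toy term list, documents, labels, and the function names `dependency` and `quick_reduct` are all assumptions, and a greedy search is not guaranteed to find a minimal reduct.

```python
from collections import defaultdict

def dependency(rows, labels, attrs):
    """Rough-set dependency degree of the labels on the attribute subset.

    Rows with identical values on `attrs` form one equivalence class; a class
    belongs to the positive region when all of its rows share a single label.
    """
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].append(i)
    positive = sum(len(idx) for idx in classes.values()
                   if len({labels[i] for i in idx}) == 1)
    return positive / len(rows)

def quick_reduct(rows, labels):
    """Greedily add the attribute that raises the dependency degree most,
    until the subset discriminates as well as the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    chosen = []
    while dependency(rows, labels, chosen) < target:
        best = max((a for a in all_attrs if a not in chosen),
                   key=lambda a: dependency(rows, labels, chosen + [a]))
        chosen.append(best)
    return chosen

# Hypothetical toy data: binary term-presence vectors (a crude VSM).
terms = ["goal", "match", "stock", "the"]
rows = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
labels = ["sports", "sports", "finance", "finance"]

reduct = quick_reduct(rows, labels)
print([terms[a] for a in reduct])  # → ['goal']
```

In this toy table a single term already separates the two classes, so the four-dimensional vectors shrink to one dimension; redundant terms such as "the", which appears in every document, contribute nothing to the dependency degree and are never selected.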

References (9)

  • 1 LI Xiaoli, LIU Jimin, SHI Zhongzhi. A Chinese web page classifier based on support vector machines combined with unsupervised clustering [J]. Chinese Journal of Computers, 2001, 24(1): 62-68. (cited by: 108)
  • 2 KONTKANEN P, MYLLYMAKI P, SILANDER T, et al. BAYDA: software for Bayesian classification and feature selection [A]. AGRAWAL R, STOLORZ P E, PIATETSKY-SHAPIRO G, eds. Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining (KDD'98) [C]. Menlo Park: AAAI Press, 1998, 254-258.
  • 3 YANG Y. Expert network: Effective and efficient learning from human decisions in text categorization and retrieval [A]. Proc. Seventeenth International ACM SIGIR Conference on Research and Development in Information Retrieval [C]. Dublin, 1994.
  • 4 APTE C, DAMERAU F, WEISS S. Automated learning of decision rules for text categorization [J]. ACM Transactions on Information Systems, 1994, 12(3): 233-251.
  • 5 SALTON G, WONG A, YANG C S. A vector space model for automatic indexing [J]. Communications of the ACM, 1975, 18(11): 613-620.
  • 6 SALTON G. Introduction to Modern Information Retrieval [M]. New York: McGraw-Hill Book Company, 1983.
  • 7 PAWLAK Z. Rough Sets: Theoretical Aspects of Reasoning About Data [M]. Kluwer Academic Publishers, 1991.
  • 8 HAN J, FU Y. Dynamic Generation and Refinement of Concept Hierarchies for Knowledge Discovery in Databases [A]. Proc. AAAI'94 Workshop on Knowledge Discovery in Databases (KDD'94) [C]. 1994, 157-168.
  • 9 CARPENTER G A. Fuzzy ARTMAP: A Neural Network Architecture for Incremental Supervised Learning of Analog Multidimensional Maps [J]. IEEE Trans. Neural Networks, 1992, 3(5): 698-713.

