Abstract
With the rapid development of the Internet, web information resources are expanding quickly, and how to mine useful knowledge efficiently and accurately from the vast body of web text has become an active research topic. This paper first introduces the basic concepts of web text categorization and briefly describes several key techniques in the categorization process. It then improves the traditional attribute reduction method and gives a preliminary demonstration that the improvement lowers the dimension of the decision table, which reduces the amount of computation and improves efficiency.
Source
《微计算机信息》 (Microcomputer Information)
2009, Issue 27, pp. 180-181, 8 (3 pages in total)
Control & Automation
Funding
Funding agency: National Natural Science Foundation of China (60863002)
Keywords
Web text categorization
Rough set
Attribute reduction
Decision table