Abstract
Among existing Web document classifiers, some produce more accurate classification results and others produce more interpretable models, but none combines both advantages. This paper therefore proposes a Web document classification method based on association rules. The method adopts a transaction model and addresses two main problems: ① discovering the best term association rules in the training document set; ② using those rules to build a Web document classifier. Experiments show that the classifier performs well and trains quickly, and that the generated rules are easy to understand and easy to update and adjust.
Source
《计算机工程与设计》
CSCD
Peking University Core Journal (北大核心)
2005, Issue 9, pp. 2515-2518 (4 pages)
Computer Engineering and Design