Abstract
Association rules derived from frequent itemsets do not guarantee that the items in a rule's antecedent and consequent are positively correlated, so some of the resulting rules are meaningless. When such rules are used for classification, they yield a large number of useless classification rules and increase running time. To address this, a classification algorithm based on positively correlated frequent itemsets is proposed, with positive correlation defined in terms of mathematical expectation. While mining positively correlated frequent itemsets, the algorithm selects rules by confidence and builds a classifier from the resulting positively correlated association rules. Experiments show that the algorithm greatly reduces the number of frequent itemsets generated, achieves classification accuracy comparable to C4.5 and CMAR, and significantly reduces running time.
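The abstract does not state the exact correlation test, so the following is only a minimal sketch under an assumed criterion: a rule X -> c counts as positively correlated when the observed support of X together with c exceeds the support expected if X and c were independent, sup(X) * sup(c). The function names, the `max_len` bound, and the toy data are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, class_labels, min_sup, min_conf, max_len=2):
    """Return rules (antecedent, label, support, confidence).

    Assumed criterion (not confirmed by the source): keep X -> c only if
    sup(X u {c}) > sup(X) * sup({c}), i.e. observed support exceeds the
    support expected under independence.
    """
    transactions = [frozenset(t) for t in transactions]
    labels = set(class_labels)
    items = sorted(set().union(*transactions) - labels)
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            sup_x = support(antecedent, transactions)
            if sup_x == 0:
                continue  # antecedent never occurs; confidence is undefined
            for label in sorted(labels):
                sup_c = support([label], transactions)
                sup_xc = support(set(antecedent) | {label}, transactions)
                conf = sup_xc / sup_x
                positively_correlated = sup_xc > sup_x * sup_c
                if sup_xc >= min_sup and positively_correlated and conf >= min_conf:
                    rules.append((frozenset(antecedent), label, sup_xc, conf))
    # Highest confidence first, then highest support.
    rules.sort(key=lambda r: (-r[3], -r[2]))
    return rules


def classify(record, rules, default):
    """Predict with the first (highest-ranked) rule whose antecedent matches."""
    record = frozenset(record)
    for antecedent, label, _, _ in rules:
        if antecedent <= record:
            return label
    return default


# Toy data: each transaction holds attribute items plus one class label.
data = [
    {"a", "b", "yes"},
    {"a", "yes"},
    {"a", "b", "yes"},
    {"b", "no"},
    {"c", "no"},
    {"a", "c", "no"},
]
rules = mine_rules(data, {"yes", "no"}, min_sup=0.3, min_conf=0.7)
prediction = classify({"c"}, rules, default="yes")
```

Filtering on correlation before ranking by confidence is what shrinks the rule set in this sketch: a rule such as a -> no is discarded because its observed support falls below the independence expectation, regardless of thresholds.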
Source
Journal of North China Institute of Water Conservancy and Hydroelectric Power (《华北水利水电学院学报》)
2008, No. 4, pp. 65-67 (3 pages)
North China Institute of Water Conservancy and Hydroelectric Power
Keywords
frequent itemsets
association rules
classification
positive correlation