Abstract
Both traditional association rule mining and utility-based association rule mining may overlook rules whose support or utility is not high but whose confidence is very high. Such high-confidence rules serve people whose main goal is to avoid risk or raise their rate of success. To mine rules with low support (or utility) and high confidence, this paper proposes a new algorithm, HCARM. HCARM adopts a partition method to handle large data sets and uses a new pruning strategy to shrink the search space. In addition, by setting a length threshold minlen, HCARM is made suitable for long-pattern mining. Experiments on synthetic data show that the method is effective for mining high-confidence long patterns.
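To make the distinction between support and confidence concrete, the following is a minimal sketch using the standard textbook definitions of the two measures. It is not the HCARM algorithm, whose details the abstract does not give; the toy transactions and the helper names `support` and `confidence` are assumptions invented for illustration.

```python
from fractions import Fraction

# Hypothetical toy transaction database, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
    {"bread"},
    {"milk"},
    {"bread", "milk"},
    {"caviar", "champagne"},
    {"butter"},
]

def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return Fraction(sum(1 for t in db if itemset <= t), len(db))

def confidence(antecedent, consequent, db):
    """support(antecedent | consequent) / support(antecedent).

    Assumes the antecedent occurs at least once in `db`.
    """
    both = set(antecedent) | set(consequent)
    return support(both, db) / support(antecedent, db)

# A frequent rule: bread -> milk.
common = (
    support({"bread", "milk"}, transactions),
    confidence({"bread"}, {"milk"}, transactions),
)

# A rare rule: caviar -> champagne.
rare = (
    support({"caviar", "champagne"}, transactions),
    confidence({"caviar"}, {"champagne"}, transactions),
)

print("bread -> milk:       support", common[0], "confidence", common[1])
print("caviar -> champagne: support", rare[0], "confidence", rare[1])
```

In this toy data the rare rule has support 1/10 and confidence 1, so a miner with a minimum support threshold of, say, 1/5 would discard it even though it never fails. Rules of this kind are what the abstract describes as low support, high confidence.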
Source
《计算机工程与应用》
CSCD
PKU Core Journal (北大核心)
2010, Issue 24, pp. 151-153 (3 pages)
Computer Engineering and Applications
Funding
Doctoral Fund of the Ministry of Education of China (No. 20060255006)
Keywords
association rule
high confidence
long pattern
pruning strategy