Abstract
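The Apriori algorithm is the most widely used algorithm in association rule mining. Its main purpose is to mine interesting frequent itemsets from large volumes of transaction data by way of candidate itemsets, thereby giving users meaningful associations. As databases grow, however, Apriori can run into two thorny problems: generating large numbers of candidate itemsets wastes an enormous amount of computation, and it is unclear how to set the thresholds used to prune useless candidates. Both problems are challenging for ordinary users. This paper proposes the And Code operation, an algorithm that mines frequent itemsets without candidate generation, so users need not agonize over threshold settings. Adding a transaction reduction algorithm also greatly reduces the amount of computation required.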
This paper presents an efficient algorithm for association rule mining, the And Code (AC) algorithm. The AC algorithm quickly discovers all frequent itemsets in a transaction database without candidate generation. Compared with the Apriori algorithm, it avoids generating large numbers of candidates and does not require exact or experience-based thresholds for pruning them. The AC algorithm has three steps. First, after transaction reduction, it assigns a code to every itemset according to its coding rules. Second, it applies the And operation to the itemset codes to obtain all frequent itemset codes. Finally, it transforms these codes back into the corresponding itemsets, which are then classified as frequent itemsets according to the support thresholds.
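The abstract does not spell out the coding rules, so the following is only a minimal Python sketch of the general idea under stated assumptions, not the paper's AC algorithm. It assumes that each transaction is coded as a bitmask (one bit per item), that "transaction reduction" means merging identical transactions into one weighted code, and that the And step is a bitwise AND between codes. All function names (`encode`, `decode`, `and_closure`, `frequent_itemsets`) are hypothetical.

```python
from collections import Counter


def encode(transactions):
    """Assumed coding rule: one bit per item, one integer code per transaction."""
    items = sorted({item for t in transactions for item in t})
    bit = {item: 1 << i for i, item in enumerate(items)}
    codes = [sum(bit[item] for item in set(t)) for t in transactions]
    return items, codes


def decode(code, items):
    """Turn an integer code back into the itemset it stands for."""
    return frozenset(item for i, item in enumerate(items) if (code >> i) & 1)


def and_closure(codes):
    """Collect every nonzero code reachable by AND-ing transaction codes."""
    found = {c for c in codes if c}
    frontier = set(found)
    while frontier:
        new = set()
        for a in frontier:
            for b in found:
                c = a & b
                if c and c not in found:
                    new.add(c)
        found |= new
        frontier = new
    return found


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for AND-derived itemsets meeting min_support."""
    items, codes = encode(transactions)
    # Assumed form of transaction reduction: identical transactions are merged.
    reduced = Counter(codes)
    result = {}
    for code in and_closure(reduced):
        # A transaction supports a code when it contains every bit of the code.
        support = sum(n for t, n in reduced.items() if t & code == code)
        if support >= min_support:
            result[decode(code, items)] = support
    return result


if __name__ == "__main__":
    demo = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
    for itemset, support in frequent_itemsets(demo, min_support=2).items():
        print(sorted(itemset), support)
```

Because the codes come only from AND-ing real transactions, no candidate itemsets are enumerated level by level as in Apriori. The sketch returns only the closed frequent itemsets (those equal to an intersection of transactions); every other frequent itemset is a subset of one of them.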
Source
《计算机工程与应用》
CSCD
Peking University Core Journal
2004, No. 15, pp. 182-185 (4 pages)
Computer Engineering and Applications