Abstract
Building on the principle of the Apriori algorithm, this paper proposes J_Apriori, an improved algorithm that combines jumping forward with back-filling. After the frequent K-itemsets are computed, the unpruned candidate 2K-itemsets are generated. When the condition of the jump-forward strategy is met, the frequent 2K-itemsets are found first. Every subset of a frequent 2K-itemset that contains K+1 to 2K-1 items can then be added to the frequent itemsets directly, without another scan of the large dataset. The algorithm then steps back to fill in the frequent itemsets that are not subsets of any frequent 2K-itemset. The improved algorithm reduces the number of dataset scans, and experiments show that it effectively improves the efficiency of the Apriori algorithm.
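The jump-and-back-fill idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the jump-forward condition, so this sketch jumps unconditionally, and the names `support`, `apriori_join`, and `jump_step` are hypothetical. It relies only on the downward-closure property that every subset of a frequent itemset is itself frequent.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def apriori_join(prev_level, size):
    """Classic Apriori candidates with `size` items: unions of two frequent
    (size-1)-itemsets whose (size-1)-subsets are all frequent."""
    candidates = set()
    for a, b in combinations(prev_level, 2):
        union = a | b
        if len(union) == size and all(
            frozenset(s) in prev_level for s in combinations(union, size - 1)
        ):
            candidates.add(union)
    return candidates


def jump_step(transactions, freq_k, k, min_count):
    """One jump from the frequent K-itemsets to the 2K-itemsets, followed by
    back-filling levels K+1 .. 2K-1. Assumption: the jump is always taken;
    the paper's actual jump condition is not given in the abstract."""
    # Jump forward: unpruned candidate 2K-itemsets are unions of two
    # disjoint frequent K-itemsets.
    cand_2k = {a | b for a, b in combinations(freq_k, 2) if len(a | b) == 2 * k}
    freq_2k = {c for c in cand_2k if support(c, transactions) >= min_count}

    levels = {2 * k: freq_2k}
    prev = freq_k
    for size in range(k + 1, 2 * k):
        # Subsets of frequent 2K-itemsets are frequent without any counting.
        known = {frozenset(s) for c in freq_2k for s in combinations(c, size)}
        # Back-fill: only candidates not already known need to be counted.
        remaining = apriori_join(prev, size) - known
        extra = {c for c in remaining if support(c, transactions) >= min_count}
        levels[size] = known | extra
        prev = levels[size]
    return levels
```

For simplicity, `support` rescans the transactions for each candidate; a practical implementation would count all candidates of a level in a single pass over the data.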
Source
《计算机应用与软件》
CSCD
2015, Issue 3, pp. 34-36, 92 (4 pages in total)
Computer Applications and Software