Abstract
To address the bottlenecks that limit the efficiency of the Apriori algorithm, an improved strategy is proposed that combines transaction reduction, based on the flag bits of a two-dimensional array, with item reduction, based on the ordering of itemsets. The algorithm reduces both the number of join operations and the number of database scans, which shortens the time spent scanning the database. It uses the ordering of itemsets to improve the test that decides whether two itemsets should be joined, and it uses changes in the flag bits to eliminate useless transactions step by step. Together these achieve transaction reduction and item reduction while also reducing the time spent on the join test. Experimental results show that the optimized Apriori algorithm runs somewhat more efficiently.
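The abstract does not give the details of the method, so the following is only a minimal sketch of the two ideas it names, not the authors' implementation. It assumes the two-dimensional array is a 0/1 matrix with one row per transaction and one column per item, that the flag is one boolean per transaction, and that "ordering" means itemsets are kept as sorted tuples so the join loop can stop as soon as the shared prefix changes. The names `apriori_flagged`, `min_count`, and `active` are hypothetical.

```python
from itertools import combinations


def apriori_flagged(transactions, min_count):
    """Return {sorted item tuple: support count} for every frequent itemset."""
    items = sorted({i for t in transactions for i in t})
    col = {item: j for j, item in enumerate(items)}

    # One pass over the data: build the 0/1 matrix (rows = transactions).
    matrix = [[0] * len(items) for _ in transactions]
    for r, t in enumerate(transactions):
        for i in t:
            matrix[r][col[i]] = 1

    # Assumed flag layout: one boolean per transaction, True = still useful.
    active = [True] * len(transactions)

    frequent = {}
    level = []
    for item in items:
        count = sum(row[col[item]] for row in matrix)
        if count >= min_count:
            level.append((item,))
            frequent[(item,)] = count

    k = 1
    while level:
        level.sort()
        prev = set(level)
        candidates = []
        for idx, a in enumerate(level):
            for b in level[idx + 1:]:
                # Sorted order keeps equal prefixes adjacent, so the first
                # mismatch means no later b can be joined with a.
                if a[:-1] != b[:-1]:
                    break
                cand = a + (b[-1],)
                # Standard Apriori pruning: every k-subset must be frequent.
                if all(s in prev for s in combinations(cand, k)):
                    candidates.append(cand)
        k += 1

        counts = dict.fromkeys(candidates, 0)
        for r, row in enumerate(matrix):
            if not active[r]:
                continue  # transaction was flagged as useless earlier
            hit = False
            for cand in candidates:
                if all(row[col[i]] for i in cand):
                    counts[cand] += 1
                    hit = True
            if not hit:
                # A row containing no candidate k-itemset cannot contain
                # any frequent itemset of size k or larger.
                active[r] = False

        level = [c for c in candidates if counts[c] >= min_count]
        for c in level:
            frequent[c] = counts[c]
    return frequent
```

For example, with the five transactions `{a, b, c}`, `{a, b}`, `{a, c}`, `{b, c}`, `{a, b, c}` and `min_count=3`, the sketch reports the three single items and the three pairs as frequent and rejects `{a, b, c}`, whose support count is 2. All items must be mutually comparable (for instance, all strings) for the sorting to work.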
Source
《山东大学学报(理学版)》
CAS
CSCD
PKU Core Journals
2008, No. 11, pp. 67-71 (5 pages)
Journal of Shandong University(Natural Science)
Keywords
association rules
Apriori algorithm
two-dimensional array
transaction reduction
ordered itemsets
item reduction