Abstract
With the advent of the big data era, Apriori, the classical algorithm for association rule mining, has attracted wide attention and study. Building on a summary of existing research, this paper proposes an improved Apriori algorithm based on linked lists. The algorithm first scans the transaction database to compute the frequent 1-itemsets and stores them in compressed form in linked lists, which avoids the extra overhead of repeatedly scanning the transaction database. Then, starting from the frequent N-itemsets (N≥1), it merges the linked lists using efficient bit operations to generate the frequent (N+1)-itemsets. Optimizing this generation step improves the efficiency of the Apriori algorithm.
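The abstract does not give the node layout or the exact merge rule, so the following is only a minimal sketch of the general idea under stated assumptions: each linked-list node is assumed to hold one itemset together with a bitmask of the transactions that contain it, and two nodes are assumed to be merged with a bitwise AND. The names `Node`, `build_level1`, `next_level`, and `frequent_itemsets` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Assumed node layout: an itemset plus a bitmask of the transactions containing it."""
    items: tuple
    tids: int                      # bit i is set when transaction i contains `items`
    next: Optional["Node"] = None


def support(tids):
    """Support count = number of set bits in the transaction bitmask."""
    return bin(tids).count("1")


def build_level1(transactions, min_support):
    """Single scan of the database: build the linked list of frequent 1-itemsets."""
    masks = {}
    for i, transaction in enumerate(transactions):
        for item in set(transaction):
            masks[item] = masks.get(item, 0) | (1 << i)
    head = None
    # Prepend in descending order so the finished list is sorted ascending.
    for item in sorted(masks, reverse=True):
        if support(masks[item]) >= min_support:
            head = Node((item,), masks[item], head)
    return head


def next_level(head, min_support):
    """Merge pairs of frequent N-itemset nodes into frequent (N+1)-itemset nodes."""
    new_head = tail = None
    a = head
    while a is not None:
        b = a.next
        while b is not None:
            # Classic Apriori join: both itemsets share their first N-1 items.
            if a.items[:-1] == b.items[:-1]:
                tids = a.tids & b.tids      # bitwise AND instead of a database rescan
                if support(tids) >= min_support:
                    node = Node(a.items + (b.items[-1],), tids)
                    if tail is None:
                        new_head = tail = node
                    else:
                        tail.next = node
                        tail = node
            b = b.next
        a = a.next
    return new_head


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    result = {}
    level = build_level1(transactions, min_support)
    while level is not None:
        node = level
        while node is not None:
            result[node.items] = support(node.tids)
            node = node.next
        level = next_level(level, min_support)
    return result
```

In this sketch the database is read only once, in `build_level1`; every later level is derived purely from the bitmasks already stored in the list, which is one plausible reading of how the repeated scans are avoided. For example, `frequent_itemsets([["a", "b"], ["a", "b"], ["a"]], 2)` yields supports of 3 for `("a",)`, 2 for `("b",)`, and 2 for `("a", "b")`.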
Author
GU Peng (顾鹏), School of Computer Engineering and Science, Nanjing University of Science and Technology, Nanjing 210094
Source
Computer & Digital Engineering (《计算机与数字工程》)
2020, No. 5, pp. 1024-1028, 1044 (6 pages in total)