Abstract
Traditional algorithms for mining frequent itemsets in data streams suffer from problems in support updating, window updating, and frequent k-itemset mining, which lead to poor space and time efficiency. To address these problems, this paper proposes AO, an improved algorithm for efficiently mining frequent itemsets in data streams. The algorithm adopts a sliding window and mines the data stream block by block. When new data arrives at a full window, the window is updated by modulo (remainder-based) insertion. The support of frequent k-itemsets is computed with the And Operation, and superset checking is applied during mining, which greatly improves mining efficiency. Experimental results show that the algorithm has advantages in both time and space efficiency.
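The window update and support computation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the data layout, so the sketch assumes one integer bitmap per item, with bit i set when window slot i contains that item. The class name `BitmapWindow` and its methods are hypothetical. Superset checking is omitted because the abstract does not say how it is applied. A plain level-wise candidate extension stands in for the paper's mining step.

```python
class BitmapWindow:
    """Sliding window of transactions with one bitmap per item (assumed layout)."""

    def __init__(self, size):
        self.size = size      # window capacity, in transactions
        self.seen = 0         # number of transactions received so far
        self.bitmaps = {}     # item -> int; bit i is set if slot i holds the item

    def insert(self, transaction):
        # Modulo insertion: once the window is full, the newest transaction
        # overwrites the slot that holds the oldest one.
        slot = self.seen % self.size
        mask = 1 << slot
        for item in list(self.bitmaps):
            self.bitmaps[item] &= ~mask        # clear the evicted transaction
            if self.bitmaps[item] == 0:
                del self.bitmaps[item]         # item no longer occurs in the window
        for item in transaction:
            self.bitmaps[item] = self.bitmaps.get(item, 0) | mask
        self.seen += 1

    def support(self, itemset):
        # AND the bitmaps of all items; each remaining set bit is a slot
        # whose transaction contains the whole itemset.
        bits = (1 << self.size) - 1
        for item in itemset:
            bits &= self.bitmaps.get(item, 0)
        return bin(bits).count("1")

    def frequent_itemsets(self, min_support):
        # Level-wise search: extend each frequent k-itemset with a larger
        # frequent item and keep the candidates that meet min_support.
        items = sorted(i for i in self.bitmaps if self.support((i,)) >= min_support)
        result = {}
        level = [(i,) for i in items]
        while level:
            next_level = []
            for itemset in level:
                result[itemset] = self.support(itemset)
                for item in items:
                    if item > itemset[-1]:
                        candidate = itemset + (item,)
                        if self.support(candidate) >= min_support:
                            next_level.append(candidate)
            level = next_level
        return result


window = BitmapWindow(4)
for t in [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]:
    window.insert(t)
window.insert({"a", "b", "d"})   # window is full, so slot 0 is overwritten
print(window.frequent_itemsets(2))
```

Under this layout, a window update touches only one bit position per item, and the support of any candidate is a chain of integer AND operations followed by a bit count, so the stored transactions never need to be rescanned.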
Authors
文凯
耿小海
朱璐伟
许萌萌
WEN Kai; GENG Xiao-hai; ZHU Lu-wei; XU Meng-meng (School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065; Research Center of New Telecommunication Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065; Chongqing Information Technology Designing Co., Ltd., Chongqing 401121, China)
Source
《计算机工程与科学》
CSCD
Peking University Core Journal
2020, Issue 12, pp. 2259-2264 (6 pages)
Computer Engineering & Science
Keywords
data stream
superset checking
frequent itemsets
And Operation