Abstract
This paper proposes NSWGA (nested sliding window genetic algorithm), an approach to fast mining of frequent itemsets over data streams that combines nested sliding windows with a genetic algorithm. NSWGA exploits the parallelism of the genetic algorithm to search for the frequent itemsets of the latest data in each nested sub-window. These results are merged into a set of candidate frequent itemsets for the sliding window, and the window is then scanned to obtain its recent frequent itemsets. NSWGA captures the latest frequent itemsets of the data stream promptly and accurately, and it periodically deletes expired stream data. By using nested windows together with the parallel processing of the genetic algorithm, the method also reduces the time complexity of the computation.
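The pipeline described in the abstract (mine each sub-window, merge the candidates, scan the full window, drop expired data) can be illustrated with a minimal sketch. This is not the authors' implementation: the class `NestedWindow`, the function `frequent_in`, and all parameters are hypothetical names chosen for illustration, and the genetic-algorithm search is replaced by a plain exhaustive enumeration of itemsets up to `max_size` items so that the example stays deterministic. The sketch also assumes equal-sized sub-windows and a relative support threshold.

```python
from collections import Counter, deque
from itertools import combinations


def frequent_in(transactions, min_support, max_size=2):
    """Exhaustive stand-in for the GA search (assumption, not the paper's method):
    return itemsets of up to max_size items whose relative support
    in `transactions` is at least min_support."""
    counts = Counter()
    for t in transactions:
        items = sorted(t)
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    need = min_support * len(transactions)
    return {s for s, c in counts.items() if c >= need}


class NestedWindow:
    """Sliding window made of `num_sub` sub-windows of `sub_size` transactions."""

    def __init__(self, num_sub, sub_size, min_support):
        # deque(maxlen=...) discards the oldest sub-window automatically,
        # which models the periodic deletion of expired stream data.
        self.subs = deque(maxlen=num_sub)
        self.sub_size = sub_size
        self.min_support = min_support
        self.buffer = []

    def add(self, transaction):
        self.buffer.append(frozenset(transaction))
        if len(self.buffer) == self.sub_size:
            # Mine the newest sub-window once, when it fills up.
            candidates = frequent_in(self.buffer, self.min_support)
            self.subs.append((self.buffer, candidates))
            self.buffer = []

    def frequent_itemsets(self):
        # Merge the per-sub-window results into one candidate set ...
        window = [t for txs, _ in self.subs for t in txs]
        candidates = set().union(*(c for _, c in self.subs))
        # ... then scan the whole window to keep the truly frequent ones.
        need = self.min_support * len(window)
        return {c for c in candidates if sum(c <= t for t in window) >= need}


w = NestedWindow(num_sub=2, sub_size=2, min_support=0.5)
for t in [["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"], ["c", "d"], ["c", "d"]]:
    w.add(t)
result = w.frequent_itemsets()
```

With equal-sized sub-windows, any itemset that meets the relative threshold over the whole window must meet it in at least one sub-window, so the merge step does not lose frequent itemsets of up to `max_size` items. A genetic search, as in the paper, would trade that guarantee for speed on large item universes.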
Source
Computer Engineering and Design (《计算机工程与设计》)
CSCD
Peking University Core Journal
2011, No. 4, pp. 1307-1310, 1346 (5 pages)
Funding
National Natural Science Foundation of China (60703101)
Keywords
data stream
frequent itemsets
genetic algorithm
nested sliding window
parallel computation