摘要
针对频繁模式增长算法无法适应数据流的无限性和流动性的特点,提出一种新颖的FP-tree的变形结构——FPS-tree,只需单遍扫描便能获取当前窗口的全部数据库信息。为了在滑动窗口时有效地删除过期窗格和插入新窗格,提出一个新颖的概念——"尾结点",FPS-tree中每条路径上的窗格信息只保持在尾结点里。实验结果表明FPS-tree的压缩性能要优于其他单遍扫描的前缀树结构。
Aiming at the problem that FP-growth algorithm can not adapt to the data stream with the characteristics of infinity and fluidity, this paper presents a novel variation structure of FP-tree called FPS-tree, which captures all database information in current window with one scan. For effectively deleting expired panes in sliding window, a novel concept-"tail-node" is proposed, so the information on pane in every path of FPS-tree is only retained in the tail-node. Experimental results show that compact performance of the FPS-tree is better than other prefix-tree structure with one scan.
出处
《计算机工程与应用》
CSCD
2013年第2期152-154,共3页
Computer Engineering and Applications
基金
国家科技支撑计划项目(No.2008BAB32B02)
湖南省科技计划项目(No.2010FJ3139)
湖南省教育厅科学研究项目(No.10C1311)
关键词
数据流
频繁模式增长算法
单遍扫描模式树
尾结点
data stream
Frequent Pattern(FP)-growth algorithm
single-pass pattern tree
tail-node