Abstract
A data stream can be too large to store in full or to scan more than once. To address this, and building on a study of existing mining algorithms, this paper proposes DSMFP_LW, an algorithm for mining frequent patterns over a data stream in a landmark window. DSMFP_LW has three main features: it counts pattern information in a single scan of the stream, it stores the global critical frequent patterns compactly in an extended prefix-pattern tree, and it updates the data incrementally. Experimental results show that, in the same streaming environment, DSMFP_LW achieves better time and space efficiency than the well-known Lossy Counting algorithm.
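For context, the Lossy Counting baseline named in the abstract can be sketched roughly as follows. This is a minimal version of the general technique, simplified to single items rather than itemsets; it is not the paper's DSMFP_LW and does not use a prefix tree. The function names `lossy_counting` and `frequent_items` and the parameters `epsilon` and `support` are illustrative assumptions, not names taken from the paper.

```python
import math


def lossy_counting(stream, epsilon):
    """One-pass approximate counting with error bound epsilon (assumed name)."""
    width = math.ceil(1 / epsilon)   # bucket width
    counts = {}                      # item -> [observed count, max undercount]
    n = 0                            # number of items seen so far
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)            # id of the current bucket
        if item in counts:
            counts[item][0] += 1
        else:
            # The item may have been missed in up to (bucket - 1) earlier buckets.
            counts[item] = [1, bucket - 1]
        if n % width == 0:
            # Bucket boundary: drop entries that cannot exceed the error bound.
            stale = [k for k, (f, d) in counts.items() if f + d <= bucket]
            for key in stale:
                del counts[key]
    return counts, n


def frequent_items(counts, n, support, epsilon):
    """Report items whose stored count is at least (support - epsilon) * n."""
    return {k: f for k, (f, d) in counts.items() if f >= (support - epsilon) * n}
```

Calling `lossy_counting(stream, epsilon)` on any iterable returns the surviving counters and the stream length, which `frequent_items` then filters by the chosen support threshold. Each item is read exactly once, which is the single-scan property the abstract refers to.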
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
2012, No. 1, pp. 55-58, 61 (5 pages in total)
Funding
Natural Science Foundation of Hainan Province (Grant Nos. 610221, 109002, 808155)
Youth Research Fund of Hainan Normal University (Grant No. QN0923)