Abstract
In data stream mining, the landmark window accounts for the influence of historical patterns on the current mining result, but it ignores the decay of patterns over time. The sliding window records the latest and most useful patterns, but its optimal size cannot be determined accurately. For data with data-stream characteristics in some simulation systems, this paper proposes T-Moment, a method for mining closed frequent itemsets in a mixed window. The method records pattern information completely with a single scan of the data stream. Its pruning method reduces the space complexity of the sliding window tree (F-tree) and the maintenance cost of the closed frequent pattern tree (T-tree). In addition, its time decay mechanism distinguishes historical patterns from the latest ones. Extensive simulation experiments show that T-Moment achieves good efficiency and accuracy.
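To make the two central notions of the abstract concrete (time-decayed support and closed frequent itemsets), the following is a minimal illustrative sketch. It is not the T-Moment algorithm: it uses brute-force enumeration instead of the F-tree and T-tree, and the exponential decay form, the function names `decayed_support` and `closed_frequent_itemsets`, and the parameters `decay` and `min_support` are assumptions introduced here for illustration only.

```python
from itertools import combinations


def decayed_support(itemset, transactions, now, decay=0.9):
    """Time-decayed support: each transaction (t, items) containing
    `itemset` contributes decay ** (now - t), so older ones count less.
    The exponential form is an assumption, not taken from the paper."""
    target = frozenset(itemset)
    return sum(decay ** (now - t)
               for t, items in transactions
               if target <= frozenset(items))


def closed_frequent_itemsets(transactions, now, min_support, decay=0.9):
    """Brute-force enumeration, suitable only for tiny examples.
    An itemset is kept if its decayed support reaches min_support and
    no proper superset has the same decayed support (closedness)."""
    universe = sorted({i for _, items in transactions for i in items})
    supports = {}
    for size in range(1, len(universe) + 1):
        for combo in combinations(universe, size):
            s = decayed_support(combo, transactions, now, decay)
            if s >= min_support:
                supports[frozenset(combo)] = s
    closed = {}
    for itemset, s in supports.items():
        # `itemset < other` tests for a proper superset.
        absorbed = any(itemset < other and abs(supports[other] - s) < 1e-12
                       for other in supports)
        if not absorbed:
            closed[itemset] = s
    return closed


# Hypothetical stream of (timestamp, items) pairs.
stream = [(1, {"a", "b"}), (2, {"a", "b", "c"}), (3, {"a", "c"})]
result = closed_frequent_itemsets(stream, now=3, min_support=0.7, decay=0.5)
```

With `decay=0.5`, the transaction at time 1 contributes only a quarter of the weight of the one at time 3, which is how a decay factor lets recent patterns outweigh historical ones without discarding the history entirely.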
Source
Journal of System Simulation (《系统仿真学报》)
CAS
CSCD
PKU Core
2010, No. 9, pp. 2110-2114, 2119 (6 pages)
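Funding
National High Technology Research and Development Program of China (863 Program), Grant 2007AA04Z116
National Natural Science Foundation of China, Grant 70871033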
Keywords
simulation data
closed frequent pattern
mixed window
time decaying