
A Study of a Clustering Algorithm for Evolving Dynamic Data Streams over Sliding Windows
Abstract: The sliding window is an approximation technique for data streams that focuses on recent data. This paper proposes SWStream, an optimized algorithm that processes stream data over a sliding window. In the online phase, a sliding-window tree stores the synopsis structure (the important statistical information about the stream), and the window size is adjusted dynamically. The optimized algorithm promptly discards expired tuples while continuously processing newly arriving tuples in real time, which yields more accurate analysis results. In the offline phase, macro-clustering is applied to the results of the online phase, using the mean values of the macro-clusters, to generate the final clustering. Compared with the clustering algorithm CluStream, this algorithm processes data more efficiently and uses comparatively less memory.
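Author: Xu Yingmei
Source: Journal of Shaanxi University of Technology (Natural Science Edition), 2014, No. 1, pp. 42-46 (5 pages)
Funding: Research Program of the Henan Provincial Department of Science and Technology (132300410395); Research Program of the Henan Provincial Department of Science and Technology (122300410395)
Keywords: data streams; sliding windows; clustering; data mining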
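
The two-phase scheme in the abstract (an online phase that summarizes in-window tuples and drops expired ones, then an offline macro-clustering phase) can be illustrated with a minimal Python sketch. This is not the paper's SWStream implementation: the sliding-window tree and the dynamic window sizing are not reproduced, a plain deque of raw tuples stands in for the synopsis structure, and weighted k-means is swapped in as one possible offline step. All names and parameters (`SlidingWindowSummary`, `window`, `radius`, `macro_cluster`, `k`) are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
import math


@dataclass
class MicroCluster:
    """Additive summary (count and linear sum) of the in-window tuples of one micro-cluster."""
    n: int = 0
    linear_sum: list = field(default_factory=list)

    def add(self, point):
        if not self.linear_sum:
            self.linear_sum = [0.0] * len(point)
        self.n += 1
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]

    def remove(self, point):
        self.n -= 1
        self.linear_sum = [s - x for s, x in zip(self.linear_sum, point)]

    def centroid(self):
        return [s / self.n for s in self.linear_sum]


class SlidingWindowSummary:
    """Online phase (assumed design): keep micro-clusters for the last `window` time units."""

    def __init__(self, window, radius):
        self.window = window    # hypothetical: fixed window length in time units
        self.radius = radius    # hypothetical: max distance for joining a micro-cluster
        self.clusters = {}      # cluster id -> MicroCluster
        self.tuples = deque()   # (timestamp, cluster id, point), oldest first
        self._next_id = 0

    def _expire(self, now):
        # Subtract every tuple that has slid out of the window from its micro-cluster.
        while self.tuples and self.tuples[0][0] <= now - self.window:
            _, cid, point = self.tuples.popleft()
            mc = self.clusters[cid]
            mc.remove(point)
            if mc.n == 0:
                del self.clusters[cid]

    def insert(self, timestamp, point):
        self._expire(timestamp)
        best_id, best_dist = None, math.inf
        for cid, mc in self.clusters.items():
            d = math.dist(mc.centroid(), point)
            if d < best_dist:
                best_id, best_dist = cid, d
        if best_id is None or best_dist > self.radius:
            # No micro-cluster is close enough: open a new one.
            best_id = self._next_id
            self._next_id += 1
            self.clusters[best_id] = MicroCluster()
        self.clusters[best_id].add(point)
        self.tuples.append((timestamp, best_id, point))

    def snapshot(self):
        return [(mc.centroid(), mc.n) for mc in self.clusters.values()]


def macro_cluster(snapshot, k, iterations=10):
    """Offline phase (assumed): weighted k-means over micro-cluster centroids."""
    centers = [c for c, _ in snapshot[:k]]  # deterministic seeding with the first k centroids
    for _ in range(iterations):
        sums = [[0.0] * len(centers[0]) for _ in centers]
        weights = [0] * len(centers)
        for centroid, n in snapshot:
            j = min(range(len(centers)), key=lambda i: math.dist(centers[i], centroid))
            weights[j] += n
            sums[j] = [s + n * x for s, x in zip(sums[j], centroid)]
        centers = [[s / w for s in sm] if w else c
                   for sm, w, c in zip(sums, weights, centers)]
    return centers
```

Calling `insert(timestamp, point)` for each arriving tuple keeps the summary restricted to the current window, and `macro_cluster(summary.snapshot(), k)` can be run on demand. Storing raw tuples makes expiry exact but costs memory proportional to the window size; a real synopsis structure would avoid that cost.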
References (10 of 12 shown)

  • 1. Jin Cheqing, Qian Weining, Zhou Aoying. A survey of stream data analysis and management [J]. Journal of Software, 2004, 15(8): 1172-1181.
  • 2. Zhu Weiheng, Yin Jian, Xie Yihuang. An arbitrary-shape clustering algorithm for data streams [J]. Journal of Software, 2006, 17(3): 379-387.
  • 3. Wang Weiping, Li Jianzhong, Zhang Dongdong, Guo Longjiang. An efficient algorithm for mining approximate frequent items over data streams [J]. Journal of Software, 2007, 18(4): 884-892.
  • 4. Chang Jianlong, Cao Feng, Zhou Aoying. Clustering evolving data streams over sliding windows [J]. Journal of Software, 2007, 18(4): 905-918.
  • 5. Carney D, Cetintemel U, Cherniack M, et al. Monitoring streams: a new class of data management applications [C]. International Conference on Very Large Data Bases (VLDB), Hong Kong, China, 2002: 215-226.
  • 6. O'Callaghan L, Mishra N, Meyerson A, et al. Streaming-data algorithms for high quality clustering [C]. Proceedings of IEEE International Conference on Data Engineering, March 2002.
  • 7. Aggarwal C, Han J, Wang J, et al. A framework for clustering evolving data streams [C]. Proceedings of the 29th VLDB Conference, Berlin, Germany, 2003.
  • 8. Aggarwal C, Han J, Wang J, et al. A framework for projected clustering of high dimensional data streams [C]. Proceedings of the 30th VLDB Conference, Toronto, Canada, 2004.
  • 9. Chen Y, Tu L. Density-based clustering for real-time stream data [C]. KDD'07, August 12-15, 2007, San Jose, California, USA: 133-142.
  • 10. Cao F, Ester M, Qian W, et al. Density-based clustering over an evolving data stream with noise [C]. Proceedings of the SIAM Conference on Data Mining, 2006.
