Research on a Sliding-Window-Based Clustering Algorithm for Dynamic Data Streams

The Clustering Algorithm for Evolving Dynamic Data Stream over Sliding Windows
Abstract: Data stream clustering is an important branch of current data stream research, and the sliding window is an approximation method that focuses on the most recent data in a stream. This paper proposes SWStream, an optimized algorithm that processes data over sliding windows. The algorithm adopts a two-layer architecture. In the online component, a sliding window tree stores the synopsis structure (the important statistical information of the data stream), and the window size is adjusted dynamically. In the offline component, the mean values of the micro-clusters produced online are macro-clustered to obtain the final clustering result. Experiments show that the algorithm achieves higher processing efficiency and uses comparatively less memory.
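The two-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not the SWStream implementation: the abstract does not specify the sliding window tree or the rule for adjusting the window size, so this sketch assumes a flat list of micro-clusters, a fixed window length, and a hypothetical absorption radius. The names `MicroCluster`, `SlidingWindowSketch`, `macro_cluster`, `window`, and `radius` are all illustrative, and weighted k-means is assumed as the macro-clustering step.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class MicroCluster:
    """Hypothetical synopsis: point count, per-dimension sum, last update time."""
    n: int
    linear_sum: list
    last_seen: int

    def mean(self):
        return [s / self.n for s in self.linear_sum]


class SlidingWindowSketch:
    """Online phase (assumed simplification): flat list, fixed window, no tree."""

    def __init__(self, window, radius):
        self.window = window    # how many recent time steps stay relevant
        self.radius = radius    # assumed threshold for absorbing a point
        self.clusters = []

    def insert(self, point, t):
        # Expire micro-clusters that were not updated inside the window.
        self.clusters = [c for c in self.clusters if t - c.last_seen < self.window]
        # Absorb the point into the nearest micro-cluster if it is close enough.
        best = min(self.clusters, key=lambda c: math.dist(c.mean(), point), default=None)
        if best is not None and math.dist(best.mean(), point) <= self.radius:
            best.n += 1
            best.linear_sum = [s + x for s, x in zip(best.linear_sum, point)]
            best.last_seen = t
        else:
            self.clusters.append(MicroCluster(1, list(point), t))


def macro_cluster(micro, k, iters=20, seed=0):
    """Offline phase: weighted k-means over the micro-cluster means."""
    means = [c.mean() for c in micro]
    weights = [c.n for c in micro]
    k = min(k, len(means))
    centers = random.Random(seed).sample(means, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for m, w in zip(means, weights):
            j = min(range(k), key=lambda i: math.dist(centers[i], m))
            groups[j].append((m, w))
        for j, g in enumerate(groups):
            if g:  # keep the old center when a group is empty
                total = sum(w for _, w in g)
                dims = len(g[0][0])
                centers[j] = [sum(m[d] * w for m, w in g) / total for d in range(dims)]
    return centers
```

In use, each arriving point would be passed to `insert(point, t)` with its time step, and `macro_cluster(sketch.clusters, k)` would be called whenever a clustering of the current window is requested. Expiring a whole micro-cluster by its last update time is a coarse stand-in for true sliding-window semantics, since older points absorbed earlier still contribute to its mean.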
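Author: 许颖梅 (Xu Yingmei)
Source: Henan Science (《河南科学》), 2014, No. 5, pp. 777-780 (4 pages)
Funding: Research Program of the Science and Technology Department of Henan Province (132300410395, 122300410395)
Keywords: data streams; sliding windows; clustering; data mining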