
Real-time data stream clustering based on damped window and pruning dimension tree

Cited by: 4
Abstract: This paper proposes PDStream, a real-time data stream clustering algorithm based on a damped window. The algorithm first partitions the data space into grids and uses an improved dimension tree structure to maintain and update summary information about the data stream. A periodic pruning strategy removes sparse grids from the dimension tree at regular intervals, and a depth-first search (DFS) handles clustering requests online. Experiments on synthetic and real datasets show that PDStream effectively discovers clusters of arbitrary shape in a data stream, consumes little memory, and achieves good accuracy.
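Authors: Zhang Xiaolong (张晓龙), Zeng Wei (曾伟)
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2009, No. 4, pp. 1331-1334, 1341 (5 pages)
Funding: National Natural Science Foundation of China (60674115)
Keywords: data stream; grid clustering; damped window; dimension tree; pruning strategy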
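
The pipeline described in the abstract (grid partitioning, damped-window summaries, periodic pruning of sparse grids, and DFS-based clustering on request) can be illustrated with a minimal Python sketch. This is not the authors' PDStream implementation: a plain dictionary stands in for the improved dimension tree, the exponential decay form is an assumption about how the damped window weights older points, and every name and default value below (`GridStreamSketch`, `decay`, `dense_threshold`, `sparse_threshold`, `prune_period`) is hypothetical.

```python
class GridStreamSketch:
    """Illustrative grid-based stream clustering.

    Assumption: a dict replaces the paper's dimension tree, and all
    parameter names and defaults are hypothetical.
    """

    def __init__(self, cell_size=1.0, decay=0.99, dense_threshold=2.0,
                 sparse_threshold=0.1, prune_period=100):
        self.cell_size = cell_size
        self.decay = decay                    # assumed per-tick decay factor in (0, 1)
        self.dense_threshold = dense_threshold
        self.sparse_threshold = sparse_threshold
        self.prune_period = prune_period
        self.cells = {}                       # grid cell -> (density, last update tick)
        self.tick = 0

    def _cell(self, point):
        # Map a point to the integer coordinates of its grid cell.
        return tuple(int(x // self.cell_size) for x in point)

    def _density(self, cell):
        # Decay the stored density lazily to the current tick.
        density, last = self.cells[cell]
        return density * self.decay ** (self.tick - last)

    def insert(self, point):
        # Each arriving point advances time by one tick.
        self.tick += 1
        cell = self._cell(point)
        previous = self._density(cell) if cell in self.cells else 0.0
        self.cells[cell] = (previous + 1.0, self.tick)
        if self.tick % self.prune_period == 0:
            self.prune()

    def prune(self):
        # Drop cells whose decayed density fell below the sparse threshold.
        sparse = [c for c in self.cells
                  if self._density(c) < self.sparse_threshold]
        for c in sparse:
            del self.cells[c]

    def _neighbors(self, cell):
        # Cells that differ by one step along exactly one dimension.
        for i in range(len(cell)):
            for delta in (-1, 1):
                yield cell[:i] + (cell[i] + delta,) + cell[i + 1:]

    def clusters(self):
        # Depth-first search over adjacent dense cells.
        dense = {c for c in self.cells
                 if self._density(c) >= self.dense_threshold}
        seen, result = set(), []
        for start in dense:
            if start in seen:
                continue
            seen.add(start)
            stack, component = [start], []
            while stack:
                cell = stack.pop()
                component.append(cell)
                for n in self._neighbors(cell):
                    if n in dense and n not in seen:
                        seen.add(n)
                        stack.append(n)
            result.append(sorted(component))
        return sorted(result)


if __name__ == "__main__":
    sketch = GridStreamSketch()
    for p in [(0.1, 0.1)] * 3 + [(1.2, 0.3)] * 3 + [(5.5, 5.5)] * 3:
        sketch.insert(p)
    print(sketch.clusters())
```

Storing the last update tick with each cell lets the density be decayed only when the cell is read or written, so an insert touches a single cell rather than the whole summary. Adjacent dense cells merge into one cluster, which is how a grid-based method can follow arbitrarily shaped regions.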

