High-dimensional Data Stream Clustering Algorithm Based on a Synopsis Data Structure
Abstract: To form clusters effectively in high-dimensional data streams, this paper addresses the problems of the classical CELL-Tree algorithm by proposing a new synopsis data structure, PL-Tree, and an algorithm based on it, PLStream. A damped window model is adopted to adapt to changes in the data stream, and a pruning strategy controls the size of the clustering model held in memory. Experiments show that PLStream adapts well to high-dimensional data streams and achieves better time and space efficiency than CELL-Tree.
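Authors: 王冬秀 (Wang Dongxiu), 李辉 (Li Hui)
Source: Journal of Guangxi University of Technology (《广西工学院学报》), CAS, 2011, No. 4, pp. 59-64 (6 pages)
Funding: Supported by a Guangxi University of Technology fund project (院科自0977101)
Keywords: synopsis data structure; high-dimensional data streams; clustering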
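
The abstract names two mechanisms, a damped window and pruning, without giving details. The sketch below only illustrates those two ideas in general form: each grid cell keeps an exponentially decayed weight, and cells whose weight falls below a threshold are dropped to bound memory. It is not the PL-Tree structure or the PLStream algorithm from the paper; the class name `DecayedGrid`, the parameters `cell_width`, `decay_rate`, and `prune_threshold`, and the decay form `2 ** (-decay_rate * dt)` are all assumptions made for illustration.

```python
import math


class DecayedGrid:
    """Illustrative grid of cells with decayed weights (damped window) and pruning.

    All names and parameters here are hypothetical, not taken from the paper.
    """

    def __init__(self, cell_width, decay_rate, prune_threshold):
        self.cell_width = cell_width            # assumed side length of a grid cell
        self.decay_rate = decay_rate            # assumed lambda in 2 ** (-lambda * dt)
        self.prune_threshold = prune_threshold  # assumed minimum weight to keep a cell
        self.cells = {}                         # cell key -> (weight, last update time)

    def _key(self, point):
        # Map a point to the integer coordinates of the cell that contains it.
        return tuple(math.floor(x / self.cell_width) for x in point)

    def _decayed(self, weight, last_time, now):
        # Damped window: older contributions fade exponentially with elapsed time.
        return weight * 2.0 ** (-self.decay_rate * (now - last_time))

    def insert(self, point, now):
        # Decay the cell's stored weight to the current time, then add the new point.
        key = self._key(point)
        weight, last = self.cells.get(key, (0.0, now))
        self.cells[key] = (self._decayed(weight, last, now) + 1.0, now)

    def prune(self, now):
        # Drop cells whose decayed weight is below the threshold; return how many.
        stale = [k for k, (w, t) in self.cells.items()
                 if self._decayed(w, t, now) < self.prune_threshold]
        for k in stale:
            del self.cells[k]
        return len(stale)
```

Calling `insert` for each arriving point and `prune` periodically keeps only cells that have received enough recent points. A tree-based synopsis would organize such cells hierarchically, which this flat dictionary does not attempt.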