
Density Grid-Based Data Stream Clustering Algorithm with Parameter Automatization (密度网格参数自适应的数据流聚类算法) Cited by: 2
Abstract: Traditional density grid-based data stream clustering algorithms cannot automatically obtain an accurate density threshold. To address this problem, the paper proposes A-Stream, a density grid-based data stream clustering algorithm with adaptive parameters. A-Stream introduces a "double density threshold" and takes the average as the density threshold, which improves on the traditional clustering algorithm and resolves its inability to obtain an accurate value. Experimental results show that A-Stream retains the efficiency of traditional density grid-based algorithms while considerably improving clustering accuracy.
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, 2011, No. 10, pp. 953-958 (6 pages)
Keywords: clustering; data stream; grid; parameter adaptation; density threshold
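
The abstract gives only the outline of the method, so the following is a minimal illustrative sketch and not the paper's A-Stream implementation. It assumes 2-D points, a fixed cell size, and two candidate thresholds `low` and `high` supplied by the caller; how A-Stream actually derives its two thresholds is not stated here, and stream-specific parts such as density decay are omitted. The function name `grid_cluster` and all parameters are hypothetical.

```python
from collections import Counter, deque


def grid_cluster(points, cell_size, low, high):
    """Cluster 2-D points on a uniform grid.

    Assumption: a cell counts as dense when its point count reaches the
    average of two candidate thresholds (low, high). The abstract does not
    say how the two thresholds are obtained, so they are plain inputs here.
    """
    threshold = (low + high) / 2.0

    # Map each point to its grid cell and count points per cell.
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )
    dense = {cell for cell, n in counts.items() if n >= threshold}

    # Merge edge-adjacent dense cells into clusters with a breadth-first search.
    labels = {}
    next_label = 0
    for cell in sorted(dense):
        if cell in labels:
            continue
        labels[cell] = next_label
        queue = deque([cell])
        while queue:
            cx, cy = queue.popleft()
            for nb in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if nb in dense and nb not in labels:
                    labels[nb] = next_label
                    queue.append(nb)
        next_label += 1
    return threshold, labels
```

Calling `grid_cluster(points, 1.0, 2, 4)` uses a density threshold of 3.0; cells with fewer than three points are treated as sparse and left unlabelled.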