
Data Stream Clustering Algorithm Based on Probability Density
Abstract: Data streams are unbounded in volume and arrive at high speed, so traditional clustering algorithms cannot be applied to them directly. To address this problem, a data stream clustering algorithm based on probability density is proposed. The method does not need to store the entire historical data: it stores only the newly arrived data, applies the EM algorithm to it, and uses a Gaussian mixture model to update the probability density function incrementally. Experimental results show that the algorithm is effective for data stream clustering.
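The abstract does not give the update equations, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes one-dimensional data, a fixed number of components, and a model that summarizes past data by running sufficient statistics (soft counts and weighted sums), so that each EM pass touches only the newly arrived chunk. The class name `IncrementalGMM1D`, its parameters, and the synthetic stream are all hypothetical.

```python
import numpy as np


class IncrementalGMM1D:
    """Hypothetical 1-D Gaussian mixture updated one chunk at a time."""

    def __init__(self, means, variances, weights):
        self.mu = np.asarray(means, dtype=float)
        self.var = np.asarray(variances, dtype=float)
        self.pi = np.asarray(weights, dtype=float)
        k = self.mu.size
        # Running sufficient statistics; they stand in for the discarded history.
        self.n = np.zeros(k)   # soft counts per component
        self.s1 = np.zeros(k)  # responsibility-weighted sum of x
        self.s2 = np.zeros(k)  # responsibility-weighted sum of x^2

    def _responsibilities(self, x):
        # E-step: posterior probability of each component for each point,
        # computed in log space for numerical stability.
        log_p = (np.log(self.pi)
                 - 0.5 * np.log(2 * np.pi * self.var)
                 - 0.5 * (x[:, None] - self.mu) ** 2 / self.var)
        log_p -= log_p.max(axis=1, keepdims=True)
        r = np.exp(log_p)
        return r / r.sum(axis=1, keepdims=True)

    def update(self, chunk, em_iters=10):
        """Run EM on the new chunk only, combined with the stored statistics."""
        x = np.asarray(chunk, dtype=float)
        n, s1, s2 = self.n, self.s1, self.s2
        for _ in range(em_iters):
            r = self._responsibilities(x)
            # Old statistics plus the contribution of the current chunk.
            n = np.maximum(self.n + r.sum(axis=0), 1e-12)
            s1 = self.s1 + r.T @ x
            s2 = self.s2 + r.T @ (x ** 2)
            # M-step on the combined statistics.
            self.pi = n / n.sum()
            self.mu = s1 / n
            self.var = np.maximum(s2 / n - self.mu ** 2, 1e-6)
        # Commit the statistics; the chunk itself can now be discarded.
        self.n, self.s1, self.s2 = n, s1, s2

    def density(self, x):
        """Evaluate the current mixture probability density at x."""
        x = np.asarray(x, dtype=float)
        comp = (np.exp(-0.5 * (x[:, None] - self.mu) ** 2 / self.var)
                / np.sqrt(2 * np.pi * self.var))
        return comp @ self.pi


# Synthetic stream (an assumption for illustration): two clusters near -5 and 5.
rng = np.random.default_rng(0)
model = IncrementalGMM1D(means=[-1.0, 1.0], variances=[1.0, 1.0],
                         weights=[0.5, 0.5])
for _ in range(5):
    chunk = np.concatenate([rng.normal(-5, 1, 100), rng.normal(5, 1, 100)])
    model.update(chunk)
print(np.sort(model.mu))
```

Memory use in this sketch depends on the number of components and the chunk size, not on the length of the stream. A fuller treatment would also need to handle changes in the number of clusters over time, which this sketch does not attempt.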
Authors: 张伟 (Zhang Wei), 陈春燕 (Chen Chunyan)
Source: Journal of Computer Applications (《计算机应用》); CSCD; Peking University Core Journal list; 2007, No. 4, pp. 881-883 (3 pages)
Keywords: data stream; clustering; Gaussian mixture model; probability density

