Distributed real-time data flow density clustering algorithm based on Storm

Cited by: 3
Abstract: Building on the classic stream clustering framework CluStream and the density-based clustering algorithm DBSCAN, this paper proposes DBS-Stream, a distributed real-time data stream density clustering algorithm, and designs an implementation of it on the Storm stream processing platform. Each local node uses CluStream's classic two-phase framework, with DBSCAN replacing K-means to initialize the data in the online micro-clustering phase; the central node then runs DBSCAN again to perform global clustering. The algorithm can handle clusters of arbitrary shape and lets local nodes update their data quickly. In experiments comparing DBS-Stream with CluStream, DBS-Stream outperforms CluStream in both clustering quality and communication cost.
Authors: NIU Liyuan; ZHANG Guiyun (College of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China)
Source: Journal of Tianjin Normal University: Natural Science Edition, 2018, No. 3, pp. 72-76 (5 pages). Indexed in CAS and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (61572358); Natural Science Foundation of Tianjin (16JCYBJC23600)
Keywords: CluStream; data flow; DBSCAN; Storm
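
The two-level scheme described in the abstract can be illustrated with a small single-process sketch. This is not the authors' implementation and does not use Storm: `MicroCluster`, `LocalNode`, `global_clustering`, the `radius` threshold, and the rule of one micro-cluster per initial DBSCAN cluster are all hypothetical simplifications introduced here. The micro-cluster summary keeps only a count and a linear sum, whereas CluStream's cluster feature vectors also track squared sums and timestamps.

```python
import math


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point (-1 means noise)."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1              # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # noise becomes a border point
            if labels[j] is not None:
                continue                # already assigned, do not expand
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:      # core point: keep expanding
                queue.extend(nb)
    return labels


class MicroCluster:
    """Simplified summary: point count and per-dimension linear sum."""

    def __init__(self, point):
        self.n = 1
        self.ls = list(point)

    def add(self, point):
        self.n += 1
        self.ls = [a + b for a, b in zip(self.ls, point)]

    @property
    def center(self):
        return tuple(s / self.n for s in self.ls)


class LocalNode:
    """Hypothetical local node: DBSCAN initialization, then online updates."""

    def __init__(self, init_points, eps, min_pts, radius):
        self.radius = radius            # assumed absorption threshold
        by_label = {}
        for p, lab in zip(init_points, dbscan(init_points, eps, min_pts)):
            if lab == -1:
                continue                # noise is dropped in this sketch
            if lab in by_label:
                by_label[lab].add(p)
            else:
                by_label[lab] = MicroCluster(p)
        self.micro = list(by_label.values())

    def update(self, point):
        """Absorb the point into the nearest micro-cluster or start a new one."""
        best = min(self.micro, default=None,
                   key=lambda m: math.dist(m.center, point))
        if best is not None and math.dist(best.center, point) <= self.radius:
            best.add(point)
        else:
            self.micro.append(MicroCluster(point))

    def summary(self):
        """Only the centers are sent to the central node."""
        return [m.center for m in self.micro]


def global_clustering(nodes, eps, min_pts):
    """Central node: run DBSCAN over the centers reported by all local nodes."""
    centers = [c for node in nodes for c in node.summary()]
    return centers, dbscan(centers, eps, min_pts)


node_a = LocalNode([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)],
                   eps=1.5, min_pts=2, radius=2.0)
node_b = LocalNode([(1, 1), (1, 2), (2, 1)], eps=1.5, min_pts=2, radius=2.0)
node_b.update((2, 2))                   # absorbed by the existing micro-cluster
node_b.update((30, 30))                 # too far away: new micro-cluster
centers, labels = global_clustering([node_a, node_b], eps=3.0, min_pts=1)
```

In this sketch each local node forwards only its micro-cluster centers rather than raw points, which is one plausible reading of how a scheme like this keeps communication cost low. In a Storm deployment the local nodes and the central node would presumably map to separate processing components, but that mapping is not shown here.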
