Research on Parallel DBSCAN Clustering Algorithm in Cloud Environment (云环境下并行DBSCAN聚类算法研究)
Abstract: DBSCAN is a fast density-based clustering algorithm. Although it can identify noise points when processing large-scale data, its clustering efficiency is low, its input/output overhead is high, and the accuracy of its clustering results is low. Working on the Hadoop cloud computing platform, this paper brings the high parallelism of the MapReduce programming model to the algorithm and designs a parallel DBSCAN algorithm that improves on the execution efficiency of traditional DBSCAN. Comparative experiments demonstrate the accuracy and timeliness of the proposed algorithm's clustering.
Authors: Deng Qing (邓青), Yang Ning (杨宁)
Source: Shanxi Electronic Technology (《山西电子技术》), 2017, No. 6, pp. 87-90 (4 pages)
Keywords: clustering analysis; cloud computing; DBSCAN; HDFS; MapReduce
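
The abstract describes combining DBSCAN with the map and reduce phases of MapReduce, but this record does not give the paper's actual design. The following is only a minimal single-process sketch of one plausible split, under stated assumptions: a map step computes each point's eps-neighborhood independently (the part that parallelizes trivially), and a reduce step grows clusters from core points and leaves unreachable points labeled as noise. All function names, the parameters `eps`, `min_pts`, and `n_chunks`, and the chunking scheme are hypothetical and are not taken from the paper; no Hadoop API is used.

```python
from collections import deque
from math import dist


def map_neighbors(chunk, points, eps):
    """Map step (hypothetical): for each index in the chunk, emit
    (index, list of indices within distance eps, including itself)."""
    return [
        (i, [j for j, q in enumerate(points) if dist(points[i], q) <= eps])
        for i in chunk
    ]


def reduce_clusters(neighbor_pairs, min_pts):
    """Reduce step (hypothetical): grow clusters outward from core points.
    Points never reached from a core point keep the noise label -1."""
    neighbors = dict(neighbor_pairs)
    core = {i for i, nb in neighbors.items() if len(nb) >= min_pts}
    labels = {i: -1 for i in neighbors}
    cluster_id = 0
    for seed in sorted(core):
        if labels[seed] != -1:
            continue  # already absorbed into an earlier cluster
        labels[seed] = cluster_id
        queue = deque([seed])
        while queue:
            p = queue.popleft()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster_id
                    if q in core:
                        queue.append(q)  # only core points expand further
        cluster_id += 1
    return labels


def parallel_dbscan_sketch(points, eps, min_pts, n_chunks=2):
    """Split the point indices into chunks, run the map step per chunk,
    then run a single reduce step over all emitted pairs."""
    indices = list(range(len(points)))
    chunks = [indices[k::n_chunks] for k in range(n_chunks)]
    pairs = []
    for chunk in chunks:  # each iteration stands in for one map task
        pairs.extend(map_neighbors(chunk, points, eps))
    return reduce_clusters(pairs, min_pts)


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = parallel_dbscan_sketch(points, eps=1.5, min_pts=3)
```

In this sketch every map task still reads the full point set, so it illustrates the division of work rather than the I/O savings the abstract reports; a real Hadoop job would also need a data partitioning strategy, which this record does not describe.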