
Parallelization Study of Improved K-means Algorithm on MapReduce Programming Model (original title: 改进K-means算法的MapReduce并行化研究)

Cited by: 7
Abstract When K-means processes massive amounts of data, the uncertain selection of the initial cluster centers leads to slow convergence. To address this problem, this paper proposes an improved K-means algorithm. First, the idea of fuzzy clustering is used to produce a fuzzy (coarse) classification of the data set. Second, the data set is reclassified using dynamically computed cluster centers. Finally, the algorithm is implemented on the MapReduce programming model. The experimental results show that the improved algorithm not only achieves a higher speedup but also converges faster.
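The abstract does not give the details of the fuzzy pre-classification or of the dynamic center computation, so the following is only a minimal sketch of the baseline that the paper builds on: one standard K-means iteration expressed as a map step (assign each point to its nearest center) and a reduce step (average the points of each cluster). It runs in a single process to illustrate the data flow and is not the authors' implementation. All function names (`map_assign`, `combine`, `reduce_center`, `kmeans_iteration`, `kmeans`) are hypothetical and chosen for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def map_assign(point, centers):
    """Map step: emit (index of nearest center, (point, 1))."""
    nearest = min(range(len(centers)), key=lambda i: dist(point, centers[i]))
    return nearest, (point, 1)


def combine(a, b):
    """Merge two partial (coordinate sum, count) records for one cluster."""
    (sum_a, n_a), (sum_b, n_b) = a, b
    return tuple(x + y for x, y in zip(sum_a, sum_b)), n_a + n_b


def reduce_center(partial):
    """Reduce step: turn a (coordinate sum, count) record into a mean."""
    total, count = partial
    return tuple(x / count for x in total)


def kmeans_iteration(points, centers):
    """Run one map/reduce round and return the updated centers."""
    partials = {}
    for p in points:
        key, value = map_assign(p, centers)
        partials[key] = combine(partials[key], value) if key in partials else value
    # Assumption: a center that received no points keeps its old position.
    return [
        reduce_center(partials[i]) if i in partials else centers[i]
        for i in range(len(centers))
    ]


def kmeans(points, centers, tol=1e-9, max_iter=100):
    """Repeat rounds until no center moves by more than tol.

    Returns the final centers and the number of rounds executed.
    """
    for iteration in range(1, max_iter + 1):
        new_centers = kmeans_iteration(points, centers)
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, iteration
```

The number of rounds returned by `kmeans` depends on the initial centers passed in, which is the sensitivity the abstract identifies as the cause of slow convergence. On a real MapReduce cluster each round is typically a separate job, so fewer rounds translate directly into less job-startup and I/O overhead.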
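Source Journal of Harbin University of Science and Technology (《哈尔滨理工大学学报》), indexed in CAS and the Peking University Core Journals list (北大核心), 2016, No. 1, pp. 31-35 (5 pages)
Funding Science and Technology Research Project of the Heilongjiang Provincial Department of Education (12531107)
Keywords clustering algorithm; MapReduce; K-means; speedup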

