
Massive Scene Image Retrieval Based on Improved Distributed K-Means Feature Clustering (cited by: 5)
Abstract  To address the problems that traditional image retrieval methods face when processing massive data, we propose a retrieval method for massive scene images based on improved distributed K-means feature clustering. We improve the distributed K-means algorithm by optimising the selection of initial cluster centres and the iteration procedure, and apply it to the feature clustering of scene images. Making full use of the massive storage capacity and powerful parallel computing ability of the Hadoop distributed platform, we propose a storage and retrieval scheme for massive scene images and design the Map and Reduce tasks for distributed parallel processing in three phases: scene image feature extraction, feature clustering, and image retrieval. Multiple sets of experiments show that the proposed method has a data scalability curve that stays gentle, achieves a good speedup ratio with efficiency greater than 0.6, and reaches an average retrieval accuracy of about 88%, making it suitable for retrieval over massive scene image data.
Authors  Cui Hongyan (崔红艳); Cao Jianfang (曹建芳) (Department of Computer Science and Technology, Xinzhou Teachers University, Xinzhou 034000, Shanxi, China)
Source  Computer Applications and Software (《计算机应用与软件》), CSCD, 2016, No. 6, pp. 195-199, 267 (6 pages)
Funding  National Natural Science Foundation of China (61202163); Shanxi Province University Student Innovation and Entrepreneurship Training Program (2014383); Natural Science Foundation of Shanxi Province (2013011017-2); Key Discipline Special Project of Xinzhou Teachers University (XK201308)
Keywords  Hadoop distributed platform; MapReduce; distributed K-means algorithm; feature clustering; scene image retrieval
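The abstract describes K-means feature clustering expressed as Map and Reduce tasks. As a rough illustration only, the sketch below shows how one iteration of plain K-means can be split into a map step (assign each feature vector to its nearest centre) and a reduce step (average the vectors per centre). It runs in a single Python process rather than on Hadoop, and it does not reproduce the paper's improved initial-centre selection or iteration optimisations, which the abstract does not specify. All function names (`kmeans_map`, `kmeans_combine`, `kmeans_reduce`, `kmeans`) are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans_map(point, centers):
    """Map step: emit (index of nearest centre, (point, 1))."""
    nearest = min(range(len(centers)), key=lambda i: dist(point, centers[i]))
    return nearest, (point, 1)


def kmeans_combine(pairs):
    """Shuffle/combine step: sum vectors and counts per centre index."""
    partial = {}
    for idx, (vec, count) in pairs:
        if idx in partial:
            acc, n = partial[idx]
            partial[idx] = ([a + v for a, v in zip(acc, vec)], n + count)
        else:
            partial[idx] = (list(vec), count)
    return partial


def kmeans_reduce(partial, centers):
    """Reduce step: new centre = summed vector / count.

    A centre that received no points keeps its previous position
    (an assumption of this sketch, not a detail from the paper).
    """
    new_centers = []
    for i, old in enumerate(centers):
        if i in partial:
            acc, n = partial[i]
            new_centers.append(tuple(a / n for a in acc))
        else:
            new_centers.append(tuple(old))
    return new_centers


def kmeans(points, centers, max_iter=20, tol=1e-6):
    """Repeat map -> combine -> reduce until centres move less than tol."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        pairs = [kmeans_map(p, centers) for p in points]
        new_centers = kmeans_reduce(kmeans_combine(pairs), centers)
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers


# Toy usage: four 2-D "feature vectors" and two starting centres.
features = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
result = kmeans(features, [(0.0, 0.0), (10.0, 10.0)])
```

On a real cluster the map step would run in parallel over blocks of feature vectors and only the per-centre sums and counts would cross the network, which is what makes this decomposition attractive for large image collections.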
