Abstract
Traditional content-based image retrieval methods obtain their results with similarity measurement algorithms, and they suffer from low retrieval efficiency and poor accuracy on massive image collections. A massive image retrieval method based on the Hadoop distributed platform is therefore designed. It performs distributed computation over massive digital images on the Hadoop cloud platform, extracts the SURF features of the images, uses the K-Means clustering method to group the SURF features of similar images, and quantifies the image features with the TF-IDF data mining technique. An index module and a search module for the massive image data are then built on the Lucene framework in the Hadoop platform, and the index over the massive image data is constructed according to the SURF features of the image supplied by the user, so that similar images are retrieved accurately. The experimental results show that the proposed method returns high-quality images and achieves high efficiency and accuracy when retrieving from massive image collections.
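The quantization and ranking steps named in the abstract (K-Means clustering of local descriptors into visual words, TF-IDF weighting, similarity search) can be illustrated with a minimal single-machine sketch. It is not the authors' implementation: the paper runs on Hadoop with real SURF descriptors and a Lucene index, whereas this sketch assumes toy 2-D vectors as stand-ins for SURF descriptors, a smoothed IDF formula, farthest-first centroid seeding, and cosine similarity in place of a Lucene query. All function names and file names are hypothetical.

```python
import math
from collections import Counter


def dist2(u, v):
    """Squared Euclidean distance between two descriptors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(vec, centroids):
    """Index of the centroid (visual word) closest to vec."""
    return min(range(len(centroids)), key=lambda i: dist2(vec, centroids[i]))


def kmeans(points, k, iters=10):
    """Lloyd's k-means with deterministic farthest-first seeding (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(g) for col in zip(*g)]
    return centroids


def quantize(descs, centroids):
    """Histogram of visual-word ids for one image."""
    return Counter(nearest(d, centroids) for d in descs)


def weigh(hist, idf):
    """TF-IDF vector: term frequency times inverse document frequency."""
    total = sum(hist.values())
    return [hist[w] / total * idf[w] for w in range(len(idf))]


def build_index(images, k):
    """Cluster all descriptors, then store one TF-IDF vector per image."""
    all_desc = [d for descs in images.values() for d in descs]
    centroids = kmeans(all_desc, k)
    hists = {name: quantize(descs, centroids) for name, descs in images.items()}
    n = len(hists)
    idf = []
    for w in range(k):
        df = sum(1 for h in hists.values() if h[w] > 0)
        idf.append(math.log((1 + n) / (1 + df)) + 1)  # smoothed IDF (assumed form)
    index = {name: weigh(h, idf) for name, h in hists.items()}
    return centroids, idf, index


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def search(query_descs, centroids, idf, index):
    """Rank indexed images by cosine similarity to the query, best first."""
    q = weigh(quantize(query_descs, centroids), idf)
    return sorted(((cosine(q, v), name) for name, v in index.items()), reverse=True)


# Toy 2-D "descriptors" per image; real SURF descriptors are 64-D or 128-D.
images = {
    "a.jpg": [(0.1, 0.2), (0.0, 0.1), (5.0, 5.1)],
    "b.jpg": [(0.2, 0.1), (0.1, 0.0), (0.0, 0.2)],
    "c.jpg": [(5.1, 4.9), (4.9, 5.0), (9.0, 0.1)],
}
centroids, idf, index = build_index(images, k=3)
results = search([(0.1, 0.1), (0.2, 0.2), (5.0, 5.0)], centroids, idf, index)
for score, name in results:
    print(f"{name}: {score:.3f}")
```

In a distributed setting of the kind the abstract describes, the per-image quantization step is the natural unit to parallelize, since each image's histogram depends only on its own descriptors and the shared centroids.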
Authors
王立
陈军峰
WANG Li; CHEN Junfeng (Open University of China, Beijing 100039, China; Beijing Stable Information Technology Limited Company, Beijing 100098, China)
Source
《现代电子技术》
Peking University Core Journal (北大核心)
2018, No. 9, pp. 62-67 (6 pages)
Modern Electronics Technique
Funding
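Key Special Project of the Beijing Municipal Science and Technology Commission: Research, Development, and Application Demonstration of an Intelligent Archive Management and 3D Visualization System for Massive Data (Z161100001116072)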