
Hashing for Approximate Nearest Neighbor Search on Big Data: A Survey

Cited by: 4
Abstract: Approximate nearest neighbor (ANN) search has become one of the main technologies for fast retrieval of large-scale data in the era of artificial intelligence. As an efficient approach to ANN search, hashing has attracted wide attention, and a large number of methods have been proposed. However, no existing work comprehensively analyzes and summarizes the mainstream hashing methods. To address this, the paper first systematically introduces the basics of hashing, including distance calculation, loss functions, discrete constraints, and out-of-sample learning. It then compares the strengths and weaknesses of state-of-the-art hashing methods and evaluates their performance on widely used databases. Finally, it summarizes the open problems of hashing and points out several potential research directions. The survey is intended to offer researchers a useful guideline for designing effective and efficient hashing methods.
Authors: Fei Lun-ke; Qin Jian-yang; Teng Shao-hua; Zhang Wei; Liu Dong-ning; Hou Yan (School of Computers, Guangdong University of Technology, Guangzhou 510006, China)
Source: Journal of Guangdong University of Technology, CAS, 2020, Issue 3, pp. 23-35 (13 pages)
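Funding: National Natural Science Foundation of China (61702110, 61603100, 61972102); Natural Science Foundation of Guangdong Province (2019A1515011811); Guangdong Province Key-Area Research and Development Program (2020B010166006).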
Keywords: approximate nearest neighbor search; hashing learning; hashing; data retrieval
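
The abstract lists distance calculation and out-of-sample learning among the basics of hashing. As a minimal sketch of those two ideas, the Python below encodes vectors as binary codes with random hyperplanes and ranks them by Hamming distance. Random-hyperplane projection is chosen here only for illustration and is not claimed to be one of the methods the survey evaluates; all function names (`make_hyperplanes`, `hash_code`, `hamming_distance`, `nearest_by_hamming`) are hypothetical.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes in dim dimensions (illustrative choice)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vector, hyperplanes):
    """Pack one sign bit per hyperplane into an integer binary code."""
    code = 0
    for plane in hyperplanes:
        dot = sum(w * x for w, x in zip(plane, vector))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming_distance(a, b):
    """Count the bit positions where two binary codes differ."""
    return bin(a ^ b).count("1")


def nearest_by_hamming(query_code, database_codes):
    """Return the index of the database code closest to the query code."""
    return min(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


if __name__ == "__main__":
    planes = make_hyperplanes(num_bits=16, dim=4)
    database = [[1.0, 0.2, -0.5, 0.3], [-0.9, 0.1, 0.4, -0.2], [0.8, 0.3, -0.4, 0.2]]
    codes = [hash_code(v, planes) for v in database]
    # Out-of-sample step: an unseen query is encoded with the same hyperplanes.
    query = [0.9, 0.25, -0.45, 0.25]
    print(nearest_by_hamming(hash_code(query, planes), codes))
```

Learned hashing methods would replace the random hyperplanes with a projection fitted to the data, but the retrieval step over binary codes keeps the same shape: encode the query, then compare codes with a cheap XOR and bit count.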