期刊文献+

近似最近邻搜索算法——位置敏感哈希 被引量:8

Approximate Nearest Neighbor Searching Algorithm—Locality Sensitive Hashing
下载PDF
导出
摘要 寻找查询点的最近邻是信息处理相关领域的主要任务之一。在数据规模较大时需要采用快速检索算法,常用的快速检索算法主要是基于树的算法,但是当数据点维数较高时,这些算法的效率会变低。位置敏感哈希是当前解决高维搜索的最快的算法,文章对汉明空间、欧式空间下的位置敏感哈希算法的实现方案进行了详细分析,对算法中数据点冲突概率、空间时间消耗、参数调整对算法性能的影响进行了详尽的研究和试验,最后讨论算法的优点和缺点,说明了算法应用于视觉聚类的可能性。 Finding nearest neighbor is a main task in information processing field. The fast searching algorithm is needed in large scale database, and tree-based methods are frequently used for fast retrieval. But when the dimension of data point is high, they will become inefficient. Locality Sensi- tive Hashing is the fastest method for solving fast high dimension searching currently. This paper ex- plores the implementation of Locality Sensitive Hashing in hamming space and Euclidean space, and studies the data point collision probability, space and time consuming, the effect of parameter tuning through experiments. Finally discussed are the merits and drawbacks of this algorithm and the feasi- bility of applying LSH in visual clustering.
机构地区 信息工程大学 [
出处 《信息工程大学学报》 2013年第3期332-340,共9页 Journal of Information Engineering University
基金 国家自科学基金资助项目(60872142)
关键词 近似最近邻搜索 位置敏感哈希 精确欧式距离位置敏感哈希 视觉聚类 approximate nearest neighbor(ANN) locality sensitive Hashing(LSH) exact euclide- an locality sensitive Hashing( E2LSH) visual clustering
  • 相关文献

参考文献41

  • 1lndyk P, Motwani R. Approximate nearest neighbors: towards removing the curse of dimensionality[ C]//The Symposium on Theory of Computing. 1998:604-613.
  • 2Gionis A, Indyk P, Motwani R. Similarity search in high dimensions via hashing[ C ]//The 25th International Conference on Very Large Data Bases. 1999:518-529.
  • 3Datar M, Immorlica N, Indyk P, et al. Locality sensitive hashing scheme based on p-stable distributions[ C ]//The ACM Sym- posium on Computational Geometry. 2004: 253-262.
  • 4Andoni A, Indyk P. E21sh: Exact Euclidean locality-sensitive hashing (E^2LSH 0.1 User Manual) [ EB/OL]. [2005-06-01 ]. http-//www, mit. edu/- andoni/LSH/manual, pdf, October 20,2011.
  • 5Andoni A, Indyk P. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions[ J]. Communications of the ACM, 2008,51(1) :117-122.
  • 6Panigrahy R. Entropy-based nearest neighbor algorithm high dimensions[ C]//The ACM-SIAM Symposium on Discrete Algo-rithms. 2006:1185-1195.
  • 7Buhler J, Tompa M. Finding motifs using random projections[ J]. Journal of Computational Biology, 2002 , 9(2) :225-242.
  • 8Shakhnarovich G, Viola P, Darrell T. Fast pose estimation with parameter-sensitive hashing[ C] //The 9th IEEE InternationalConference on Computer Vision. 2003 : 13-16.
  • 9Casey M,Slaney M. Fast recognition of remixed music audio[ C]//The IEEE International Conference on Acoustics, Speech,and Signal Processing. 2007 : 1425-1428 .
  • 10Liang Ying Yu, Li Jian Min, Zhang Bo. Vocabulary-based Hashing for Image Search[ C]//The International Conference onMultimedia. 2009 :589-592.

二级参考文献123

  • 1胡和平,曾庆锐,路松峰.中文词聚类研究[J].计算机工程与科学,2006,28(1):122-124. 被引量:9
  • 2卢炎生,饶祺.一种LSH索引的自动参数调整方法[J].华中科技大学学报(自然科学版),2006,34(11):38-40. 被引量:6
  • 3梅翔,孟祥武,陈俊亮,徐萌.一种基于语义关联的查询优化方法[J].北京邮电大学学报,2006,29(6):107-110. 被引量:10
  • 4Stein B. Principles of hash - based text retrieval [C]//Annual ACM Conference on Research and Development in Information Retrieval Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2007.
  • 5Athitsos V,Potamias M,Papapetrou P,et al. Nearest Neighbor Retrieval Using Distanee-Based Hashing[C] // Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on. 2008.
  • 6IndykP, DatarM, ImmorlicaN. Locality-SensitiveHashingScheme Based on p-Stable[C]//Annual Symposium on Computational Geometry. 2004.
  • 7Arya S, Mount D. Ann: Library for approximate nearest neighbor search[OL], http: //www. cs. umd. edu/-mount/ANN/.
  • 8Indyk P, Motwani R. Approximate nearest neighbors : Towards removing the curse of dimensionality[C]//Jeffrey V, ed. Proc. of the 30th Annual ACM Symp. on Theory of Computing. New York: ACM Press, 1998 : 604-613.
  • 9Panigrahy R. Entropy based nearest neighbor searchin high dimensions[C]//Proc, of ACM-SIAMSymposium on Discrete Algorithms(SODA). 2006.
  • 10Ravichandran D,Pantel P, Hovy E. Randomized Algorithms and NLP..Using Locality Sensitive Hash Function for High Speed Noun Clustering[M]. Information Sciences Institute University of Southern California, 2004.

共引文献70

同被引文献64

  • 1刘小珠,孙莎,曾承,彭智勇.基于缓存的倒排索引机制研究[J].计算机研究与发展,2007,44(z3):153-158. 被引量:8
  • 2郑坤,朱良峰,吴信才,刘修国,李菁.3D GIS空间索引技术研究[J].地理与地理信息科学,2006,22(4):35-39. 被引量:33
  • 3侯剑华,陈悦.战略管理学前沿演进可视化研究[J].科学学研究,2007,25(A01):15-21. 被引量:136
  • 4BREESE J, HECHERMAN D, KADIE C. Empirical analysis of predictive algorithms for collaborative filtering[ C] // Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence. San Francisco: Morgan Kaufmann Publishers, 1998:43 -52.
  • 5ZHANG S, WANG W, FORD J, et al. Learning from incomplete ratings using non-negative matrix factorization[ C]// proceedings of the 6th SIAM Conference on Data Mining. Philadelphia: SIAM, 2006:549-553.
  • 6CAI R, ZHANG C, ZHANG L, et al. Scalable Music recommenda- tion by search[ C]// Proceedings of the 15th ACM International Conference on Multimedia. New York: ACM, 2007:1065 - 1074.
  • 7DATAR M, IMMORLICA N, INDYK P, et al. Locality-sensitive hashing scheme based on p-stable distributions[ C}//Proceedings of the 20th ACM Symposium on Computational Geometry. New York: ACM, 2004:253-262.
  • 8MAREE R, DENIS P, WEHENKEL L. Incremental indexing and distributed image search using shared randomized vocabularies [ C]//Proceedings of the 2007 ACM SIGMM International Confer- ence on Very Large Data Bases. New York: ACM, 2007:950 - 961.
  • 9ANDONI A, INDYK P. Nearest-optimal hashing algorithms for ap- proximate nearest neighbor in high dimensions[ J]. Communications oftheACM, 2008,51(1): 117 -122.
  • 10ANDONI A, INDYK P. E2LSH: Exact Euclidean Locality-Sensi- tive Hashing (E2LSH) 0.1 user manual[ EB/OL]. [ 2005 - 06 - 01]. http://www, mit. edu/ andoni/ LSH/manual. pdf.

引证文献8

二级引证文献24

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部