期刊文献+

基于k-means特征的适应性近似最近邻搜索算法

Adaptive approximate nearest neighbor search algorithm based on k-means features
下载PDF
导出
摘要 在基于倒排索引和HNSW索引结构的最近邻搜索算法中,由于所有查询点使用固定的终止条件进行近似最近邻搜索,从而导致某些查询点在搜索路径上访问了不必要的数据点。因此,本文针对十亿规模数据集,在IVF-HNSW算法的基础上,根据数据点的k-means特征和真实最小访问点,建立神经网络回归模型。通过模型,动态预测每个查询点在HNSW索引中找到最近邻所需要搜索的质心个数,以及在IVF中需要搜索的倒排列表的个数,最终每个查询点能够通过适应性搜索,减少需要访问的数据库向量的个数,进而降低总体搜索所需要的查询时间。实验结果表明,优化后的自适应搜索算法与原始IVF-HNSW算法相比,在最高召回率下,平均查询时间最多可降低27%。 In the nearest neighbor search algorithm based on the inverted index and HNSW index structure,because all query points use a fixed termination condition to search the nearest neighbor,some query points visit unnecessary data points on the search path.Therefore,for the billion-scale data set,based on the IVF-HNSW algorithm,this paper establishes a neural network regression model according to the k-means characteristics of the data points and the real minimum access point and dynamically predicts each query point through the model to find the nearest neighbor in the HNSW index.The number of centroids to search and the number of inverted lists to search in IVF.Finally,each query point can be adaptively searched to reduce the number of database vectors that need to be accessed,thereby reducing the query time required for the overall search.The experimental results show that the optimized adaptive search algorithm can reduce the average query time by up to 27%at the highest recall rate compared with the original IVF-HNSW algorithm.
作者 胡文洁 杨凯祥 谭宗元 HU Wenjie;YANG Kaixiang;TAN Zongyuan(School of Computer Science and Technology,Donghua University,Shanghai 201620,China)
出处 《智能计算机与应用》 2023年第9期80-84,共5页 Intelligent Computer and Applications
关键词 最近邻搜索 倒排索引 HNSW索引 适应性搜索 nearest neighbor search inverted index HNSW index adaptive search
  • 相关文献

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部