Product quantization is widely regarded as an effective approach to approximate nearest neighbor (ANN) search, and a collection of derivative algorithms has been developed. However, current techniques ignore the intrinsic high-order structure of the data, which usually contains information that can improve computational precision. In this paper, targeting the complex structure of high-order data, we design an optimized technique called optimized high-order product quantization (O-HOPQ) for ANN search. O-HOPQ incorporates the high-order structure of the data into the design of a more effective subspace decomposition, so that spatially adjacent elements in the high-order data space are grouped into the same subspace. O-HOPQ then generates a spatially structured codebook by optimizing the quantization distortion. Starting from the structured codebook, the globally optimal quantizers can be obtained effectively and efficiently. Experimental results show that appropriately exploiting the information latent in the complex structure of high-order data significantly improves the performance of product quantizers. The high-order-structure-based approach is also effective in scenarios where the data have intrinsically complex structure.
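The idea of grouping spatially adjacent elements into the same subspace can be illustrated with a small sketch. It is not the O-HOPQ algorithm itself, whose decomposition and codebook optimization are not specified in the abstract; it only contrasts a plain contiguous split of a flattened vector with a tile-based split of a 2-D array. The function names, the 4x4 input, and the 2x2 tile size are all assumptions made for illustration, and the dimensions are assumed to divide evenly.

```python
def contiguous_subspaces(height, width, num_subspaces):
    """Plain PQ-style split: cut the row-major flattened vector into equal chunks."""
    dim = height * width
    size = dim // num_subspaces  # assumes dim is divisible by num_subspaces
    return [list(range(s * size, (s + 1) * size)) for s in range(num_subspaces)]


def block_subspaces(height, width, block_h, block_w):
    """Structure-aware split: each subspace holds one block_h x block_w tile."""
    groups = []
    for top in range(0, height, block_h):
        for left in range(0, width, block_w):
            groups.append([
                r * width + c  # row-major index of element (r, c)
                for r in range(top, top + block_h)
                for c in range(left, left + block_w)
            ])
    return groups


def split_vector(flat, groups):
    """Extract the sub-vector of `flat` that belongs to each subspace."""
    return [[flat[i] for i in idx] for idx in groups]


flat_image = list(range(16))  # a hypothetical 4x4 "image", flattened row by row
plain = contiguous_subspaces(4, 4, 4)
tiled = block_subspaces(4, 4, 2, 2)
print(plain[0])  # [0, 1, 2, 3]: one full row of the image
print(tiled[0])  # [0, 1, 4, 5]: a 2x2 spatial neighbourhood
sub_vectors = split_vector(flat_image, tiled)
```

In a full product quantizer, each sub-vector would then be quantized against its own codebook; only the index grouping differs between the two splits.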
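To address the poor clustering accuracy of the density peaks clustering algorithm (clustering by fast search and find of density peaks, DPC) on real-world datasets whose clusters have no specific shape, a density peaks clustering algorithm with optimized density estimation is proposed. Oracle approximating shrinkage (OAS) is used to compute an optimal covariance matrix, from which a Mahalanobis distance is constructed; the optimal covariance matrix improves DPC's ability to discriminate data similarity. On this basis, the K-nearest-neighbor algorithm is incorporated to obtain an optimal density estimate for each data sample, and this estimate is used to improve DPC's clustering accuracy on real-world datasets. Simulation experiments on synthetic datasets and real UCI datasets show that the proposed approach to improving DPC is feasible.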
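The pipeline described above (shrunk covariance, Mahalanobis distance, K-nearest-neighbor density, then DPC) might look roughly like the following NumPy sketch. Several parts are assumptions rather than the authors' method: the shrinkage coefficient uses the commonly cited OAS closed form, the density is taken as the exponential of the negative mean squared distance to the k nearest neighbors (the abstract does not give the exact definition), and the number of clusters is passed in by hand. All function names and the three-blob test data are hypothetical.

```python
import numpy as np


def oas_covariance(X):
    """Shrink the sample covariance toward a scaled identity (assumes p >= 2)."""
    n, p = X.shape
    S = np.cov(X, rowvar=False, bias=True)
    tr_s = np.trace(S)
    tr_s2 = np.sum(S * S)  # equals trace(S @ S) for symmetric S
    num = (1.0 - 2.0 / p) * tr_s2 + tr_s ** 2
    den = (n + 1.0 - 2.0 / p) * (tr_s2 - tr_s ** 2 / p)
    rho = 1.0 if den <= 0 else min(num / den, 1.0)
    return (1.0 - rho) * S + rho * (tr_s / p) * np.eye(p)


def mahalanobis_matrix(X, cov):
    """Pairwise Mahalanobis distances under the given covariance."""
    inv = np.linalg.inv(cov)
    diff = X[:, None, :] - X[None, :, :]
    sq = np.einsum("ijk,kl,ijl->ij", diff, inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))


def knn_density(D, k):
    """Assumed density: exp(-mean squared distance to the k nearest neighbours)."""
    nearest = np.sort(D, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    return np.exp(-np.mean(nearest ** 2, axis=1))


def dpc_labels(D, rho, n_clusters):
    """Standard DPC steps: delta, centre selection by rho * delta, propagation."""
    n = len(rho)
    order = np.argsort(-rho, kind="stable")  # indices by decreasing density
    delta = np.zeros(n)
    parent = np.full(n, -1)
    delta[order[0]] = D[order[0]].max()  # densest point gets the largest distance
    for pos in range(1, n):
        i = order[pos]
        higher = order[:pos]  # points with higher density than i
        j = higher[np.argmin(D[i, higher])]
        delta[i], parent[i] = D[i, j], j
    centers = np.argsort(-(rho * delta))[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:  # parents are always labelled before their children
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


# Hypothetical test data: three tight blobs at the corners of a triangle.
rng = np.random.default_rng(0)
centres = np.array([[10.0, 0.0], [-5.0, 8.66], [-5.0, -8.66]])
X = np.vstack([c + 0.5 * rng.standard_normal((20, 2)) for c in centres])
cov = oas_covariance(X)
D = mahalanobis_matrix(X, cov)
labels = dpc_labels(D, knn_density(D, k=5), n_clusters=3)
```

Because the Mahalanobis distance is built from the covariance of the whole dataset, it rescales the direction along which clusters are separated; whether that helps on a given dataset is an empirical question that this sketch does not settle.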
Funding: the National Natural Science Foundation of China (Grant No. 61732011) and the Applied Fundamental Research Program of Qinghai Province (2019-ZJ-7017).