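Abstract: Existing differentially private methods for partitioning and publishing two-dimensional spatial data that rely on a grid structure can partition some local regions too finely, which lowers query accuracy. To address this, the paper proposes kd-PPDP (differentially privacy partitioning publication algorithm based on kd-tree), a kd-tree-based differentially private method for partitioning and publishing two-dimensional spatial data. Following the kd-tree idea, the algorithm heuristically identifies the data distribution after gridding and merges adjacent grid cells that are similar. This prevents overly fine local partitions, which reduces the added noise and improves query accuracy. Experiments compare the query error and running time of kd-PPDP with existing grid-based partitioning and publication methods, and the results indicate that the algorithm is effective and feasible.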
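The claim that merging cells reduces the added noise can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes count queries with sensitivity 1, independent Laplace noise with scale `sensitivity / epsilon` on every published region, and a range query that sums whole regions. The function names are hypothetical.

```python
def laplace_variance(epsilon: float, sensitivity: float = 1.0) -> float:
    """Variance of Laplace noise with scale b = sensitivity / epsilon (Var = 2 * b^2)."""
    scale = sensitivity / epsilon
    return 2.0 * scale ** 2


def range_query_noise_variance(num_published_regions: int, epsilon: float) -> float:
    """Noise variance of a query that sums the noisy counts of whole regions.

    Assumption: each region carries independent Laplace noise, so variances add.
    """
    return num_published_regions * laplace_variance(epsilon)


# A query covering 16 fine grid cells versus the same area published as one merged region.
fine = range_query_noise_variance(16, epsilon=0.5)
merged = range_query_noise_variance(1, epsilon=0.5)
print(fine, merged)
```

Merging trades this noise reduction against a uniformity error for queries that cover only part of a merged region, which is why only similar adjacent cells are candidates for merging.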
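Abstract: Spatial data partitioning based on KD-trees under differential privacy has attracted wide attention from researchers. The size of the spatial data and the amount of Laplace noise directly constrain partitioning accuracy. Existing KD-tree-based partitioning methods have difficulty coping with both large-scale spatial data and an insufficient amount of noise. To address this, the paper proposes SKD-Tree (sampling-based KD-Tree), a KD-tree partitioning method that satisfies differential privacy. The method uses Bernoulli random sampling that satisfies differential privacy to draw spatial samples as the objects to partition, but it still relies on the tree height to control the Laplace noise. Setting a suitable tree height heuristically is very difficult: if the height is too large, the noise in the nodes becomes too large; if it is too small, the spatial partition is too coarse. To remedy this shortcoming of SKD-Tree, the paper proposes KD-TSS (KD-Tree with sampling and SVT), a spatial partitioning method based on the sparse vector technique (sparse vector technology, SVT). The method uses SVT to decide whether a node in the tree should be split further, and no longer depends on the KD-tree height to control the noise in the nodes. Experiments on real large-scale spatial datasets comparing SKD-Tree and KD-TSS with KD-Stand and KD-Hybrid show that their partitioning accuracy and range-query results are better than those of comparable algorithms.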
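The idea of sampling first and then letting a noisy threshold test, rather than a fixed height, decide whether a node is split can be sketched as follows. This is an illustrative sketch only, not the KD-TSS algorithm itself: the noise scales, the threshold value, the median split on raw coordinates, and all function names are assumptions, and the privacy budget accounting that a real SVT-based method requires is omitted.

```python
import random


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def bernoulli_sample(points, p, rng):
    """Keep each point independently with probability p."""
    return [pt for pt in points if rng.random() < p]


def build(points, noisy_threshold, eps_query, rng, depth=0, max_depth=32):
    """Split a node only while its noisy count stays above the noisy threshold."""
    # Assumed noise scale for the per-node query; not taken from the paper.
    noisy_count = len(points) + laplace(4.0 / eps_query, rng)
    if len(points) < 2 or depth >= max_depth or noisy_count <= noisy_threshold:
        return {"leaf": True, "noisy_count": noisy_count}
    axis = depth % 2
    # Assumption: a plain median split; a private method would also protect this choice.
    pts = sorted(points, key=lambda q: q[axis])
    mid = len(pts) // 2
    return {
        "leaf": False,
        "axis": axis,
        "split": pts[mid][axis],
        "left": build(pts[:mid], noisy_threshold, eps_query, rng, depth + 1, max_depth),
        "right": build(pts[mid:], noisy_threshold, eps_query, rng, depth + 1, max_depth),
    }


def count_leaves(node):
    """Count the leaf regions of the resulting partition."""
    if node["leaf"]:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])


rng = random.Random(0)
data = [(rng.random(), rng.random()) for _ in range(2000)]
sample = bernoulli_sample(data, 0.5, rng)
threshold = 50 + laplace(2.0, rng)  # the threshold is perturbed once, then reused
tree = build(sample, threshold, 1.0, rng)
print(count_leaves(tree))
```

The `max_depth` argument is only a safety guard against runaway recursion; the stopping decision in the sketch comes from the noisy comparison.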
Abstract: The paper analyzes and improves the optimized SIFT algorithm and proposes an image matching method for SIFT based on quasi-Euclidean distance and a KD-tree. Experiments show that, while keeping the basic characteristics of the SIFT algorithm unchanged, the method matches more points, achieves high matching accuracy, produces no repeated points, and matches more efficiently. It provides precise matching points for precise image stitching and for follow-up products in related fields. The method was also applied to layout optimization and achieved good results.
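For reference, the standard two-dimensional quasi-Euclidean distance approximates the Euclidean distance with a piecewise linear formula that avoids the square root. The sketch below shows only that 2-D definition; how the paper applies the measure to SIFT descriptors is not stated in the abstract, so the function name and the example points are purely illustrative.

```python
import math

SQRT2_MINUS_1 = math.sqrt(2.0) - 1.0


def quasi_euclidean(p, q):
    """2-D quasi-Euclidean distance between points p and q.

    The larger coordinate difference counts fully; the smaller one is
    weighted by (sqrt(2) - 1), so a unit diagonal step costs sqrt(2).
    """
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    if dx > dy:
        return dx + SQRT2_MINUS_1 * dy
    return SQRT2_MINUS_1 * dx + dy


# Compare against the true Euclidean distance for one pair of points.
a, b = (0.0, 0.0), (3.0, 4.0)
print(quasi_euclidean(a, b), math.dist(a, b))
```

The quasi-Euclidean value is never smaller than the Euclidean one, and the two agree along the axes and the diagonals.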
Abstract: We present a study to show the possibility of using two well-known space partitioning and indexing techniques, kd trees and quad trees, in declustering applications to increase input/output (I/O) parallelization and reduce spatial data processing times. This parallelization enables time-consuming computational geometry algorithms to be applied efficiently to big spatial data rendering and querying. The key challenge is how to balance the spatial processing load across a large number of worker nodes, given significant performance heterogeneity in nodes and processing skews in the workload.
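A kd-tree-style median split is one simple way to obtain equally sized spatial buckets that can then be spread across workers. The sketch below is an assumption-laden illustration, not the study's method: it uses 2-D points, alternating axes, and round-robin assignment, and it ignores the node heterogeneity and workload skew that the abstract names as the key challenge. The names `kd_partition` and `assign_round_robin` are hypothetical.

```python
def kd_partition(points, levels, depth=0):
    """Split points into up to 2**levels buckets by recursive median splits."""
    if levels == 0 or len(points) < 2:
        return [points]
    axis = depth % 2  # alternate between x and y
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return (kd_partition(pts[:mid], levels - 1, depth + 1)
            + kd_partition(pts[mid:], levels - 1, depth + 1))


def assign_round_robin(buckets, num_workers):
    """Assign bucket i to worker i % num_workers (assumes equally fast workers)."""
    assignment = {w: [] for w in range(num_workers)}
    for i, bucket in enumerate(buckets):
        assignment[i % num_workers].append(bucket)
    return assignment


points = [(x, y) for x in range(4) for y in range(4)]
buckets = kd_partition(points, levels=2)
workers = assign_round_robin(buckets, num_workers=2)
print([len(b) for b in buckets])
```

Because every split is at the median, the buckets hold equal numbers of points regardless of how the points are distributed in space, which a fixed-size quad-tree subdivision does not guarantee.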