Abstract
The k-nearest neighbor search algorithm (KNNs) cannot satisfy the distribution, real-time, and scalability requirements of data mining. To address this problem, a P2P-based self-adaptive distributed k-nearest neighbor search algorithm (P2PAKNNs) is proposed. The paper describes the GHT* structure, defines a similarity function for high-dimensional data, HDSF(X, Y), and discusses the insertion, range search, and search algorithms in GHT*. The implementation process of P2PAKNNs is given, and its correctness is validated by experiment.
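To make the idea of k-nearest neighbor search over a generalized hyperplane partition concrete, here is a minimal single-machine sketch in Python. It is not the paper's distributed GHT* or P2PAKNNs. The names `Node`, `knn`, and `euclidean` are hypothetical, Euclidean distance stands in for HDSF(X, Y) (which this record does not define), and using the first two points as pivots is an arbitrary assumption.

```python
import heapq
import math


def euclidean(x, y):
    # Stand-in metric (assumption): the record does not define HDSF(X, Y).
    return math.dist(x, y)


class Node:
    """Generalized hyperplane tree node (single-machine, illustrative only)."""

    def __init__(self, points, dist, bucket_size=4):
        self.bucket = None
        self.p1 = self.p2 = self.left = self.right = None
        if len(points) <= bucket_size:
            self.bucket = list(points)  # leaf: store the objects directly
            return
        # Assumption: the first two points serve as pivots.
        self.p1, self.p2 = points[0], points[1]
        left, right = [], []
        for x in points[2:]:
            # Objects at least as close to p1 go left, the rest go right.
            (left if dist(x, self.p1) <= dist(x, self.p2) else right).append(x)
        self.left = Node(left, dist, bucket_size)
        self.right = Node(right, dist, bucket_size)


def knn(root, q, k, dist):
    """Return the k nearest objects to q as (distance, object), nearest first."""
    if k <= 0:
        return []
    heap = []  # max-heap via negated distance: (-d, insertion index, object)
    count = 0

    def offer(x):
        nonlocal count
        d = dist(q, x)
        if len(heap) < k:
            heapq.heappush(heap, (-d, count, x))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, count, x))
        count += 1

    def radius():
        # Current search radius: distance to the k-th best candidate so far.
        return -heap[0][0] if len(heap) == k else math.inf

    def visit(node):
        if node.bucket is not None:
            for x in node.bucket:
                offer(x)
            return
        offer(node.p1)
        offer(node.p2)
        d1, d2 = dist(q, node.p1), dist(q, node.p2)
        # By the triangle inequality, a side can hold an object within
        # radius r of q only if its margin is at most 2r.
        sides = sorted(((d1 - d2, 0, node.left), (d2 - d1, 1, node.right)),
                       key=lambda s: s[:2])
        for margin, _, child in sides:  # closer side first
            if margin <= 2 * radius():
                visit(child)

    visit(root)
    return sorted(((-nd, x) for nd, _, x in heap), key=lambda t: t[0])
```

For example, `knn(Node(points, euclidean), query, 3, euclidean)` returns the three nearest points with their distances. A range search follows the same pruning rule with a fixed radius instead of the shrinking k-th distance.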
Source
Computer Engineering (《计算机工程》)
Indexed in: CAS, CSCD, PKU Core (北大核心)
2009, No. 19, pp. 49-52, 55 (5 pages)
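Funding
Supported by a Hubei Provincial Department of Education fund project, "Research on Knowledge Discovery and Reasoning Models and Methods for Group Decision-Making" (2008d095)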
Keywords
k-nearest neighbor search algorithm (KNNs)
metric space
similarity query