Abstract
To improve the efficiency of density-based clustering and avoid redundant searches during execution, this paper proposes an improved spatial data clustering algorithm based on DBSCAN. The algorithm partitions the neighborhood space of each object and uses a grid index structure. Within the neighborhood of a core object, it selects the unlabeled objects that lie farthest from the core object in each of eight directions as seed objects for expansion, which reduces the number of region queries and lowers the time complexity of clustering. Experiments on massive data sets show that the new algorithm is significantly more time-efficient than DBSCAN while maintaining the same clustering accuracy.
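The two ideas named in the abstract, a grid index for neighborhood queries and seed selection in eight directions, might be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not give the exact partitioning or tie-breaking rules, so the 45-degree sectors, the grid cell side equal to the radius `eps`, the restriction to 2D points, and all function names (`build_grid`, `region_query`, `select_seeds`) are hypothetical.

```python
import math
from collections import defaultdict


def build_grid(points, eps):
    """Hash each 2D point index into a square cell of side eps (assumed cell size)."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(math.floor(x / eps), math.floor(y / eps))].append(i)
    return grid


def region_query(points, grid, eps, i):
    """Return indices of points within eps of point i.

    Only the 3x3 block of cells around the cell of point i is scanned,
    which is sufficient because the cell side equals eps.
    """
    x, y = points[i]
    cx, cy = math.floor(x / eps), math.floor(y / eps)
    result = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for j in grid.get((cx + dx, cy + dy), ()):
                if j != i and math.dist(points[i], points[j]) <= eps:
                    result.append(j)
    return result


def select_seeds(points, core, neighbors, labels):
    """Pick at most eight seeds: per 45-degree sector around the core point,
    the unlabeled neighbor that lies farthest from the core.

    labels[j] is None for an unlabeled point (an assumed convention).
    """
    best = {}  # sector number -> (distance, point index)
    cx, cy = points[core]
    for j in neighbors:
        if labels[j] is not None:
            continue  # already assigned to a cluster, so not a seed candidate
        px, py = points[j]
        angle = math.atan2(py - cy, px - cx) % (2 * math.pi)
        sector = int(angle // (math.pi / 4)) % 8
        d = math.dist(points[core], points[j])
        if sector not in best or d > best[sector][0]:
            best[sector] = (d, j)
    return sorted(j for _, j in best.values())
```

In a full clustering loop, the neighbors of a core point would all receive the cluster label, but only the indices returned by `select_seeds` would be queued for further `region_query` calls. This is where the reduction in query count described in the abstract would come from.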
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal (北大核心)
2008, No. 16, pp. 139-141 (3 pages)
Funding
National High-Tech Research and Development Program of China (863 Program), Grant No. 2003AA41250
Liaoning Provincial Department of Education Category A Fund (No. 20243303)
Keywords
DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
grid index
spatial data
clustering