Abstract
To cluster multi-density data sets, a multi-density clustering algorithm based on density reachability is proposed. Grid partitioning is used to speed up the computation of each point's density. Each clustering pass starts from the highest-density point and expands outward step by step, following the notion of density reachability and a breadth-first strategy. Experiments show that the algorithm effectively clusters both uniform-density and multi-density data sets of arbitrary shape and size, identifies outliers and noise well, and outperforms the SNN algorithm in both accuracy and efficiency.
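The procedure outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact definitions, so everything specific here is an assumption rather than the authors' method: density is taken to be the number of points within a radius `eps`, the grid cell side equals `eps`, and a neighbor counts as density-reachable when its density is at least `ratio` times that of the point being expanded. The names `grid_index`, `neighbors`, `cluster`, `eps`, `min_pts`, and `ratio` are hypothetical.

```python
from collections import defaultdict, deque
from itertools import product
import math


def grid_index(points, eps):
    """Bucket point indices by grid cell (cell side = eps, an assumption)."""
    grid = defaultdict(list)
    for i, p in enumerate(points):
        grid[tuple(math.floor(c / eps) for c in p)].append(i)
    return grid


def neighbors(i, points, grid, eps):
    """Indices of points within eps of point i, scanning only adjacent cells."""
    p = points[i]
    cell = tuple(math.floor(c / eps) for c in p)
    found = []
    for offset in product((-1, 0, 1), repeat=len(p)):
        key = tuple(c + o for c, o in zip(cell, offset))
        for j in grid.get(key, ()):
            if j != i and math.dist(p, points[j]) <= eps:
                found.append(j)
    return found


def cluster(points, eps, min_pts=3, ratio=0.5):
    """Return one label per point; -1 marks noise or outliers."""
    grid = grid_index(points, eps)
    nbrs = [neighbors(i, points, grid, eps) for i in range(len(points))]
    density = [len(n) for n in nbrs]  # assumed density: neighbor count
    labels = [-1] * len(points)
    next_label = 0
    # Visit candidate seeds from highest to lowest density.
    for seed in sorted(range(len(points)), key=lambda i: -density[i]):
        if labels[seed] != -1 or density[seed] < min_pts:
            continue
        labels[seed] = next_label
        queue = deque([seed])
        while queue:  # breadth-first expansion
            cur = queue.popleft()
            for j in nbrs[cur]:
                # Assumed reachability rule: j lies within eps of cur and
                # its density is comparable to the density of cur.
                if labels[j] == -1 and density[j] >= ratio * density[cur]:
                    labels[j] = next_label
                    queue.append(j)
        next_label += 1
    return labels
```

Because the density threshold in this sketch is relative to the point currently being expanded, a tight cluster and a loose cluster can each be grown with the same `eps`, while isolated points that never reach `min_pts` neighbors and are never absorbed keep the label `-1`.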
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
Peking University Core Journals (北大核心)
2009, Issue 17, pp. 66-68 (3 pages)
Funding
National Natural Science Foundation of China (Grant No. 60673087)
Zhengzhou University Key Teacher Fund
Keywords
clustering algorithm
neighborhood grid
density-reachable
breadth-first
multi-density