Abstract
DBSCAN uses a single set of global parameters, so its clustering quality degrades when the data are not uniformly distributed. This paper proposes a filtration-based DBSCAN algorithm to address the problem. The idea is to preprocess the dataset before calling the traditional DBSCAN algorithm: one-dimensional clustering is applied to the k-dist values of all points, which automatically yields a separate value of Eps for each density level. Traditional DBSCAN is then run once for each Eps, so that clusters of different densities in the non-uniform dataset are all found. Experimental results show that the improved algorithm clusters datasets with varied densities effectively.
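The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not say how the one-dimensional clustering is done or how the per-Eps runs are combined, so the gap-ratio split in `eps_candidates`, the choice of each group's maximum k-dist as its Eps, and the removal of already-clustered points before the next run are all assumptions, as are the function names and the `gap_ratio` parameter.

```python
import math

def k_dist(points, k):
    """Distance from each point to its k-th nearest neighbour (brute force)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result

def eps_candidates(kd, gap_ratio=3.0):
    """1-D clustering of k-dist values (assumed rule, not from the paper):
    sort the values and start a new group whenever a value exceeds
    gap_ratio times the previous one. Each group's maximum is one Eps."""
    vals = sorted(kd)
    groups = [[vals[0]]]
    for v in vals[1:]:
        if v > gap_ratio * groups[-1][-1]:
            groups.append([v])
        else:
            groups[-1].append(v)
    return [max(g) for g in groups]

def dbscan(points, idx, eps, min_pts):
    """Plain DBSCAN restricted to the point indices in idx.
    Returns a list of clusters, each a list of indices."""
    idx = list(idx)
    # The neighbourhood of a point includes the point itself.
    neigh = {i: [j for j in idx if math.dist(points[i], points[j]) <= eps]
             for i in idx}
    seen, clusters = set(), []
    for i in idx:
        if i in seen or len(neigh[i]) < min_pts:
            continue  # already assigned, or not a core point
        cluster, stack = [], [i]
        seen.add(i)
        while stack:
            p = stack.pop()
            cluster.append(p)
            if len(neigh[p]) >= min_pts:  # only core points expand the cluster
                for q in neigh[p]:
                    if q not in seen:
                        seen.add(q)
                        stack.append(q)
        clusters.append(cluster)
    return clusters

def multi_eps_dbscan(points, k=4, gap_ratio=3.0):
    """Run DBSCAN once per Eps, from the densest level to the sparsest.
    Assumption: points clustered at one level are removed before the next."""
    labels = [-1] * len(points)  # -1 marks noise / not yet clustered
    remaining = list(range(len(points)))
    next_label = 0
    for eps in eps_candidates(k_dist(points, k), gap_ratio):
        # min_pts is k + 1 because each neighbourhood counts the point itself.
        for cluster in dbscan(points, remaining, eps, k + 1):
            for i in cluster:
                labels[i] = next_label
            next_label += 1
        remaining = [i for i in remaining if labels[i] == -1]
    return labels

# Toy data: a dense 3x3 grid (spacing 0.1) and a sparse 3x3 grid (spacing 1).
dense = [(i * 0.1, j * 0.1) for i in range(3) for j in range(3)]
sparse = [(10.0 + i, 10.0 + j) for i in range(3) for j in range(3)]
labels = multi_eps_dbscan(dense + sparse, k=4)
```

With a single global Eps, a value small enough to fit the dense grid would leave every sparse point as noise; deriving one Eps per density level lets each grid be picked up in its own pass. The gap-ratio rule misbehaves when duplicate points give a k-dist of zero, so real data would need a more careful one-dimensional clustering step.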
Source
《计算机应用研究》
CSCD
Peking University Core Journal (北大核心)
2009, No. 10, pp. 3721-3723 (3 pages)
Application Research of Computers
Funding
Supported by the Natural Science Foundation Program of the Chongqing Science and Technology Commission (2007BB2372)