Abstract
DBSCAN is widely used in computer vision and image processing, in areas such as data compression and information retrieval. When the data distribution is uneven, however, a single global threshold makes it difficult for DBSCAN to identify every cluster in a data set. To address this problem, this paper proposes a DBSCAN algorithm based on grid partitioning and density-ratio clustering. The algorithm first uses adaptive multi-resolution grid partitioning to divide the data into multiple grid cells, and uses the resulting grid to speed up the search for the density peaks and valleys of the clusters. It then uses density estimation to compute densities, so that a global threshold can be determined quickly and applied to identify the clusters in the data set effectively. Comparative experiments show that the proposed algorithm clusters data of uneven density effectively and with high efficiency.
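The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: a fixed-size grid stands in for the adaptive multi-resolution partition, the density ratio is taken as a cell's point count divided by the largest count in its 3x3 block of cells, and a flood fill over adjacent cells stands in for DBSCAN's cluster expansion. All function names, the `threshold` of 0.5, and the `min_count` of 3 are hypothetical choices made for illustration.

```python
from collections import defaultdict, deque


def cell_counts(points, cell_size):
    """Count points per grid cell; a cell is keyed by its integer (col, row)."""
    counts = defaultdict(int)
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return dict(counts)


def density_ratios(counts):
    """Ratio of each cell's count to the largest count in its 3x3 block."""
    ratios = {}
    for (cx, cy), n in counts.items():
        local_max = max(
            counts.get((cx + dx, cy + dy), 0)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
        )
        ratios[(cx, cy)] = n / local_max
    return ratios


def cluster_cells(counts, ratios, threshold, min_count):
    """Flood-fill 8-connected cells that pass one global ratio threshold.

    Cells with fewer than min_count points are treated as noise (assumption,
    loosely analogous to MinPts in DBSCAN).
    """
    dense = {
        cell for cell, r in ratios.items()
        if r >= threshold and counts[cell] >= min_count
    }
    labels, next_label = {}, 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in labels:
                        labels[nb] = next_label
                        queue.append(nb)
        next_label += 1
    return labels


def fill(cell, n):
    """Place n points on a horizontal line inside a unit grid cell (toy data)."""
    cx, cy = cell
    return [(cx + (i + 0.5) / n, cy + 0.5) for i in range(n)]


# Toy data: a dense cluster with a thin fringe, a sparse cluster, one outlier.
points = (
    fill((0, 0), 50) + fill((1, 0), 50)   # dense cluster
    + fill((2, 0), 3)                     # low-density fringe next to it
    + fill((10, 10), 5) + fill((11, 10), 5)  # sparse cluster
    + fill((5, 5), 1)                     # isolated outlier
)
counts = cell_counts(points, cell_size=1.0)
ratios = density_ratios(counts)
labels = cluster_cells(counts, ratios, threshold=0.5, min_count=3)
print(sorted(labels.items()))
```

In this toy data set, a raw count threshold cannot separate the cases: a cutoff above 5 discards the sparse cluster, while a cutoff of 3 or lower absorbs the fringe cell into the dense cluster. The ratio is 1.0 for every cell of both clusters and 0.06 for the fringe cell, so a single global threshold on the ratio keeps both clusters and rejects the fringe.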
Authors
徐红艳
普蓉
黄法欣
王嵘冰
XU Hongyan; PU Rong; HUANG Faxin; WANG Rongbing (College of Information, Liaoning University, Shenyang 110036)
Source
《计算机与数字工程》
2020, No. 6, pp. 1269-1274, 1285 (7 pages in total)
Computer & Digital Engineering
Funding
Supported by the China Postdoctoral Science Foundation (No. 2018M631814)
and the Liaoning Provincial Social Science Planning Fund (No. L18AGL007).