Varied density clustering algorithm based on border point detection

Cited by: 3
Abstract: Density clustering algorithms have been widely used because they are robust to noise and can find clusters of arbitrary shape. However, in practical applications, this type of algorithm suffers from poor clustering results when the densities of different clusters in the dataset are unevenly distributed and the borders between clusters are difficult to distinguish. To solve these problems, a Varied Density Clustering algorithm based on Border point Detection (VDCBD) was proposed. Firstly, the border points between varied density clusters were identified based on the given relative density measurement method to enhance the separability of adjacent clusters. Secondly, the points in the non-border area were clustered to find the core class structures of the dataset. Thirdly, the detected border points were allocated to the corresponding core class structures according to the principle of high-density neighbor allocation. Finally, the noise points in the dataset were identified based on the class structure information. The proposed algorithm was compared with K-means, the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, the Density Peaks Clustering Algorithm (DPCA), the CLUstering based on Backbone (CLUB) algorithm, and the Border Peeling clustering (BP) algorithm on artificial datasets and UCI datasets. Experimental results show that the proposed algorithm can effectively solve the problems of uneven density distribution and indistinguishable borders, and is superior to the existing algorithms on the evaluation indicators of Adjusted Rand Index (ARI), Normalized Mutual Information (NMI), F-Measure (FM), and Accuracy (ACC). In the analysis of running efficiency, when the data size is relatively large, VDCBD is more efficient than the DPCA, CLUB, and BP algorithms.
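The four stages named in the abstract can be illustrated with a small sketch. The abstract does not give the paper's relative density formula, the clustering rule for non-border points, or the noise criterion, so everything concrete here is an assumption made for illustration: the function name `vdcbd_sketch`, the kNN-based density, the ratio used as relative density, the connected-components step, the median-based noise rule, and the parameters `k`, `border_ratio`, and `noise_ratio`. It is not the authors' implementation.

```python
import math
import statistics
from collections import deque


def k_nearest(points, k):
    """For each point, return its k nearest neighbours as (distance, index) pairs."""
    neighbours = []
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbours.append(ranked[:k])
    return neighbours


def vdcbd_sketch(points, k=4, border_ratio=0.85, noise_ratio=0.3):
    """Return one label per point; -1 marks noise. All thresholds are illustrative."""
    n = len(points)
    nn = k_nearest(points, k)

    # Assumed local density: inverse of the mean distance to the k nearest neighbours.
    density = [k / (sum(d for d, _ in nbrs) + 1e-12) for nbrs in nn]

    # Step 1 (assumed measure): relative density = own density / mean neighbour density.
    # Points much sparser than their neighbourhood are flagged as border points.
    relative = [density[i] * k / sum(density[j] for _, j in nn[i]) for i in range(n)]
    is_border = [r < border_ratio for r in relative]

    # Step 2 (assumed rule): cluster non-border points as connected components
    # of the kNN graph restricted to non-border points.
    adjacency = [set() for _ in range(n)]
    for i in range(n):
        for _, j in nn[i]:
            if not is_border[i] and not is_border[j]:
                adjacency[i].add(j)
                adjacency[j].add(i)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if is_border[start] or labels[start] != -1:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if labels[v] == -1:
                    labels[v] = next_label
                    queue.append(v)
        next_label += 1

    # Step 3: visit border points from dense to sparse; each inherits the label of
    # its nearest already-labelled neighbour that has a higher density.
    border_order = sorted((i for i in range(n) if is_border[i]), key=lambda i: -density[i])
    for i in border_order:
        for _, j in nn[i]:  # nn[i] is sorted by increasing distance
            if labels[j] != -1 and density[j] > density[i]:
                labels[i] = labels[j]
                break

    # Step 4 (assumed rule): a member far sparser than its cluster's median density is noise.
    for c in range(next_label):
        members = [i for i in range(n) if labels[i] == c]
        median = statistics.median(density[i] for i in members)
        for i in members:
            if density[i] < noise_ratio * median:
                labels[i] = -1
    return labels


# Toy data: a dense grid, a sparse grid, and one isolated point.
dense = [(x, y) for x in range(5) for y in range(5)]                        # spacing 1
sparse = [(100 + 4 * x, 100 + 4 * y) for x in range(5) for y in range(5)]   # spacing 4
points = dense + sparse + [(50, 50)]
labels = vdcbd_sketch(points)
```

On this toy input the two grids receive different labels despite their fourfold difference in spacing, the grid corners are detected as border points and reattached in step 3, and the isolated point ends up labelled -1. The brute-force neighbour search is quadratic in the number of points, which is acceptable for a sketch but not for the large datasets discussed in the efficiency comparison.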
Authors: CHEN Yanwei (陈延伟); ZHAO Xingwang (赵兴旺) (School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education (Shanxi University), Taiyuan, Shanxi 030006, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 8, pp. 2450-2460 (11 pages)
Funding: National Natural Science Foundation of China (62072293).
Keywords: density clustering; relative density; varied density; border point detection; noise recognition