期刊文献+

基于相对密度的密度峰值聚类算法 被引量:3

Density Peak Clustering Algorithm Based on Relative Density
下载PDF
导出
摘要 密度峰值聚类算法在处理密度不均匀的数据集时易将低密度簇划分到高密度簇中或将高密度簇分为多个子簇,且在样本点分配过程中存在误差传递问题。提出一种基于相对密度的密度峰值聚类算法。引入自然最近邻域内的样本点信息,给出新的局部密度计算方法并计算相对密度。在绘制决策图确定聚类中心后,基于对簇间密度差异的考虑,提出密度因子计算各个簇的聚类距离,根据聚类距离对剩余样本点进行划分,实现不同形状、不同密度数据集的聚类。在合成数据集和真实数据集上进行实验,结果表明,该算法的FMI、ARI和NMI指标较经典的密度峰值聚类算法和其他3种聚类算法分别平均提高约14、26和21个百分点,并且在簇间密度相差较大的数据集上能够准确识别聚类中心和分配剩余的样本点。 When the density peak clustering algorithm deals with datasets with uneven density,it is easy to divide the low-density clusters into high-density clusters,divide the high-density clusters into multiple sub-clusters,and exists the error propagation occurs in the process of sample point allocation.To solve these problems,a density peak clustering algorithm based on relative density is proposed.The proposed algorithm introduces sample point information in the natural nearest neighborhood,provides a new local density calculation method,and calculates the relative density.After drawing a decision diagram to determine the cluster centers,considering the density difference between clusters,a density factor is proposed to calculate the clustering distance of each cluster,and the remaining sample points are divided according to the clustering distance,then the proposed algorithm clusters datasets with different shapes and densities.Experiments are performed on synthetic and real datasets for comparison with the classical density peak clustering algorithm and three other clustering algorithms.The results show that the proposed algorithm increases the Fowlkes and Mallows Index(FMI),Adjusted Rand Index(ARI),and Normalized Mutual Information(NMI)by an average of approximately 14,26,and 21 percentage points,respectively.At the same time,the proposed algorithm has great advantages in accurately identifying cluster centers and assigning the remaining sample points to datasets with large differences in densities between clusters.
作者 位雅 张正军 何凯琳 唐莉 WEI Ya;ZHANG Zhengjun;HE Kailin;TANG Li(School of Mathematics and Statistics,Nanjing University of Science and Technology,Nanjing 210094,China;School of Information Engineering,Jingdezhen University,Jingdezhen 333000,Jiangxi,China)
出处 《计算机工程》 CAS CSCD 北大核心 2023年第6期53-61,共9页 Computer Engineering
基金 国家自然科学基金(61773014)。
关键词 聚类 密度峰值 相对密度 密度因子 聚类距离 自然最近邻 clustering density peak relative density density factor clustering distance natural nearest neighborhood
  • 相关文献

参考文献7

二级参考文献37

共引文献46

同被引文献29

引证文献3

二级引证文献2

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部