Abstract
Density peak clustering (DPC) is a recently proposed density-based clustering algorithm built on two quantities: local density and relative distance. Because DPC defines both quantities directly in terms of Euclidean distance, it clusters poorly on data sets whose clusters differ widely in sparsity or form long arcs, and its one-step assignment strategy is not robust. This paper proposes a random-walk-based density peak clustering algorithm (RW-DPC), which uses the random walk first arrival model to characterize the similarity between data points, redefines the local density of each data point, and introduces a new sample assignment strategy. Comparative experiments against other clustering algorithms on synthetic data sets and real UCI data sets show that, on data sets with non-uniform density or arc-shaped clusters, the proposed algorithm clusters better than DPC and the other algorithms compared.
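For orientation, the baseline that the abstract criticizes can be sketched as follows. This is a minimal illustration of standard DPC with a cutoff kernel and Euclidean distance, not the RW-DPC method of the paper, whose formulas the abstract does not give. The function names `dpc_quantities` and `dpc_assign`, the cutoff parameter `d_c`, and the rule of picking centers by the product of density and relative distance are assumptions made for this sketch.

```python
import numpy as np

def dpc_quantities(X, d_c):
    """Local density rho and relative distance delta of baseline DPC.

    Assumes the cutoff kernel: rho_i counts the other points within d_c of i.
    """
    # Pairwise Euclidean distances, shape (n, n).
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Subtract 1 so a point does not count itself.
    rho = (D < d_c).sum(axis=1) - 1
    n = len(X)
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    # Indices sorted by decreasing density; ties are broken by index.
    order = np.argsort(-rho, kind="stable")
    for rank, i in enumerate(order):
        if rank == 0:
            # Convention for the densest point: use the largest distance.
            delta[i] = D.max()
            continue
        higher = order[:rank]  # points treated as denser than i
        j = higher[np.argmin(D[i, higher])]
        delta[i] = D[i, j]
        nearest_higher[i] = j
    return rho, delta, nearest_higher, order

def dpc_assign(rho, delta, nearest_higher, order, n_clusters):
    """One-step assignment: each point takes the label of its nearest denser point."""
    gamma = rho * delta
    # Rank candidates by gamma, visiting them in density order so that
    # the densest point always ends up among the centers.
    centers = order[np.argsort(-gamma[order], kind="stable")[:n_clusters]]
    labels = np.full(len(rho), -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels
```

Because every label is inherited along a single chain of nearest denser points, one wrong link mislabels everything downstream of it, which is the robustness problem of the one-step strategy that the abstract refers to.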
Authors
占志文
刘君
ZHAN Zhiwen; LIU Jun (School of Mathematics and Computer Sciences, Nanchang University, Nanchang 330031, China)
Source
《南昌大学学报(工科版)》
CAS
2022, No. 2, pp. 183-191 (9 pages)
Journal of Nanchang University (Engineering & Technology)
Funding
Supported by the National Natural Science Foundation of China (72071099).
Keywords
density peak
clustering
random walk