
Density peaks clustering algorithm with nearest neighbor optimization for data with uneven density distribution (cited by: 3)
Abstract Data with uneven density distribution are data in which the sparsity of the sample distribution differs between clusters. When processing such data, the density peaks clustering (DPC) algorithm tends to find cluster centers in higher-density regions and easily assigns samples from sparse clusters to dense clusters. To avoid these defects, a density peaks clustering algorithm with nearest neighbor optimization (DPC-NNO) is proposed for data with uneven density distribution. DPC-NNO combines reverse nearest neighbors and k-nearest neighbors to define a new local density that raises the local density of sparse samples, so that the algorithm finds cluster centers more accurately. In defining the assignment strategy, shared nearest neighbors are introduced to compute the similarity between samples and construct a similarity matrix, which ties samples of the same cluster more closely together and avoids misassignment. DPC-NNO is compared with the IDPC-FA, DPCSA, FNDPC, FKNN-DPC, and DPC algorithms. Experimental results show that DPC-NNO achieves excellent clustering results on data with uneven density distribution, and that its overall performance on complex datasets and UCI datasets is better than that of the comparison algorithms.
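The three neighborhood notions named in the abstract (k-nearest neighbors, reverse nearest neighbors, shared nearest neighbors) can be illustrated with a minimal NumPy sketch. This is not the paper's DPC-NNO definition: the abstract gives no formulas, so the function names, the brute-force distance computation, and the use of raw shared-neighbor counts as a stand-in for a similarity matrix are all assumptions made here for illustration.

```python
import numpy as np


def knn_indices(X, k):
    """Return (knn, d): the k nearest neighbors of each sample (self excluded)
    and the pairwise Euclidean distance matrix with an infinite diagonal."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a sample is never its own neighbor
    knn = np.argsort(d, axis=1, kind="stable")[:, :k]
    return knn, d


def reverse_neighbor_counts(knn, n):
    """Count, for each sample i, how many samples list i among their k nearest
    neighbors (the size of i's reverse nearest neighbor set)."""
    return np.bincount(knn.ravel(), minlength=n)


def shared_neighbor_counts(knn, n):
    """S[i, j] = number of k-nearest neighbors that samples i and j share.
    ASSUMPTION: raw counts are used as a simple similarity; the paper's
    actual similarity measure is not given in the abstract."""
    member = np.zeros((n, n), dtype=int)
    member[np.arange(n)[:, None], knn] = 1  # member[i, j] = 1 if j is in kNN(i)
    return member @ member.T


if __name__ == "__main__":
    # Hypothetical toy data: a dense group and a sparser group.
    rng = np.random.default_rng(0)
    dense = rng.normal(loc=0.0, scale=0.2, size=(20, 2))
    sparse = rng.normal(loc=5.0, scale=1.5, size=(10, 2))
    X = np.vstack([dense, sparse])

    knn, _ = knn_indices(X, k=5)
    print(reverse_neighbor_counts(knn, len(X)))
    print(shared_neighbor_counts(knn, len(X)))
```

Reverse-neighbor counts and shared-neighbor counts depend only on neighbor rank, not on absolute distance, which is why neighborhood-based quantities of this kind are less sensitive to differences in density between clusters than a fixed-radius density.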
Authors CHEN Wei-chang (陈蔚昌); ZHAO Jia (赵嘉); XIAO Ren-bin (肖人彬); WANG Hui (王晖); CUI Zhi-hua (崔志华) (School of Information Engineering, Nanchang Institute of Technology, Nanchang 330099, China; School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China; School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China)
Source Control and Decision (《控制与决策》), indexed in EI, CSCD, and the Peking University Core Journals list; 2024, No. 3, pp. 919-928 (10 pages)
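Funding National Natural Science Foundation of China (52069014, 51669014); Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2018AAA0101200).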
Keywords density peaks; clustering analysis; uneven density distribution; reverse nearest neighbor; shared nearest neighbor; similarity of samples