
Belief Density Peak Clustering Algorithm for Uncertain Data (cited by: 1)
Abstract: The density peak clustering algorithm is simple and efficient, requires no iterative computation, and does not need the number of clusters to be set in advance. However, it is prone to a "domino" effect when assigning non-center samples, and it cannot accurately partition noise or samples in overlapping regions. To address these problems, a belief density peak clustering algorithm for uncertain data is proposed. First, building on the cluster centers obtained by the density peak clustering algorithm, the algorithm uses the K-nearest neighbors of each non-center sample to compute the sample's degree of belief in each cluster and assigns the sample to the cluster with the largest degree of belief, which yields a preliminary K-nearest-neighbor-based clustering result. Then, an upper quantile of the density is computed to obtain a density threshold, and a credal partition is performed under the framework of evidential reasoning: isolated samples whose density is below the threshold are assigned to the noise cluster; samples in overlapping regions are assigned to a composite cluster composed of the relevant single clusters; and samples whose degrees of belief strongly support one cluster are assigned to the corresponding single cluster. By introducing composite clusters and a noise cluster, the algorithm represents more accurately the uncertainty of samples under the available attribute information. Experimental results show that the algorithm achieves better clustering performance than the compared algorithms on artificial and UCI datasets.
Authors: WANG Kang (汪康); MA Zongfang (马宗方); TIAN Hongpeng (田鸿朋); SONG Lin (宋琳) (College of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an 710055, China)
Source: Information and Control (《信息与控制》), 2022, No. 3, pp. 349-360 (12 pages). Indexed in CSCD and the Peking University Core Journals list.
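Funding: National Key Research and Development Program of China (2019YFC1907105); Key Research and Development Program of Shaanxi Province (2020GY-186, 2020SF-367).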
Keywords: clustering; density peak; K-nearest neighbors (KNN); evidential reasoning; credal partition
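The two stages described in the abstract can be sketched in Python as follows. This is a minimal illustration, not the authors' method: the record does not give the paper's belief formulas, so the inverse-mean-distance belief rule, the `margin` test used to form composite clusters, the Gaussian density kernel, and every function and parameter name (`dpc_centers`, `knn_beliefs`, `credal_label`, `belief_dpc_sketch`, `dc`, `noise_q`, `margin`) are assumptions made for the example. Final labels are sets of cluster indices: the empty set stands for the noise cluster, a one-element set for a single cluster, and a two-element set for a composite cluster.

```python
import numpy as np

def dpc_centers(X, n_centers, dc):
    """Density-peak step: local density rho, distance delta, centers = largest rho * delta."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)   # pairwise distances
    rho = np.exp(-(D / dc) ** 2).sum(axis=1) - 1.0               # Gaussian kernel, self term removed
    order = np.argsort(-rho)                                     # indices by decreasing density
    delta = np.empty(len(X))
    delta[order[0]] = D[order[0]].max()                          # densest sample: largest distance
    for pos in range(1, len(X)):
        i = order[pos]
        delta[i] = D[i, order[:pos]].min()                       # distance to nearest denser sample
    centers = np.argsort(-(rho * delta))[:n_centers]
    return D, rho, order, centers

def knn_beliefs(D, order, centers, k):
    """ASSUMED belief rule: inverse mean distance to the k nearest labeled members of each cluster."""
    n, c = D.shape[0], len(centers)
    labels = np.full(n, -1)
    labels[centers] = np.arange(c)
    beliefs = np.zeros((n, c))
    beliefs[centers, np.arange(c)] = 1.0
    for i in order:                                              # visit samples from dense to sparse
        if labels[i] >= 0:
            continue
        support = np.empty(c)
        for j in range(c):
            members = np.flatnonzero(labels == j)                # never empty: holds the center
            nearest = np.sort(D[i, members])[:k]
            support[j] = 1.0 / (nearest.mean() + 1e-12)
        beliefs[i] = support / support.sum()
        labels[i] = int(np.argmax(beliefs[i]))                   # preliminary KNN-based label
    return beliefs, labels

def credal_label(belief, is_noise, margin=0.2):
    """ASSUMED decision rule: noise -> empty set; near-tied top beliefs -> composite; else single."""
    if is_noise:
        return frozenset()
    b = np.asarray(belief, dtype=float)
    top = np.argsort(-b)
    if len(top) > 1 and b[top[0]] - b[top[1]] < margin:
        return frozenset(int(j) for j in top[:2])
    return frozenset({int(top[0])})

def belief_dpc_sketch(X, n_centers, k=5, dc=1.0, noise_q=0.05, margin=0.2):
    D, rho, order, centers = dpc_centers(X, n_centers, dc)
    beliefs, prelim = knn_beliefs(D, order, centers, k)
    threshold = np.quantile(rho, noise_q)                        # density threshold from a quantile
    final = [credal_label(beliefs[i], rho[i] < threshold, margin) for i in range(len(X))]
    return prelim, final, centers

# Usage: two well-separated blobs plus one far-away sample.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 0.3, (30, 2)),
               rng.normal([5, 5], 0.3, (30, 2)),
               [[20.0, -20.0]]])
prelim, final, centers = belief_dpc_sketch(X, n_centers=2)
```

Returning sets rather than integer labels keeps the three outcomes (noise, single cluster, composite cluster) in one type, which mirrors how a credal partition assigns samples to subsets of the cluster frame.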