Abstract
Spectral clustering recasts data clustering as a graph partitioning problem and groups data points by finding optimal subgraphs. Its key step is constructing a suitable similarity matrix that faithfully describes the intrinsic structure of the dataset. Traditional spectral clustering algorithms build the similarity matrix with a Gaussian kernel function, which makes them sensitive to the choice of the scale parameter. In addition, the initial cluster centers must be chosen at random in the clustering stage, so clustering performance is unstable. To address these problems, this paper proposes a spectral clustering algorithm based on message passing. The algorithm uses a density-adaptive similarity measure, which better describes the relations between data points, and then obtains high-quality cluster centers through the message passing mechanism of affinity propagation (AP) clustering, which improves the performance of spectral clustering. Experiments show that the proposed algorithm handles the clustering of multi-scale datasets effectively, that its clustering performance is very stable, and that its clustering quality is better than that of the traditional spectral clustering algorithm and the k-means algorithm.
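The abstract names two ingredients without giving formulas: a density-adaptive similarity and the AP message passing step. The sketch below illustrates both under stated assumptions and is not the authors' implementation. `local_scaling_affinity` uses local scaling (each point's scale is its distance to its k-th nearest neighbor) as one common stand-in for a density-adaptive measure, and `affinity_propagation` is the standard AP responsibility/availability update. The function names, the parameter `k`, the damping value, the median preference, and the six-point toy dataset are all illustrative choices. The paper's full pipeline (AP applied within spectral clustering) is not reproduced here.

```python
import numpy as np

def local_scaling_affinity(X, k=2):
    """Gaussian-style affinity with a per-point scale (assumed stand-in).

    sigma_i is the distance from point i to its k-th nearest neighbor, so
    dense regions get small scales and sparse regions get large ones.
    Assumes no duplicate points (otherwise sigma_i could be zero).
    """
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    sigma = np.sort(D, axis=1)[:, k]          # column 0 is the self-distance
    W = np.exp(-D**2 / (sigma[:, None] * sigma[None, :]))
    np.fill_diagonal(W, 0.0)                  # no self-loops in the graph
    return W

def affinity_propagation(S, damping=0.5, n_iter=200):
    """Standard AP message passing on a similarity matrix S.

    The diagonal of S holds the preferences. Returns (exemplars, labels);
    labels are all -1 if no exemplar emerges.
    """
    n = S.shape[0]
    idx = np.arange(n)
    R = np.zeros((n, n))                      # responsibilities r(i, k)
    A = np.zeros((n, n))                      # availabilities a(i, k)
    for _ in range(n_iter):
        # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        best = np.argmax(AS, axis=1)
        first = AS[idx, best]
        AS[idx, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[idx, best] = S[idx, best] - second
        R = damping * R + (1 - damping) * R_new
        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) = sum_{i' != k} max(0, r(i',k))
        Rp = np.maximum(R, 0)
        Rp[idx, idx] = R[idx, idx]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[idx, idx].copy()
        A_new = np.minimum(A_new, 0)
        A_new[idx, idx] = diag
        A = damping * A + (1 - damping) * A_new
    exemplars = np.flatnonzero(np.diag(A + R) > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    labels = np.argmax(S[:, exemplars], axis=1)
    labels[exemplars] = np.arange(exemplars.size)
    return exemplars, labels

# Toy data (hypothetical): two small groups of three points each.
X = np.array([[1, 2], [1, 4], [1, 0], [4, 2], [4, 4], [4, 0]], dtype=float)

# Density-adaptive affinity matrix for the graph.
W = local_scaling_affinity(X, k=2)

# AP on negative squared distances, with the median as shared preference.
S = -np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
np.fill_diagonal(S, np.median(S))
exemplars, labels = affinity_propagation(S)
print("exemplars:", exemplars, "labels:", labels)
```

Because AP selects exemplars deterministically from the similarities, it avoids the random initialization of cluster centers that the abstract identifies as a source of instability. The per-point scale likewise removes the single global scale parameter of the Gaussian kernel.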
Authors
王丽娟
丁世飞
贾洪杰
Wang Lijuan; Ding Shifei; Jia Hongjie (School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China; School of Information and Electrical Engineering, Xuzhou College of Industrial Technology, Xuzhou, 221400, China; School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, 212013, China)
Source
《数据采集与处理》
CSCD
Peking University Core Journal (北大核心)
2019, No. 3, pp. 548-557 (10 pages)
Journal of Data Acquisition and Processing
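Funding
Supported by the National Natural Science Foundation of China (61676522, 61379101)
Supported by the Xuzhou Science and Technology Development Fund (KC17132)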
Keywords
spectral clustering
similarity matrix
message passing
clustering stability