Abstract
When spectral clustering based on neighbor propagation is applied to data with non-uniform density, it can mistakenly propagate samples from different classes into the same high-similarity subset, so that neither the true similarity matrix nor accurate clustering results are obtained. To address this problem, a spectral clustering algorithm based on local density estimation and neighbor propagation (LDENP-SC) is proposed. The algorithm first estimates the local density of the samples and increases the dimensionality of the data; it then updates the similarity matrix of the new data with the propagation algorithm and applies spectral clustering. For the density computation, a simple local density estimation method is proposed that reflects the density of each sample while reducing running time. For the similarity update, a method for updating the similarity between samples in different subsets is proposed on the basis of the propagation algorithm, so that the updated similarities are closer to the actual ones. Experimental results show that LDENP-SC obtains a similarity matrix close to the ideal one and accurate clustering results, has good generalization ability, and is robust to the parameter σ within a certain range.
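The pipeline described in the abstract (density estimation, dimension increase, similarity propagation, spectral clustering) can be illustrated with a rough NumPy sketch. The abstract does not give the formulas, so everything concrete here is an assumption and not the authors' method: the density proxy (inverse mean distance to the `k` nearest neighbors), appending the rescaled density as one extra coordinate, the threshold `eps`, and the function names are all hypothetical. The propagation step is a simplified stand-in that only raises similarities inside a subset to 1; the paper's update for samples in different subsets is not reproduced.

```python
import numpy as np


def local_density(X, k=5):
    """Assumed density proxy: inverse mean distance to the k nearest neighbors."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    knn = np.sort(d, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    return 1.0 / (knn.mean(axis=1) + 1e-12)


def augment(X, k=5, weight=1.0):
    """Append the density, rescaled to [0, weight], as one extra dimension."""
    rho = local_density(X, k)
    rho = (rho - rho.min()) / (rho.max() - rho.min() + 1e-12)
    return np.hstack([X, weight * rho[:, None]])


def gaussian_similarity(X, sigma):
    """Gaussian kernel similarity with bandwidth sigma."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def propagate(W, eps):
    """Simplified stand-in for neighbor propagation: points linked by chains
    of similarities >= eps form one subset, and similarities inside a subset
    are raised to 1. Similarities between subsets are left unchanged."""
    n = W.shape[0]
    subset = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if subset[start] >= 0:
            continue
        subset[start] = current
        stack = [start]
        while stack:  # depth-first search over the thresholded graph
            i = stack.pop()
            for j in np.flatnonzero((W[i] >= eps) & (subset < 0)):
                subset[j] = current
                stack.append(j)
        current += 1
    W_new = W.copy()
    W_new[subset[:, None] == subset[None, :]] = 1.0
    return W_new, subset


def spectral_clustering(W, n_clusters, n_iter=50):
    """Normalized spectral clustering with a small deterministic k-means."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    A = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]  # D^{-1/2} W D^{-1/2}
    _, vecs = np.linalg.eigh(A)
    U = vecs[:, -n_clusters:]  # eigenvectors of the largest eigenvalues
    U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
    centers = [U[0]]  # farthest-point initialization
    for _ in range(1, n_clusters):
        dist = np.min([np.linalg.norm(U - c, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):  # Lloyd iterations
        dists = np.linalg.norm(U[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = U[labels == c].mean(axis=0)
    return labels


# Toy data: one dense and one sparse Gaussian blob (synthetic, for illustration).
rng = np.random.default_rng(0)
dense = rng.normal(loc=[0.0, 0.0], scale=0.1, size=(30, 2))
sparse = rng.normal(loc=[5.0, 0.0], scale=0.5, size=(30, 2))
X = np.vstack([dense, sparse])

Z = augment(X, k=5)                    # density estimation + extra dimension
W = gaussian_similarity(Z, sigma=1.0)  # initial similarity matrix
W, subset = propagate(W, eps=0.9)      # simplified propagation update
labels = spectral_clustering(W, n_clusters=2)
```

In this sketch the extra density coordinate pushes points of different density apart before the similarity matrix is built, which is one plausible reading of why the dimension increase would reduce cross-class propagation; the values of `k`, `sigma`, and `eps` are arbitrary choices for the toy data.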
Source
Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》)
Indexed in: EI, CSCD, Peking University Core Journals
2014, No. 9, pp. 856-864 (9 pages)
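Funding
National Natural Science Foundation of China (No. 60975027, 61305017)
Priority Academic Program Development of Jiangsu Higher Education Institutions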
Keywords
Spectral Clustering
Density Estimation
Neighbor Propagation
Similarity Matrix