Abstract
For a K-class clustering problem, the Ng-Jordan-Weiss (NJW) spectral clustering algorithm typically uses the eigenvectors corresponding to the K largest eigenvalues of the normalized affinity matrix of the data as a new representation of the data. For some pattern recognition problems, however, these K eigenvectors do not necessarily reflect the structure of the original data. This paper proposes a semi-supervised eigenvector selection algorithm for spectral clustering. The algorithm uses a certain amount of supervised information to search for a combination of eigenvectors that reflects the structure of the data, and thereby achieves better clustering performance than classical spectral clustering algorithms. Simulation experiments on UCI benchmark datasets and the MNIST handwritten digit dataset confirm that the algorithm is effective and robust.
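The baseline NJW representation described in the abstract can be sketched as follows. This is a minimal illustration of the classical algorithm only, not the proposed semi-supervised eigenvector selection method. It assumes a Gaussian affinity with a hand-picked bandwidth `sigma` and a simple deterministic k-means. The function names `njw_embedding` and `simple_kmeans` are hypothetical and do not come from the paper.

```python
import numpy as np


def njw_embedding(X, k, sigma=1.0):
    """Row-normalized top-k eigenvectors of D^{-1/2} A D^{-1/2}.

    Assumption: A is a Gaussian affinity with bandwidth sigma and zero diagonal.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    # Normalized affinity matrix L = D^{-1/2} A D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order, so the last k columns
    # are the eigenvectors of the k largest eigenvalues.
    _, eigvecs = np.linalg.eigh(L)
    V = eigvecs[:, -k:]
    # Scale each row to unit length.
    return V / np.linalg.norm(V, axis=1, keepdims=True)


def simple_kmeans(Y, k, n_iter=100):
    """Plain k-means with a deterministic farthest-point initialization."""
    centers = [Y[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(dist)])
    centers = np.array(centers)
    labels = np.zeros(len(Y), dtype=int)
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        sq = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq, axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            Y[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Toy data (illustrative only): two well-separated groups of three points.
X = np.array([[0.0, 0.0], [0.3, 0.1], [0.1, 0.4],
              [5.0, 5.0], [5.2, 4.9], [4.8, 5.3]])
Y = njw_embedding(X, k=2, sigma=1.0)
labels = simple_kmeans(Y, k=2)
```

The paper's method would replace the fixed choice `eigvecs[:, -k:]` with a combination of eigenvectors selected using supervised information. The search procedure is not specified in the abstract, so it is not sketched here.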
Source
Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》)
EI
CSCD
Peking University Core Journals (北大核心)
2011, Issue 1, pp. 48-56 (9 pages)
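Funding
Supported by:
National Basic Research Program of China (973 Program) (No. 2006CB705707)
National High Technology Research and Development Program of China (863 Program) (No. 2008AA01Z125, 2009AA12Z210)
National Natural Science Foundation of China (No. 60702062, 60970067)
Key Project of the Ministry of Education of China (No. 108115)
Program of Introducing Talents of Discipline to Universities (111 Project) (No. B07048)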
Keywords
Spectral Clustering, Eigenvector Selection, Semi-Supervised Learning, Immune Clonal Selection