Abstract
To address the computational complexity and accuracy of the Affinity Propagation (AP) clustering algorithm, this paper proposes a Semi-supervised Affinity Propagation clustering algorithm based on Stratified Combination (SAP-SC). The algorithm introduces stratified clustering, dividing a single AP clustering run equally into several layers, which keeps the procedure simple and easy to implement. Each layer focuses only on the data points that are hard to cluster and performs semi-supervised learning by constructing pairwise constraints and mapping each sub-cluster to its corresponding label. An assembled boosting method then combines the per-layer clustering results by weighted superposition to improve accuracy. Theoretical analysis and experimental results show that the algorithm achieves both higher clustering accuracy and better computational performance.
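The abstract does not say how SAP-SC encodes its pairwise constraints. As an illustration only, the sketch below shows one common way to fold must-link and cannot-link pairs into the similarity matrix that AP consumes. The function name `constrained_similarity`, the choice of negative squared Euclidean similarity, and the penalty value are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def constrained_similarity(X, must_link=(), cannot_link=()):
    """Build an AP-style similarity matrix and adjust it with pairwise constraints.

    Hypothetical helper: the constraint encoding is an assumed convention,
    not the scheme used by SAP-SC.
    """
    X = np.asarray(X, dtype=float)
    # Negative squared Euclidean distance, a usual similarity choice for AP.
    diff = X[:, None, :] - X[None, :, :]
    S = -np.sum(diff ** 2, axis=-1)
    # Finite but very low similarity for cannot-link pairs (assumed value).
    # A finite number avoids inf - inf during message passing.
    penalty = S.min() * 1e3 - 1.0
    for i, j in must_link:
        S[i, j] = S[j, i] = 0.0        # highest possible similarity
    for i, j in cannot_link:
        S[i, j] = S[j, i] = penalty    # strongly discourage co-clustering
    return S

# Four points in two groups; the constraints are arbitrary illustrations.
X = [[0, 0], [0, 1], [5, 5], [5, 6]]
S = constrained_similarity(X, must_link=[(0, 2)], cannot_link=[(0, 1)])
```

A matrix built this way could, for instance, be passed to an AP implementation that accepts precomputed similarities, such as scikit-learn's `AffinityPropagation(affinity="precomputed")`.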
Source
《电子与信息学报》
EI
CSCD
Peking University Core Journals (北大核心)
2013, No. 3, pp. 645-651 (7 pages)
Journal of Electronics & Information Technology
Funding
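Supported by the National Basic Research Program of China (973 Program) (2012CB312901, 2012CB312905) and the National High Technology Research and Development Program of China (863 Program) (2011AA01A103)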
Keywords
Semi-supervised learning
Affinity Propagation (AP) clustering
Stratified clustering
Assembled boosting