Abstract
Automatically determining the number of clusters and handling massive data are key problems in spectral clustering. Building on a spectral clustering algorithm that determines the number of clusters automatically, this paper proposes a multilevel algorithm that can handle large-scale datasets. The core idea is to merge the nodes of a large dataset level by level, according to their intrinsic relevance, until a small dataset remains. The small dataset at the coarsest level is then clustered with the spectral clustering algorithm that determines the number of clusters automatically. Finally, the data are uncoarsened level by level, and at each level the clustering inherited from the previous level is refined. The multilevel algorithm terminates once refinement has been performed on the original data. Experimental results demonstrate the effectiveness of the algorithm.
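The multilevel scheme described in the abstract (coarsen, cluster the coarsest level, then uncoarsen and refine) can be illustrated with a minimal sketch. The abstract does not specify the merging criterion, the rule for choosing the number of clusters, or the refinement step, so everything concrete here is an assumption: nearest-neighbour pairing stands in for merging by "intrinsic relevance", the eigengap heuristic on a normalized Laplacian stands in for the automatic choice of cluster number, and a single nearest-centroid reassignment stands in for refinement. All function names and parameters (`coarsen`, `spectral_auto`, `refine`, `multilevel_cluster`, `sigma`, `coarse_size`) are hypothetical.

```python
import numpy as np

def coarsen(X):
    """One coarsening level: pair each point with its nearest unmatched point."""
    parent = np.full(len(X), -1)          # index of the merged node for each point
    merged = []
    for i in range(len(X)):
        if parent[i] >= 0:
            continue
        parent[i] = len(merged)
        free = np.flatnonzero(parent < 0)  # points not yet merged
        if len(free):
            j = free[np.argmin(np.linalg.norm(X[free] - X[i], axis=1))]
            parent[j] = parent[i]
            merged.append((X[i] + X[j]) / 2)   # unweighted midpoint (simplification)
        else:
            merged.append(X[i].copy())         # leftover point stays alone
    return np.array(merged), parent

def kmeans(X, k, iters=20):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = np.argmin(dist, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def spectral_auto(X, sigma=1.0, k_max=10):
    """Spectral clustering; k is picked by the largest eigengap (assumed rule)."""
    d2 = ((X[:, None] - X[None, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2 * sigma ** 2))     # Gaussian affinity
    np.fill_diagonal(W, 0.0)
    deg = W.sum(axis=1)
    L = np.eye(len(X)) - W / np.sqrt(np.outer(deg, deg))  # normalized Laplacian
    vals, vecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    k_max = min(k_max, len(X) - 1)
    k = int(np.argmax(np.diff(vals[:k_max + 1]))) + 1
    U = vecs[:, :k]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)      # row-normalise embedding
    return kmeans(U, k), k

def refine(X, labels):
    """Refinement stand-in: reassign every point to its nearest cluster centroid."""
    ids = np.unique(labels)
    centers = np.array([X[labels == c].mean(axis=0) for c in ids])
    dist = np.linalg.norm(X[:, None] - centers[None], axis=2)
    return ids[np.argmin(dist, axis=1)]

def multilevel_cluster(X, coarse_size=40, sigma=1.0):
    levels = []                            # (points, parent map) per level
    cur = X
    while len(cur) > coarse_size:          # coarsen until the dataset is small
        nxt, parent = coarsen(cur)
        levels.append((cur, parent))
        cur = nxt
    labels, k = spectral_auto(cur, sigma)  # cluster only the coarsest level
    for pts, parent in reversed(levels):   # uncoarsen level by level
        labels = refine(pts, labels[parent])
    return labels, k
```

Calling `multilevel_cluster(X)` on an `(n, d)` NumPy array returns one label per row together with the number of clusters chosen at the coarsest level. Only the coarsest level pays the cost of the eigendecomposition, which is what makes a multilevel design attractive for large inputs. The pairing loop in this sketch is still quadratic in time, so it shows the structure of the method rather than its scalability.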
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journal (北大核心)
2008, No. 5, pp. 1229-1231 (3 pages)
Funding
Supported by the Natural Science Foundation of Zhejiang Province (Y106085)
Keywords
spectral clustering
clustering number
image segmentation