摘要
传统的k-means方法和层次聚类算法,当数据集出现离群点或簇间存在交叠时会产生错误聚类结果。受小波多分辨率分析启发,提出一种基于图金字塔的聚类算法。首先输入数据集的类别数目K,并对数据点构建最小生成树;然后按节点的度数与最短邻边的长度计算优先级;接着,按优先级由高到低遍历最小生成树,进行节点间的合并;最后输出由合并节点构成的聚类结果。在人工合成和真实数据集上的实验结果表明,与k-means方法和层次聚类法相比,该方法的聚类结果不受离群点和簇间有交叠的影响,具有较高的稳定性。
The traditional k-means method and hierarchical clustering algorithm produce incorrect clustering results when there is an outlier or cluster overlap in a data set. Inspired by wavelet multiresolution analysis,this paper proposed a clustering method based on graph pyramid. Firstly,the number of classes in dataset K was input,and the minimum spanning tree was constructed for the data points. Then,the priority was calculated according to the degrees of the nodes and the length of the shortest neighbor. Next,the minimum spanning tree was traversed from high to low by priority.Finally,the clustering results formed by the merged nodes were output. The experimental results on the synthesized and true data sets showed that compared with k-means and hierarchical clustering method,the clustering results of this method were not affected by the overlap between outliers and clusters and had high stability.
出处
《计算机应用与软件》
北大核心
2018年第2期256-260,315,共6页
Computer Applications and Software
基金
国家自然科学基金项目(61373004)
校创新团队项目(A700115001005)
校基金项目(Sk201220)
关键词
聚类金字塔
多分辨率
最小生成树
层次聚类
Clustering
Pyramid
Multiresolution
Minimum spanning tree
Hierarchical clustering