Abstract
K-hubs, a hub-based clustering algorithm, is sensitive to the choice of initial cluster centers. To address this, the paper proposes a hub-based strategy for selecting the initial centers. The strategy exploits the hubness phenomenon that is common in high-dimensional data and selects the K hub points that are farthest from one another as the initial cluster centers. Experiments show that, compared with the original K-hubs algorithm with random initial centers, K-hubs with the proposed strategy obtains a better distribution of initial centers and improves clustering accuracy; moreover, the selected initial centers tend to lie near the final cluster centers, which helps the algorithm converge faster.
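The selection idea summarized above can be illustrated with a minimal sketch. The abstract does not say how hub points are identified or how the "farthest" K are chosen, so the following are assumptions made only for illustration: hubness is measured by the k-occurrence count N_k (how often a point appears in other points' k-nearest-neighbor lists), the highest-scoring points form a candidate pool, and a greedy farthest-first pass picks the centers from that pool. The names `hub_init`, `k`, and `n_candidates` are hypothetical and do not come from the paper.

```python
import math

def hub_init(points, n_clusters, k=5, n_candidates=None):
    """Pick initial centers as mutually distant high-hubness points (sketch)."""
    n = len(points)
    # Full pairwise Euclidean distance matrix (fine for small data sets).
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Assumed hubness score N_k: how many k-NN lists each point appears in.
    scores = [0] * n
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: dist[i][j])
        for j in others[:k]:
            scores[j] += 1

    # Assumed candidate pool: the points with the highest hubness scores.
    if n_candidates is None:
        n_candidates = min(n, 3 * n_clusters)
    candidates = sorted(range(n), key=lambda i: -scores[i])[:n_candidates]

    # Greedy farthest-first: start from the strongest hub, then repeatedly
    # add the candidate whose nearest chosen center is farthest away.
    chosen = [candidates[0]]
    while len(chosen) < n_clusters:
        best = max(candidates,
                   key=lambda c: min(dist[c][s] for s in chosen))
        chosen.append(best)
    return chosen

# Three 1-D groups centered at 0, 100 and 200 (toy data for illustration).
points = [(c + off,) for c in (0.0, 100.0, 200.0)
          for off in (-2.0, -1.0, 0.0, 1.0, 2.0)]
print(hub_init(points, n_clusters=3, k=2, n_candidates=3))  # [2, 12, 7]
```

In this toy data the middle point of each group has the highest N_k score, so the three returned indices are the group centers. Greedy farthest-first is one simple way to approximate "the K hubs farthest from one another"; the paper may use a different selection rule.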
Source
Computer Systems & Applications (《计算机系统应用》)
2015, Issue 4, pp. 171-175 (5 pages)