Abstract
To address the high time complexity of the global K-means algorithm, this paper proposes a new method for incrementally selecting initial cluster centers. The sample whose neighborhood is most densely populated is chosen as the first initial cluster center; each subsequent initial center is a sample that contributes strongly to minimizing the objective function and lies far from the existing cluster centers. The improved algorithm reduces the computation required for incremental selection of initial cluster centers and thus lowers the time complexity. Experiments show that, compared with the global K-means and fast global K-means algorithms, the improved algorithm shortens clustering time without degrading clustering quality, and that it achieves better clustering quality than algorithms that optimize the initial cluster centers.
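The selection rule described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's algorithm: the abstract does not give its density measure, so the sketch counts neighbors within a hypothetical `radius`; the "contribution to the objective" is taken to be the guaranteed error reduction used by fast global K-means; and multiplying that gain by the distance to the nearest existing center is one assumed way to combine the two criteria. All function names are hypothetical, and the sketch makes no attempt to reproduce the paper's reduction in computation.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def densest_point(points, radius):
    """Index of the sample with the most neighbors within `radius`.
    Assumption: the abstract does not define its density measure."""
    r2 = radius ** 2
    counts = [sum(1 for q in points if sq_dist(p, q) <= r2) for p in points]
    return max(range(len(points)), key=counts.__getitem__)


def next_center(points, centers):
    """Pick the sample with a large objective reduction that is also far
    from the existing centers (the product rule is an assumption)."""
    # d[j]: squared distance from sample j to its nearest existing center
    d = [min(sq_dist(p, c) for c in centers) for p in points]
    best, best_score = None, -1.0
    for i, cand in enumerate(points):
        # Guaranteed drop in the sum of squared errors if cand is added
        gain = sum(max(d[j] - sq_dist(cand, q), 0.0)
                   for j, q in enumerate(points))
        score = gain * math.sqrt(d[i])
        if score > best_score:
            best, best_score = cand, score
    return best


def kmeans(points, centers, iters=100):
    """Plain Lloyd iterations starting from the given centers."""
    centers = list(centers)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)),
                      key=lambda c: sq_dist(p, centers[c]))
            groups[idx].append(p)
        # An empty cluster keeps its previous center
        new = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
               for g, c in zip(groups, centers)]
        if new == centers:
            break
        centers = new
    return centers


def incremental_kmeans(points, k, radius):
    """Grow the solution one center at a time, refining after each step."""
    centers = kmeans(points, [points[densest_point(points, radius)]])
    while len(centers) < k:
        centers = kmeans(points, centers + [next_center(points, centers)])
    return centers
```

Calling `incremental_kmeans(points, k, radius)` with `points` as a list of equal-length tuples returns `k` centers. Each new center is chosen with a single pass over the candidates rather than a full K-means run per candidate, which is the general idea behind the fast variants of global K-means.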
Source
《长春理工大学学报(自然科学版)》
2015, No. 3, pp. 112-115 (4 pages)
Journal of Changchun University of Science and Technology (Natural Science Edition)