Abstract
To address the instability of the clustering results produced by the traditional K-means algorithm, an initial cluster center selection algorithm based on dissimilarity and neighborhoods is proposed. The algorithm first constructs a dissimilarity matrix and establishes a neighborhood for each sample point, then selects K initial cluster centers that are far apart from one another and whose neighborhoods are densely populated with sample points. Clustering then follows the K-means procedure, and experiments are carried out on three data sets from UCI. The results show that, compared with the traditional K-means algorithm, the new algorithm produces stable clustering results; compared with two previously proposed improved algorithms, it substantially reduces the number of iterations while maintaining accuracy.
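The selection step outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so everything specific here is an assumption: Euclidean distance stands in for the dissimilarity measure, the neighborhood radius is taken as the mean pairwise dissimilarity, density is the number of samples inside that radius, and centers are picked greedily. The function names are hypothetical and this is not the author's implementation.

```python
import math


def dissimilarity_matrix(points):
    """Pairwise dissimilarities; Euclidean distance is an assumed stand-in."""
    n = len(points)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = math.dist(points[i], points[j])
    return d


def select_initial_centers(points, k):
    """Pick k initial centers that are dense and mutually far apart (sketch)."""
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= number of points")
    d = dissimilarity_matrix(points)

    # Assumed neighborhood radius: mean of all pairwise dissimilarities.
    pairs = n * (n - 1) // 2
    total = sum(d[i][j] for i in range(n) for j in range(i + 1, n))
    radius = total / pairs if pairs else 0.0

    # Density of a sample = number of other samples inside its neighborhood.
    density = [
        sum(1 for j in range(n) if j != i and d[i][j] <= radius)
        for i in range(n)
    ]

    # Visit samples from densest to sparsest; ties are broken by index,
    # so the result is deterministic (no random initialization).
    order = sorted(range(n), key=lambda i: (-density[i], i))
    centers = [order[0]]
    while len(centers) < k:
        # Prefer the densest sample lying outside every chosen neighborhood.
        far = [
            i for i in order
            if i not in centers and all(d[i][c] > radius for c in centers)
        ]
        if far:
            centers.append(far[0])
        else:
            # Fallback: the sample farthest from its nearest chosen center.
            rest = [i for i in range(n) if i not in centers]
            centers.append(max(rest, key=lambda i: min(d[i][c] for c in centers)))
    return [points[i] for i in centers]
```

Because nothing in this sketch is random, repeated runs on the same data return the same centers, which is the kind of stability the abstract attributes to the proposed method. The returned points could then seed a standard K-means iteration.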
Author
张嘉龙
Zhang Jialong (College of Mathematics and Information, South China Agricultural University, Guangzhou, Guangdong 510642, China)
Source
《计算机时代》
2021, No. 8, pp. 57-59, 62 (4 pages in total)
Computer Era