Abstract
Both random selection of the initial cluster centers and empirical setting of the value of K affect K-means clustering results. To address this problem, a K-means clustering algorithm based on weighted density and max-min distance, called the KWDM algorithm, is proposed. The algorithm uses a weighted density method to select the set of initial cluster centers, which reduces the influence of outliers on the clustering result. It then selects cluster centers heuristically by the max-min distance criterion, which avoids the clustering result falling into a local optimum. Finally, it determines K with a criterion function, the ratio of the intra-cluster distance to the inter-cluster distance, so that K no longer has to be set empirically. Experimental results on synthetic datasets and UCI datasets show that the KWDM algorithm not only improves clustering accuracy but also reduces the average number of iterations and enhances the stability of the algorithm.
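The three stages named in the abstract (density-based candidate selection, max-min distance seeding, and a ratio criterion for K) can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything concrete here is an assumption made for illustration: the density weight `1 - d / radius`, the mean-density cutoff for candidates, the use of mean distances in the ratio, the choice of the smallest ratio as best, and all function and parameter names (`density_scores`, `initial_centers`, `ratio_criterion`, `choose_k`, `radius`, `k_max`). It should not be read as the KWDM algorithm itself.

```python
import math

def density_scores(points, radius):
    # Assumed density: each neighbour within `radius` contributes more the closer it is.
    scores = []
    for i, p in enumerate(points):
        s = 0.0
        for j, q in enumerate(points):
            d = math.dist(p, q)
            if i != j and d <= radius:
                s += 1.0 - d / radius
        scores.append(s)
    return scores

def initial_centers(points, k, radius):
    scores = density_scores(points, radius)
    mean_score = sum(scores) / len(scores)
    # Keep only points of at least average density, so isolated outliers are excluded.
    candidates = [p for p, s in zip(points, scores) if s >= mean_score]
    # Start from the densest point, then apply the max-min distance rule:
    # the next center is the candidate farthest from its nearest chosen center.
    centers = [points[max(range(len(points)), key=scores.__getitem__)]]
    while len(centers) < k:
        centers.append(max(candidates,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, centers, max_iter=100):
    # Standard Lloyd iterations started from the given centers.
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)),
                          key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels

def ratio_criterion(points, centers, labels):
    # Assumed form: mean point-to-center distance over mean center-to-center distance.
    intra = sum(math.dist(p, centers[l]) for p, l in zip(points, labels)) / len(points)
    pairs = [(a, b) for i, a in enumerate(centers) for b in centers[i + 1:]]
    inter = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
    return intra / inter if inter > 0 else math.inf

def choose_k(points, k_max, radius):
    # Try K = 2..k_max and keep the clustering with the smallest ratio (assumption).
    best = None
    for k in range(2, k_max + 1):
        centers, labels = kmeans(points, initial_centers(points, k, radius))
        score = ratio_criterion(points, centers, labels)
        if best is None or score < best[0]:
            best = (score, k, centers, labels)
    return best[1], best[2], best[3]
```

A call such as `choose_k(points, k_max=4, radius=2.0)` on a list of coordinate tuples returns the selected K together with the final centers and labels. Because the seeding is deterministic, repeated runs on the same data give the same result, which is the kind of stability the abstract attributes to replacing random initialization.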
Authors
马克勤
杨延娇
秦红武
耿琳
王丕栋
MA Keqin; YANG Yanjiao; QIN Hongwu; GENG Lin; WANG Pidong (College of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China)
Source
《计算机工程与应用》
CSCD
PKU Core (北大核心)
2020, No. 16, pp. 50-54 (5 pages)
Computer Engineering and Applications
Funding
National Natural Science Foundation of China (No. 61662068).