Abstract
K-means is a widely used clustering algorithm. A value of k is usually chosen in advance, initial cluster centers are selected, and clustering is iterated until the result converges. The traditional K-means algorithm, however, leaves open how to choose k and the initial centers, and this paper proposes an improvement that addresses both. Initial cluster centers are selected by computing a density parameter and taking the distances between samples into account. The clustering validity index DBI (Davies-Bouldin index) is also modified to give a new validity index function, IDBI, which is used to evaluate the clustering results for different values of k and thereby determine the optimal number of clusters. The results show that IDBI values are generally smaller than DBI values and more stable, so the improved algorithm achieves better convergence and higher accuracy than the traditional algorithm.
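The general workflow described in the abstract (seed the centers from density and distance, then score each candidate k with a validity index) can be illustrated with a minimal Python sketch. The abstract does not give the paper's density parameter or the IDBI formula, so everything specific here is an assumption: `density_init` is a hypothetical seeding heuristic (neighbour count within the mean pairwise distance, weighted by distance to the centers already chosen), and the score is the standard Davies-Bouldin index, not IDBI. All function names are illustrative.

```python
import math


def density_init(points, k):
    """Hypothetical density-and-distance seeding (not the paper's exact rule)."""
    n = len(points)
    # Assumed neighbourhood radius: the mean pairwise distance.
    pair_sum = sum(math.dist(points[i], points[j])
                   for i in range(n) for j in range(i + 1, n))
    radius = pair_sum / (n * (n - 1) / 2)
    # Density of a point = number of points (itself included) within the radius.
    density = [sum(1 for q in points if math.dist(p, q) <= radius) for p in points]
    # First center: the densest point.
    centers = [points[max(range(n), key=lambda i: density[i])]]
    # Later centers: dense points that are far from every center chosen so far.
    while len(centers) < k:
        best = max(range(n), key=lambda i: density[i] *
                   min(math.dist(points[i], c) for c in centers))
        centers.append(points[best])
    return centers


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations; stops when the assignments no longer change."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers


def davies_bouldin(points, labels, centers):
    """Standard Davies-Bouldin index (lower is better); assumes distinct centroids."""
    k = len(centers)
    scatter = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        scatter.append(sum(math.dist(p, centers[j]) for p in members) / len(members)
                       if members else 0.0)
    total = 0.0
    for i in range(k):
        total += max((scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
                     for j in range(k) if j != i)
    return total / k


def choose_k(points, k_values):
    """Cluster once per candidate k and return the k with the lowest index."""
    scores = {}
    for k in k_values:
        labels, centers = kmeans(points, density_init(points, k))
        scores[k] = davies_bouldin(points, labels, centers)
    return min(scores, key=scores.get), scores
```

Calling `choose_k(points, range(2, 5))` on a list of coordinate tuples returns the selected k together with the per-k scores. Because the seeding is deterministic, repeated runs on the same data give the same result, which is one practical motivation for replacing random initialization.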
Authors
MA Yu (马钰); MO Lufeng (莫路锋)
School of Information Engineering, Zhejiang A&F University, Hangzhou 311300, China
Source
Modern Electronics Technique (《现代电子技术》)
2021, No. 17, pp. 120-123 (4 pages)
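Funding
National Natural Science Foundation of China, Key Project of the Joint Fund for the Integration of Informatization and Industrialization (U1809208)
National Natural Science Foundation of China (61190114)
National Natural Science Foundation of China (61303236)
Natural Science Foundation of Zhejiang Province (LY16F020036)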