Abstract
To address the sensitivity of the traditional K-means algorithm to the initial cluster centers and to the number of clusters, this paper proposes an algorithm that optimizes the selection of the initial cluster centers. The algorithm uses the distribution density of the data objects, together with the vertical midpoint of the two nearest points, to determine k initial cluster centers. It then applies an equalization function to optimize the number of clusters and obtain the optimal clustering. Comparative experiments on standard UCI data sets show that the improved algorithm achieves higher accuracy and stability than the traditional algorithm.
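The abstract does not give the density measure, the neighborhood radius, or the form of the equalization function, so the following is only a minimal sketch of one way the density-plus-midpoint seeding could look. Everything in it is an assumption made for illustration: the function name `density_midpoint_init`, the definition of density as a neighbor count within a radius, the default radius (mean pairwise distance), and the reading of "vertical midpoint" as the midpoint of the segment joining the densest point and its nearest neighbor. The equalization step for choosing the number of clusters is not attempted.

```python
from math import dist  # Python 3.8+


def density_midpoint_init(points, k, radius=None):
    """Pick up to k initial centers from dense regions (illustrative sketch).

    All design choices here are assumptions, not the paper's definitions.
    """
    n = len(points)
    if k < 1 or n < 2:
        raise ValueError("need k >= 1 and at least two points")

    # Full pairwise distance table (fine for small data sets).
    d = [[dist(p, q) for q in points] for p in points]

    if radius is None:
        # Assumed neighborhood radius: mean distance over all distinct pairs.
        radius = sum(map(sum, d)) / (n * (n - 1))

    # Assumed density: number of other points within `radius`.
    density = [
        sum(1 for j in range(n) if j != i and d[i][j] <= radius)
        for i in range(n)
    ]

    remaining = set(range(n))
    centers = []
    while len(centers) < k and remaining:
        # Densest remaining point; ties go to the lowest index.
        i = max(remaining, key=lambda a: (density[a], -a))
        others = remaining - {i}
        if others:
            # Nearest remaining neighbor; ties go to the lowest index.
            j = min(others, key=lambda b: (d[i][b], b))
            center = tuple((a + b) / 2 for a, b in zip(points[i], points[j]))
        else:
            center = tuple(float(a) for a in points[i])
        centers.append(center)
        # Drop the neighborhood so the next center comes from another region.
        remaining = {m for m in remaining if dist(points[m], center) > radius}
    return centers
```

The returned centers would then be passed to an ordinary K-means loop in place of random seeds. Note that this sketch can return fewer than k centers if the data run out of unclaimed regions at the chosen radius.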
Source
Application Research of Computers (《计算机应用研究》)
CSCD
Peking University Core Journal
2012, No. 5, pp. 1726-1728 (3 pages)
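Funding
Open Fund of the Innovation Platform of the Hunan Provincial Department of Education (11K069)
Natural Science Foundation of Hunan Province (07JJ6115)
Hunan Provincial Key Laboratory of Intelligent Manufacturing in Universities (2009IM06)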
Keywords
K-means
data mining
clustering center
vertical halfway point
density