Abstract
The traditional k-means algorithm is highly sensitive to its initial cluster centers: clustering results fluctuate with different initial inputs, and the algorithm easily becomes trapped in a local optimum. To eliminate this sensitivity, we propose a new method for k-means that selects the initial cluster centers based on the distribution of the data samples. Experiments on data sets from the public UCI repository show that the improved k-means algorithm produces higher-quality clustering results and eliminates the sensitivity to the initial input.
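The abstract does not spell out the selection rule, so the following is only a minimal sketch of what a distribution-based, deterministic seeding for k-means could look like. It is not the authors' method. The function names (`density_init`, `kmeans`), the neighbourhood radius (the mean pairwise distance), and the score (local density times distance to the nearest already-chosen seed) are all assumptions made for illustration.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def density_init(points, k):
    """Hypothetical seeding rule (an assumption, not the paper's rule).

    The first seed is the point with the most neighbours inside the mean
    pairwise distance. Each later seed maximises
    density * distance-to-nearest-chosen-seed, so seeds are both
    representative and well separated. No random draw is involved.
    """
    n = len(points)
    d = [[dist(p, q) for q in points] for p in points]
    radius = sum(map(sum, d)) / (n * (n - 1))  # mean over ordered pairs
    density = [sum(1 for x in row if x <= radius) for row in d]
    chosen = [max(range(n), key=lambda i: density[i])]
    while len(chosen) < k:
        chosen.append(
            max(range(n),
                key=lambda i: density[i] * min(d[i][c] for c in chosen))
        )
    return [points[i] for i in chosen]


def kmeans(points, centers, iters=100):
    """Plain Lloyd iterations starting from the given centers."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            clusters[j].append(p)
        # Recompute each center as the mean of its cluster; an empty
        # cluster keeps its previous center.
        new = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new == centers:
            break
        centers = new
    return centers
```

Because the seeds are computed from the data alone, calling `kmeans(points, density_init(points, k))` twice on the same input returns the same centers, which is the kind of insensitivity to initial input that the abstract reports.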
Source
《兰州交通大学学报》
CAS
2009, No. 6, pp. 15-18 (4 pages)
Journal of Lanzhou Jiaotong University