Abstract
k-means is a fast and effective clustering algorithm, but its limitations become increasingly prominent as the volume of data grows. This paper improves the k-means algorithm in several respects, including data preprocessing, selection of the initial cluster centers, and determination of the optimal number of clusters. Simulation experiments show that the improved k-means algorithm is considerably more stable and accurate, indicating that the proposed algorithm has some practical value.
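The abstract names three stages (preprocessing, initial center selection, choosing the number of clusters) without saying which techniques the paper uses for each. As a rough illustration only, the sketch below fills each stage with a common stand-in: min-max scaling, deterministic farthest-point seeding, and selection of k by mean silhouette score. These choices, and all function names, are assumptions made for this example and are not taken from the paper.

```python
import math

def min_max_scale(points):
    """Preprocessing stand-in: rescale each feature to [0, 1]."""
    cols = list(zip(*points))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        tuple((x - l) / (h - l) if h > l else 0.0 for x, l, h in zip(p, lo, hi))
        for p in points
    ]

def farthest_point_init(points, k):
    """Seeding stand-in: start from the first point, then repeatedly add
    the point farthest from every center chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers

def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations starting from the seeding above."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers

def silhouette(points, labels):
    """Mean silhouette coefficient; higher means tighter, better separated clusters."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        return 0.0  # undefined for a single cluster; treat as 0
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(math.dist(points[i], points[j]) for j in idx) / len(idx)
            for other, idx in clusters.items() if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def best_k(points, k_range):
    """Cluster-count stand-in: pick the k in k_range (k >= 2) with the best silhouette."""
    return max(k_range, key=lambda k: silhouette(points, kmeans(points, k)[0]))
```

A typical call would scale the data first and then search a small range, for example `best_k(min_max_scale(data), range(2, 6))`. Because the seeding is deterministic, repeated runs on the same data return the same clustering, which is one simple way to address the instability of randomly initialized k-means.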
Source
《杭州电子科技大学学报(自然科学版)》
2009, No. 4, pp. 54-57 (4 pages)
Journal of Hangzhou Dianzi University: Natural Sciences
Keywords
clustering
data preprocessing
initial cluster center