Abstract
The k-means algorithm is a widely used partitioning method in clustering. It is well suited to clustering massive data sets and performs well on data with spherical or convex distributions. It has a prominent limitation, however: a small number of isolated points (outliers) can have a large impact on the clustering result. This paper therefore adopts the idea of separating the cluster mean point (centroid) from the cluster seed and presents an improved k-means algorithm based on that idea. Experiments show that the improved algorithm is more accurate than the original k-means algorithm.
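The limitation named in the abstract can be illustrated with a minimal sketch of the standard k-means procedure. This is not the paper's improved algorithm, whose details are not given here. The data points, the initial seeds, and the helper names `mean` and `kmeans` are all assumptions made up for illustration.

```python
from math import dist


def mean(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def kmeans(points, seeds, max_iter=100):
    """Plain k-means (hypothetical helper); `seeds` are the initial centers."""
    centers = list(seeds)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


# Made-up data: two compact groups plus one isolated point.
left = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
right = [(8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]
outlier = (30.0, 30.0)
seeds = [(0.0, 0.0), (9.0, 9.0)]

clean_centers, _ = kmeans(left + right, seeds)
noisy_centers, _ = kmeans(left + right + [outlier], seeds)

print(clean_centers)  # second center sits inside the right-hand group
print(noisy_centers)  # second center is dragged toward the isolated point
```

In this toy data set a single isolated point moves the second center from the middle of its group to a position outside it. Because the mean is used both to summarize a cluster and to drive the next assignment step, one distant point can distort both, which is the kind of effect the paper's separation of mean point and seed is meant to reduce.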
Source
《北京印刷学院学报》
2007, No. 2, pp. 63-65 (3 pages)
Journal of Beijing Institute of Graphic Communication