
An Initial Cluster Centroid Selection Algorithm with Excellent Noise Resistance

Novel Anti-noise K-means Algorithm Based on Spatial Distance Difference
Abstract: Because K-means is inherently sensitive to its initial cluster centroids, its clustering results are unstable and it readily converges to a local optimum. Existing improvements can reduce the number of iterations while reaching an approximately globally optimal solution on noise-free data sets, but on noisy data sets they tend to fall into local optima, and their clustering quality can even be worse than that of traditional K-means. Building on the algorithm that determines initial centroids by the farthest spatial distance, this paper proposes an initial centroid selection algorithm based on the spatial distance difference. The core idea is to compute, for each point that is not a centroid, the sum of its distances to the centroids already selected, sort these sums, find the two adjacent points in the sorted order whose distance sums differ the most, and take the one of the two that is closer to the known centroids as the initial centroid of the next cluster. Experimental results show that, with a comparable number of clustering iterations, the proposed algorithm improves clustering accuracy by about 1% on noise-free data sets and achieves a clustering accuracy above 90% on noisy data sets.
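The selection rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_initial_centroids`, the `first_index` parameter, the use of Euclidean distance, and the way ties are broken are all assumptions made here, since the abstract does not say how the first centroid is chosen or which distance metric is used.

```python
import math


def select_initial_centroids(points, k, first_index=0):
    """Pick k initial centroids using the largest-gap rule from the abstract.

    Assumptions (not stated in the abstract): Euclidean distance, and the
    first centroid is simply the point at `first_index`.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")

    chosen = [first_index]
    while len(chosen) < k:
        # For every point that is not yet a centroid, sum its distances
        # to all centroids selected so far, then sort by that sum.
        sums = sorted(
            (sum(math.dist(points[i], points[c]) for c in chosen), i)
            for i in range(n)
            if i not in chosen
        )
        if len(sums) == 1:
            pick = sums[0][1]
        else:
            # Locate the adjacent pair with the largest difference in sums.
            j = max(range(len(sums) - 1),
                    key=lambda t: sums[t + 1][0] - sums[t][0])
            # Of that pair, keep the point closer to the known centroids,
            # i.e. the one with the smaller distance sum.
            pick = sums[j][1]
        chosen.append(pick)

    return [points[i] for i in chosen], chosen


# Two groups near 0 and 10, plus one far outlier at 100.
data = [[0.0], [1.0], [2.0], [10.0], [11.0], [100.0]]
centroids, indices = select_initial_centroids(data, k=2)
print(indices, centroids)
```

On this toy data, a pure farthest-point rule would select the outlier at 100 as the second centroid. The largest-gap rule instead selects the point at 11, which sits just below the gap that separates the outlier from the rest of the data.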
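Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal (北大核心), 2014, Issue S1, pp. 406-408, 420 (4 pages)
Funding: Supported by the Chongqing Municipal Transportation Commission science program project "RFID-Based Monitoring and Feature Extraction of Illegally Operated Vehicles"
Keywords: K-means algorithm, initial centroid, spatial distance difference, noisy data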