Abstract
The traditional K-means algorithm clusters data through repeated computation. As the cluster centers shift from one iteration to the next, they generate dynamically changing information that interferes with the clustering, and when the data set is very large, the algorithm's time overhead and the system's I/O overhead grow sharply, which seriously degrades performance. To address this, the paper proposes an improved K-means dynamic clustering algorithm that explicitly accounts for the dynamic changes in information during the clustering process. It sets a standard (threshold) value for the termination condition to reduce the number of iterations and the learning time, and it removes the redundant information produced by these dynamic changes to reduce interference during dynamic clustering, so that clustering becomes more accurate and more efficient. Experimental results show that, when the amount of data is large, the improved K-means algorithm achieves substantially higher accuracy and execution efficiency than the traditional K-means algorithm.
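The abstract does not state the threshold value or the rule used to discard redundant information, so the following is only a minimal sketch of the general idea of ending K-means early once the cluster centers stop moving by more than a set tolerance. The function name `kmeans` and the parameters `tol`, `max_iter`, and `seed` are assumptions made for illustration, not names from the paper, and the redundancy-removal step is deliberately left out.

```python
import math
import random


def kmeans(points, k, tol=1e-4, max_iter=100, seed=0):
    """Plain K-means that stops once no center moves farther than `tol`.

    `tol` plays the role of a preset termination threshold; its value here
    is an assumption, not the value used in the paper.
    """
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    [sum(axis) / len(members) for axis in zip(*members)]
                )
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[j])
        # Largest distance any center moved in this iteration.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break  # centers are stable enough; skip further passes
    return centers, labels, iteration
```

Calling `kmeans(points, k=2, tol=1e-4)` returns the final centers, the label of each point, and the number of iterations actually run. A larger `tol` ends the loop sooner at the cost of less settled centers, which is the trade-off a termination threshold controls.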
Source
Journal of Chongqing Normal University (Natural Science)
CAS
CSCD
PKU Core Journals (北大核心)
2016, No. 1, pp. 97-101 (5 pages)
Funding
Henan Province Science and Technology Research Project (No. 122102210024, No. 102102210544)
National Natural Science Foundation of China (No. 61201447)