Abstract
The traditional k-means clustering algorithm performs an independent clustering analysis of a static data set on a single time slice; applied to time series data, it amounts to nothing more than repeating that static analysis at every time step. When the data volume is large, the time cost of the algorithm grows substantially. To address this, this paper proposes a Dynamic k-means Clustering Algorithm for Time Series Data (DKCA/TSD). The algorithm takes the optimal centroids obtained at the previous time step and exploits the correlation between the data to cluster the next time step, thereby reducing the number of iterations and improving time efficiency. Experimental results show that, on time series data, DKCA/TSD is considerably more time-efficient than k-means.
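The core idea stated in the abstract, reusing the previous time step's centroids as the starting point for the next clustering, can be sketched as follows. This is a minimal illustration in plain Python and not the authors' DKCA/TSD implementation: the abstract does not say how the algorithm exploits the correlation between data, so the sketch shows only the warm start. The names `sq_dist`, `kmeans`, and `cluster_time_series`, the tolerance, and the random initialization of the first snapshot are all assumptions made for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iter=100, tol=1e-9):
    """Standard Lloyd iteration started from the given centroids.

    Returns (final_centroids, iterations_used).
    """
    centroids = [tuple(c) for c in centroids]
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its old centroid (an assumption of this sketch).
        new_centroids = [
            tuple(sum(xs) / len(cl) for xs in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        shift = max(sq_dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            return centroids, iteration
    return centroids, max_iter


def cluster_time_series(snapshots, k, seed=0):
    """Cluster each snapshot in time order, warm-starting from the previous result.

    Only the first snapshot uses random initial centroids (hypothetical choice).
    """
    rng = random.Random(seed)
    centroids = rng.sample(snapshots[0], k)
    results = []
    for points in snapshots:
        centroids, n_iter = kmeans(points, centroids)
        results.append((centroids, n_iter))
    return results
```

When consecutive snapshots differ only slightly, the centroids carried over from the previous step are already close to a solution, so the loop in `kmeans` would be expected to terminate after fewer iterations than a fresh start on the same data. Whether this matches the speed-up reported in the paper depends on details the abstract does not give.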
Authors
JI Minjie (冀敏杰)
XIAO Lixue (肖利雪)
School of Computer Science, Xi'an University of Posts and Telecommunications, Xi'an 710121
Source
Computer & Digital Engineering (《计算机与数字工程》)
2020, No. 8, pp. 1852-1857 (6 pages)
Funding
Supported by the Graduate Innovation Fund of Xi'an University of Posts and Telecommunications (Grant No. 103-602080016).