Abstract
Data streams are unbounded in volume and arrive at high speed, so traditional clustering algorithms cannot be applied to them directly. To address this problem, a clustering algorithm is proposed that can cluster both a single data stream and multiple data streams. The algorithm currently uses two summarization techniques, one based on wavelets and one based on regression, to construct summary hierarchies. The regression-based model yields a more accurate summary hierarchy, whereas the wavelet-based model builds the hierarchy faster and requires less storage space.
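The two kinds of summary named in the abstract can be illustrated with a minimal sketch. The abstract does not say which wavelet or regression model the paper uses, so the code below assumes a Haar wavelet and an ordinary least-squares line over one window of a stream. The function names `haar_summary`, `haar_reconstruct`, and `regression_summary` are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: assumes a Haar wavelet and a least-squares line,
# neither of which is confirmed by the abstract.

def haar_summary(window, keep):
    """Haar-transform a window whose length is a power of two and
    keep only the `keep` coarsest coefficients as the summary."""
    n = len(window)
    if n == 0 or n & (n - 1):
        raise ValueError("window length must be a power of two")
    approx = [float(x) for x in window]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / 2 for a, b in pairs]
        approx = [(a + b) / 2 for a, b in pairs]
        details = detail + details  # coarser levels go in front
    return (approx + details)[:keep]


def haar_reconstruct(coeffs, n):
    """Rebuild an approximation of length n; dropped coefficients count as zero."""
    full = list(coeffs) + [0.0] * (n - len(coeffs))
    approx, pos = full[:1], 1
    while len(approx) < n:
        detail = full[pos:pos + len(approx)]
        pos += len(approx)
        approx = [v for a, d in zip(approx, detail) for v in (a + d, a - d)]
    return approx


def regression_summary(window):
    """Summarize a window by the (slope, intercept) of its least-squares line
    over the time indices 0, 1, ..., len(window) - 1."""
    n = len(window)
    mean_t = (n - 1) / 2
    mean_x = sum(window) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (x - mean_x) for t, x in enumerate(window))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_x - slope * mean_t


window = [2, 4, 6, 8]
wavelet = haar_summary(window, keep=2)
approximation = haar_reconstruct(wavelet, len(window))
line = regression_summary(window)
```

In this sketch both summaries store two numbers per window. The wavelet summary needs only pairwise averages and differences, while the regression summary fits a trend line to the whole window.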
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
PKU Core Journal (北大核心)
2007, Issue 10, pp. 176-178 (3 pages)
Keywords
Data stream
Clustering
Wavelet transform
Regression analysis