Abstract
Multiple data stream clustering, an active topic in data stream mining, requires tracking how several data streams evolve over time while partitioning them by similarity. This paper proposes a multiple data stream clustering method that combines grey relational analysis with affinity propagation clustering. Based on a grey relational degree, the method compresses the raw data of the streams into an incrementally updatable grey relational synopsis, computes the grey relational degree between streams from this synopsis as their similarity measure, and finally applies the affinity propagation algorithm to generate the clustering result. Comparative experiments on real data sets demonstrate the effectiveness of the method.
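As a rough illustration of the similarity step, the sketch below computes a pairwise grey relational grade between streams. It uses Deng's classical grey relational coefficient with global extremes taken over all stream pairs, which is an assumption and not the grey relational degree developed in the paper. It also works on full buffered sequences instead of an incrementally updatable synopsis. The function name `grey_relational_matrix`, the distinguishing coefficient `rho=0.5`, and the sample data are all hypothetical.

```python
from itertools import combinations


def grey_relational_matrix(streams, rho=0.5):
    """Pairwise grey relational grades (Deng's classical form, assumed here).

    streams: list of equal-length numeric sequences on a comparable scale.
    rho: distinguishing coefficient, conventionally 0.5.
    """
    if not streams or not streams[0]:
        raise ValueError("need at least one non-empty stream")
    length = len(streams[0])
    if any(len(s) != length for s in streams):
        raise ValueError("all streams must have the same length")

    n = len(streams)
    # Pointwise absolute differences for every unordered pair of streams.
    deltas = {
        (i, j): [abs(a - b) for a, b in zip(streams[i], streams[j])]
        for i, j in combinations(range(n), 2)
    }

    # A stream is treated as fully related to itself.
    sim = [[1.0] * n for _ in range(n)]
    if not deltas:
        return sim

    # Global extremes over all pairs and all time points.
    d_min = min(min(d) for d in deltas.values())
    d_max = max(max(d) for d in deltas.values())
    if d_max == 0:
        return sim  # all streams are identical

    for (i, j), d in deltas.items():
        # Grey relational coefficient per time point, averaged into a grade.
        coeffs = [(d_min + rho * d_max) / (x + rho * d_max) for x in d]
        sim[i][j] = sim[j][i] = sum(coeffs) / len(coeffs)
    return sim


# Hypothetical sample: streams 0 and 1 move together, stream 2 runs opposite.
streams = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0],
]
sim = grey_relational_matrix(streams)
```

A symmetric matrix of this kind could then be handed to an affinity propagation implementation that accepts precomputed similarities, for example scikit-learn's `AffinityPropagation(affinity="precomputed")`, to obtain the final partition of the streams.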
Source
Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》)
EI
CSCD
Peking University Core Journals (北大核心)
2011, No. 6, pp. 769-775 (7 pages)
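Funding
Supported by the National Natural Science Foundation of China (No. 70871024)
and the Natural Science Foundation of Fujian Province (No. 2010J01358)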
Keywords
Clustering, Multiple Data Streams, Grey Relational Analysis