Abstract
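To address the low execution efficiency and poor clustering quality of traditional sliding-window data stream clustering algorithms, this paper proposes a sliding-window data stream clustering algorithm based on hybrid differential evolution. The algorithm divides data stream clustering into two parts: online generation of micro-cluster feature vectors from the time-ordered window data, and offline clustering optimization. The set of micro-clusters generated online is updated and maintained. An improved particle swarm algorithm computes fitness values for the offline micro-cluster data, and the population is divided into an advantaged sub-population and a normal sub-population. Comparing each individual's fitness value with the average fitness value then yields the optimal candidate solution for the current individual's environment. Individuals are evolved iteratively, and the clustering set with the optimal fitness value is output, which completes the clustering of the data stream. Simulation results show that the algorithm achieves high execution efficiency when clustering data streams, that the final clustering quality is good, and that the algorithm is practical.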
To improve the execution efficiency and clustering quality of sliding-window data stream clustering algorithms, this paper presents a new sliding-window data stream clustering algorithm based on hybrid differential evolution. First, it divides the data stream clustering process into two parts, the online generation of micro-cluster feature vectors over the timing window and the offline clustering optimization, and it updates and maintains the collection of micro-clusters generated online. Second, it calculates the fitness values of the offline micro-cluster data using an improved particle swarm optimization and divides the population into an advantaged sub-population and a normal one. Then it generates the optimal candidate solution for each individual's environment by comparing the individual fitness value with the average fitness value. Finally, it evolves the individuals iteratively and outputs the clustering collection with the optimal fitness value.
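The two phases named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, because the abstract gives no formulas. The following are assumptions made only for illustration: the micro-cluster feature vector is taken to be the common additive form (count, linear sum, squared sum, last timestamp); the absorption threshold `radius` and the expiry rule based on `window` are hypothetical; `split_population` assumes that higher fitness is better and that individuals at or above the mean fitness form the advantaged sub-population.

```python
from dataclasses import dataclass
import math


@dataclass
class MicroCluster:
    """Additive feature vector (assumed form): count, linear sum, squared sum, last update time."""
    n: int
    ls: list
    ss: list
    last_t: int

    @classmethod
    def from_point(cls, x, t):
        return cls(1, list(x), [v * v for v in x], t)

    def centroid(self):
        return [s / self.n for s in self.ls]

    def absorb(self, x, t):
        # Feature vectors are additive, so absorbing a point is a component-wise update.
        self.n += 1
        self.ls = [a + b for a, b in zip(self.ls, x)]
        self.ss = [a + b * b for a, b in zip(self.ss, x)]
        self.last_t = t


def update_window(clusters, x, t, radius, window):
    """Online phase (sketch): absorb x into the nearest micro-cluster if it lies
    within `radius`, otherwise open a new one; then drop micro-clusters that
    have not been updated within the last `window` time steps."""
    nearest = min(clusters, key=lambda c: math.dist(c.centroid(), x), default=None)
    if nearest is not None and math.dist(nearest.centroid(), x) <= radius:
        nearest.absorb(x, t)
    else:
        clusters.append(MicroCluster.from_point(x, t))
    clusters[:] = [c for c in clusters if t - c.last_t < window]


def split_population(fitness):
    """Offline phase (sketch): split individual indices by comparing each
    fitness value with the population mean (assumes higher is better)."""
    mean = sum(fitness) / len(fitness)
    advantaged = [i for i, f in enumerate(fitness) if f >= mean]
    normal = [i for i, f in enumerate(fitness) if f < mean]
    return advantaged, normal
```

In this sketch, `update_window` would be called once per arriving point with its timestamp, and `split_population` once per generation of the offline optimizer. The evolution operators applied to each sub-population are not described in the abstract and are therefore omitted.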
Source
《计算机应用研究》
CSCD
PKU Core Journal (北大核心)
2014, No. 4, pp. 1009-1012 (4 pages)
Application Research of Computers
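Funding
National Natural Science Foundation of China (60603047)
Scientific Research Foundation for Returned Overseas Chinese Scholars, Ministry of Education of China; Liaoning Province Plan Project (2012232001)
Natural Science Foundation of Liaoning Province (201202119)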
Keywords
hybrid differential evolution
sliding window
data stream
clustering