Abstract
Building on an analysis of data stream classification algorithms and the basic theory of cloud computing, this paper proposes an ensemble classification algorithm for data streams that runs on the Hadoop framework. The algorithm uses the MapReduce parallel programming model to improve the traditional dynamic-weight-based ensemble model and thereby raise classification efficiency. The analysis shows that, when processing massive data streams arriving at high speed, the algorithm runs far more efficiently than traditional ensemble algorithms.
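The abstract does not give the algorithm's details, so the following is only a minimal sketch of how a dynamic-weight ensemble vote might be split into map and reduce steps. Everything in it is an assumption made for illustration: the threshold base classifiers, the rule that a member's weight equals its accuracy on the most recent labeled chunk, and all function names. It runs in plain Python and imitates the MapReduce shuffle in memory rather than calling Hadoop.

```python
from collections import defaultdict


def make_threshold_classifier(threshold):
    """Hypothetical base classifier: predicts 1 when the feature reaches the threshold."""
    return lambda x: 1 if x >= threshold else 0


def update_weights(classifiers, labeled_chunk):
    """Assumed dynamic weighting: weight = accuracy on the latest labeled chunk."""
    weights = []
    for clf in classifiers:
        correct = sum(1 for x, y in labeled_chunk if clf(x) == y)
        weights.append(correct / len(labeled_chunk))
    return weights


def map_phase(record_id, x, classifiers, weights):
    """Map: every ensemble member emits (record_id, (predicted_label, weight))."""
    for clf, w in zip(classifiers, weights):
        yield record_id, (clf(x), w)


def reduce_phase(votes):
    """Reduce: sum the weights per label and return the label with the largest total."""
    totals = defaultdict(float)
    for label, w in votes:
        totals[label] += w
    return max(totals, key=totals.get)


def classify_chunk(chunk, classifiers, weights):
    """Classify one chunk of the stream with an in-memory stand-in for the shuffle."""
    grouped = defaultdict(list)
    for record_id, x in enumerate(chunk):
        for key, value in map_phase(record_id, x, classifiers, weights):
            grouped[key].append(value)
    return [reduce_phase(grouped[i]) for i in range(len(chunk))]
```

In this sketch the map calls for different records are independent of each other, which is the property that would let a framework such as Hadoop spread them across nodes; the weights are recomputed whenever a newly labeled chunk arrives.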
Source
《微电子学与计算机》
CSCD
Peking University Core Journals (北大核心)
2012, Issue 2, pp. 99-102 (4 pages)
Microelectronics & Computer
Funding
National Key Technology R&D Program of China, 11th Five-Year Plan (2009BAH53B03)