Real-time data stream processing and key techniques oriented to large-scale sensor data

Cited by: 9
Abstract: With the development of the Internet of Things, performing real-time computation on high-speed sensor data streams against a background of large-scale historical sensor data has become a new challenge for cloud manufacturing. A data stream processing method oriented to large-scale historical data, Real-Time MapReduce (RTMR), is proposed. It improves the data stream processing capability of MapReduce through intermediate result caching, pipelining, and localization. On this basis, to construct RTMR clusters adaptively, a model analysis method is used to configure node types and topology according to application characteristics and the cluster environment. To balance cluster load, idle nodes and overloaded nodes are grouped by computing the load state transition relation, so that the NP-hard dynamic load balancing problem is quickly decomposed into smaller subproblems. Execution time and data movement cost are combined as the optimization objective of each subproblem, which speeds up the response to load skew. Experiments show that the proposed methods and techniques ensure the scalability of data stream processing over large-scale historical data.
Source: Computer Integrated Manufacturing Systems (《计算机集成制造系统》); indexed in EI, CSCD, Peking University Core Journals; 2013, No. 3, pp. 641-653 (13 pages)
Funding: National Natural Science Foundation of China (Nos. 60903137, 60970132)
Keywords: data stream processing; large-scale data processing; MapReduce; adaptive architecture; load balancing
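
The abstract names intermediate result caching as one of the ways RTMR adapts MapReduce to stream processing. The record gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' design: partial aggregates computed from historical data are cached per key, and each arriving stream tuple is merged into the cached value instead of triggering a new scan of the history. The class `CachedAggregator`, the `(sensor_id, value)` record shape, and the choice of a running mean as the aggregate are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical record shape for illustration: (sensor_id, measured_value).
Reading = Tuple[str, float]


class CachedAggregator:
    """Illustrative only: caches per-key partial aggregates (count, sum)
    so stream tuples can be merged without re-reading historical data."""

    def __init__(self) -> None:
        # sensor_id -> (count, sum): the cached intermediate result.
        self._cache: Dict[str, Tuple[int, float]] = {}

    def load_history(self, history: Iterable[Reading]) -> None:
        """Scan the historical data once and cache the partial aggregates."""
        for sensor_id, value in history:
            self.update(sensor_id, value)

    def update(self, sensor_id: str, value: float) -> float:
        """Merge one stream tuple into the cache and return the new mean."""
        count, total = self._cache.get(sensor_id, (0, 0.0))
        count, total = count + 1, total + value
        self._cache[sensor_id] = (count, total)
        return total / count

    def mean(self, sensor_id: str) -> Optional[float]:
        """Return the current mean for a key, or None if it was never seen."""
        entry = self._cache.get(sensor_id)
        if entry is None:
            return None
        count, total = entry
        return total / count


if __name__ == "__main__":
    agg = CachedAggregator()
    # One pass over (hypothetical) historical readings.
    agg.load_history([("s1", 10.0), ("s1", 20.0), ("s2", 5.0)])
    # A new stream tuple costs O(1) because the history is already summarized.
    print(agg.update("s1", 30.0))
```

Keeping only `(count, sum)` works here because the mean can be updated incrementally. An aggregate that cannot be decomposed this way would need a different cached representation.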