Benchmarking for Fault-tolerant Performance in Distributed Stream Processing Systems
Abstract As demands for real-time data processing grow, distributed stream processing systems have emerged. At cluster scale, however, failures caused by hardware and software problems are hard to avoid. Existing benchmarks focus mainly on the processing performance of distributed stream processing systems during failure-free operation and rarely evaluate how well these systems tolerate faults, which makes system selection particularly difficult for mission-critical applications. This paper designs and implements a flexible benchmarking framework for the fault-tolerant performance of distributed stream processing systems. Finally, it benchmarks the fault-tolerant performance of the open-source stream processing systems Apache Storm and Apache Flink, which verifies the correctness and effectiveness of the defined benchmark. The experimental results also indicate that Flink's fault-tolerant performance is better than Storm's.
Authors JIANG Cheng (蒋程); WANG Xiaotong (王晓桐); ZHANG Rong (张蓉) (School of Data Science and Engineering, East China Normal University, Shanghai 200062, China)
Source Software Engineering (《软件工程》), 2019, No. 12, pp. 5-10 (6 pages)
Funding Major Special Project of the Ministry of Science and Technology (2018YFB1003402); National Natural Science Foundation of China (61432006)
Keywords distributed system; stream processing; fault-tolerant performance; benchmarking