Event-Driven Method for MapReduce Traffic Generation and Network Evaluation (cited by: 1)
Abstract: Large-scale network architecture design is one of the core technologies for building large-scale distributed systems and exascale high-performance computing clusters. Designers of the underlying network must take the communication traffic characteristics of upper-layer applications into account when selecting and optimizing a network architecture. An inappropriate application communication model causes the network design to diverge from actual requirements, which in turn degrades communication and overall system performance. Traditional traffic modeling methods based on "black-box" data analysis suffer from coarse-grained workload modeling and poor scalability with application data size. This work introduces the "event-driven" idea of simulating the internal logic of the workload and proposes a traffic modeling and generation method for MapReduce, a mainstream computing model. A comparison against real application traffic shows that the method accurately reflects the characteristics of the network traffic produced by MapReduce computation. Building on the validated traffic model, the paper carries out a performance simulation analysis of four mainstream data center networks. The results show that, compared with uniformly distributed random traffic, the performance of the same network drops by more than 30% under MapReduce-characteristic traffic, so characteristic traffic exposes network congestion and bottlenecks more clearly. The simulation conclusions on network performance bottlenecks, topology scalability, and network cost-effectiveness provide a new basis for large-scale data center network selection and performance optimization.

Interconnection network design is one of the core technologies in the construction of exascale clusters and large-scale distributed systems. Such large-scale computing systems are expected to be achieved in the near future owing to rapid innovation in semiconductor logic and memory, architectures, interconnects, and other industry technologies. Among these, because of performance and cost factors, the interconnection network plays a critical role. In large-scale clusters and data centers, interconnection network design faces growing challenges. First, the increasing computing capacity of a single node requires the network to provide higher bandwidth and lower latency. Second, the increasing number of nodes requires the network to scale far better. Third, the increasing system scale degrades collective communication performance, which harms the performance and scalability of applications. Fourth, the increasing number of devices requires the network to be more reliable. As the performance of compute nodes keeps increasing, the interconnection network has gradually become the bottleneck of large-scale computing systems. However, the switch chip, the core component of the interconnection network, can offer only limited aggregate bandwidth because of constraints in physical process and packaging technologies. Designers of the underlying network should consider the characteristics of the network traffic when selecting and optimizing the network architecture. An improper traffic model causes a mismatch between the network architecture and the communication characteristics, which reduces the overall performance of data centers and clusters.

Big data platforms offer cost-effective data processing through simplified programming and parallel computing, and have been increasingly recognized by industry. In recent years, the high-performance computing community has also increasingly used big data platforms for HPC data processing, and they have gradually become a powerful means of scientific data analysis. Scientific data traffic generated by HPC applications tends to have many requirements, including high-quality processing, computation in the communication link, and huge data size; such traffic is called "high-throughput" traffic. The scale of data processing and the port cost of the network need to be considered when designing a data center for a distributed computing system. The most widely used model for computing and communication in distributed systems is MapReduce. The traditional "black-box" traffic generation method is coarse-grained and scales poorly. Therefore, this paper presents a methodology for MapReduce traffic modeling and generation based on the "event-driven" idea. The accuracy evaluation, which compares the methodology with real application traffic, indicates that the generated traffic accurately reflects the characteristics of the network traffic produced by MapReduce in distributed computing systems. The performance simulation and bottleneck analysis of four major data center networks, conducted with the characteristic traffic in a network simulator, shows that network performance under MapReduce traffic differs from that under uniform random traffic by more than 30%, indicating that characteristic traffic reveals network congestion and bottleneck issues more clearly. The simulation results, concerning network performance bottlenecks, topology scalability, and network cost-effectiveness, provide a new basis for large-scale data center network selection and performance optimization.
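
The contrast the abstract draws between "event-driven" MapReduce traffic and uniform random traffic can be pictured with a small sketch. This is not the authors' generator: the function names, the jittered map completion time, and the fixed partition size are all assumptions made for illustration. The point is only that shuffle flows are triggered by job events (a mapper finishing) and converge on a few reducers, whereas uniform random flows have no such structure.

```python
import heapq
import random


def mapreduce_flows(num_mappers, num_reducers, map_time, bytes_per_partition, seed=0):
    """Toy event-driven generator (hypothetical): returns (time, src, dst, size) flows."""
    rng = random.Random(seed)
    events = []  # min-heap of (finish_time, mapper_id) "map task finished" events
    for m in range(num_mappers):
        # Assumption: each mapper finishes within +/-20% of the nominal map time.
        heapq.heappush(events, (map_time * rng.uniform(0.8, 1.2), m))

    flows = []
    while events:
        t, m = heapq.heappop(events)  # process events in time order
        # A finished mapper triggers one shuffle flow to every reducer.
        for r in range(num_reducers):
            flows.append((t, f"map{m}", f"red{r}", bytes_per_partition))
    return flows


def uniform_random_flows(num_nodes, num_flows, duration, size, seed=0):
    """Baseline: flows between random distinct node pairs at random start times."""
    rng = random.Random(seed)
    flows = []
    for _ in range(num_flows):
        src, dst = rng.sample(range(num_nodes), 2)  # two distinct nodes
        flows.append((rng.uniform(0, duration), f"n{src}", f"n{dst}", size))
    return sorted(flows)
```

Feeding both flow lists to the same simulated topology is one way to see why the many-to-few shuffle pattern stresses links toward the reducers more than the uniform baseline does; the sketch itself makes no claim about the magnitude of that effect.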
Authors: SHAO En (邵恩); SUN Ning-Hui (孙凝晖); GUO Jia-Liang (郭嘉梁); YUAN Guo-Jun (元国军); WANG Zhan (王展); CAO Zheng (曹政). Affiliations: State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049.
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2018, No. 10, pp. 2265-2281 (17 pages).
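Funding: Supported by the National Key Research and Development Program of China (2016YFB0200300, 2016YFGX030148, 2016YFB0200205, 2016GZKF0JT006), the National Natural Science Foundation of China (61572464, 61402444), and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB24060600).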
Keywords: distributed system; MapReduce; data center network; event-driven; large-scale network simulation
