
BDSim: A Component-Based Highly Configurable Parallel Simulation Framework for Big-Data Application Evaluation (cited by: 5)
Abstract: Large-scale parallel simulation is an important method for research on big-data architectures and plays an irreplaceable role in advancing big-data applications and many-core architectures. However, current simulation techniques do not meet the needs of this research, mainly because of low simulation speed, complicated configuration, and poor scalability. To address these problems and to evaluate the performance and power consumption of high-throughput many-core architectures that target big-data applications, this paper proposes BDSim, a highly configurable parallel simulation framework. BDSim follows a component-based design: a parallel function unit consists of several function components and one framework service (FS) unit, which acts as the service agent for the components attached to it. The mapping between function components and FS units can be configured freely according to the load. To improve communication and synchronization efficiency between components, the paper proposes an optimized non-blocking lock-free communication method and NMTRT-CMB, a synchronization algorithm that optimizes the CMB conservative synchronization algorithm. The experiments simulate many-core systems based on a 2D-Mesh NoC at different parallel scales. Compared with lock-based parallel communication, the non-blocking lock-free method improves parallel simulation speed by about 10%. Compared with CMB, NMTRT-CMB reduces the number of null messages by about 90% (the Chinese abstract states more than 90%; the English abstract states almost 90% when running with 16 threads).
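For readers unfamiliar with the null messages mentioned in the abstract, the following is a minimal sketch of the classic CMB null-message mechanism only. It is not NMTRT-CMB, which the abstract does not describe, and it is not BDSim code. The class `LogicalProcess`, the function `simulate`, and all parameters are hypothetical names chosen for illustration, and the model assumes two logical processes that exchange nothing but null messages.

```python
import heapq
from collections import deque


class LogicalProcess:
    """One logical process (LP) in a classic CMB null-message scheme.

    Hypothetical illustration: each LP only holds pre-loaded local events,
    so every message on a channel is a null message (a bare timestamp).
    """

    def __init__(self, name, lookahead, events):
        self.name = name
        self.lookahead = lookahead
        self.clock = 0.0
        self.pending = list(events)   # min-heap of local event timestamps
        heapq.heapify(self.pending)
        self.inbox = {}               # sender name -> queue of null timestamps
        self.channel_clock = {}       # sender name -> last timestamp received
        self.neighbours = []
        self.processed = []
        self.last_promise = 0.0
        self.nulls_sent = 0

    def connect(self, other):
        """Create a one-way channel from this LP to `other`."""
        self.neighbours.append(other)
        other.inbox[self.name] = deque()
        other.channel_clock[self.name] = 0.0

    def step(self):
        # Each received null message advances the clock of its channel.
        for sender, queue in self.inbox.items():
            while queue:
                self.channel_clock[sender] = queue.popleft()
        # Events up to the smallest channel clock are safe to process.
        safe = min(self.channel_clock.values())
        while self.pending and self.pending[0] <= safe:
            self.processed.append(heapq.heappop(self.pending))
        self.clock = max(self.clock, safe)
        # Promise neighbours that nothing earlier than clock + lookahead follows.
        promise = self.clock + self.lookahead
        if promise > self.last_promise:
            for lp in self.neighbours:
                lp.inbox[self.name].append(promise)
                self.nulls_sent += 1
            self.last_promise = promise


def simulate(lookahead, end_time, events_a, events_b):
    """Run two mutually connected LPs round-robin until both reach end_time."""
    a = LogicalProcess("A", lookahead, events_a)
    b = LogicalProcess("B", lookahead, events_b)
    a.connect(b)
    b.connect(a)
    while min(a.clock, b.clock) < end_time:
        a.step()
        b.step()
    return a, b
```

In this toy model the number of null messages grows as the lookahead shrinks, which is the overhead that optimizations of CMB aim to reduce. Calling `simulate(1.0, 10.0, [0.5, 3.2, 7.0], [1.5, 4.0, 9.9])` and comparing `nulls_sent` against a run with a lookahead of 0.5 shows the effect.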
Source: Chinese Journal of Computers (《计算机学报》); indexed in EI, CSCD, and the Peking University Core Journals list; 2015, No. 10, pp. 1959-1975 (17 pages)
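Funding: National Basic Research Program of China (973 Program) (2011CB302501); National High Technology Research and Development Program of China (863 Program) (2012AA010901, 2015AA011204); National Science and Technology Major Project "Core Electronic Devices, High-End Generic Chips and Basic Software" (2013ZX0102-8001-001-001); National Natural Science Foundation of China (61173007, 61204047, 61332009)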
Keywords: component-based parallel simulation framework; parallel discrete event simulation (PDES); non-blocking lock-free communication; CMB algorithm; highly configurable; big data

References (32)

  • 1. Li Guojie. Challenges of big data for computer systems. Communications of the China Computer Federation (CCF), 2013, 9(12): 33-35.
  • 2. Wang Yuanzhuo, Jin Xiaolong, Cheng Xueqi. Network big data: Present and future. Chinese Journal of Computers, 2013, 36(6): 1125-1138. (cited by: 706)
  • 3. Ferdman M, Adileh A, Kocberber O, et al. Clearing the clouds: A study of emerging scale-out workloads on modern hardware//Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems. London, UK, 2012: 37-48.
  • 4. Chen Tian-Shi, Guo Qi, Tang Ke, et al. ArchRanker: A ranking approach to design space exploration//Proceedings of the 41st Annual International Symposium on Computer Architecture. Minneapolis, USA, 2014: 85-96.
  • 5. Wang Lei, Zhan Jian-Feng, Luo Chun-Jie, et al. BigDataBench: A big data benchmark suite from internet services//Proceedings of the 19th International Symposium on High Performance Computer Architecture. Orlando, USA, 2014: 488-499.
  • 6. Ghasemi H R, Kim N S. RCS: Runtime resource and core scaling for power-constrained multi-core processors//Proceedings of the 23rd International Conference on Parallel Architectures and Compilation. Edmonton, Canada, 2014: 251-262.
  • 7. Nilmini A, Reetuparna D, Li Qing-Kun, et al. Scaling towards kilo-core processors with asymmetric high-radix topologies//Proceedings of the 19th International Symposium on High Performance Computer Architecture. Shenzhen, China, 2013: 496-507.
  • 8. Guthmuller E, Leti C E A, Grenoble F, et al. Architectural exploration of a fine-grained 3D cache for high performance in a manycore context//Proceedings of the 21st International Conference on Very Large Scale Integration. Istanbul, Turkey, 2013: 302-307.
  • 9. Kapil D, Abdullah N N, Sherief R. Power mapping and modeling of multi-core processors//Proceedings of the International Symposium on Low Power Electronics and Design. Beijing, China, 2013: 39-44.
  • 10. Sironi F, Maggio M, Cattaneo R, et al. ThermOS: System support for dynamic thermal management of chip multi-processors//Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques. Edinburgh, UK, 2013: 41-50.

Second-level references (93)

  • 1. Bell S, Edwards B, Amann J, et al. TILE64 processor: A 64-core SoC with mesh interconnect//Proceedings of the International Solid-State Circuits Conference. San Francisco, USA, 2008: 88-598.
  • 2. Howard J, Dighe S, Hoskote Y, et al. A 48-core IA-32 message-passing processor with DVFS in 45 nm CMOS//Proceedings of the International Solid-State Circuits Conference. San Francisco, USA, 2010: 108-109.
  • 3. Kelm John H, Johnson Daniel R, Johnson Matthew R, et al. Rigel: An architecture and scalable programming//Proceedings of the International Symposium of Computer Architecture. Saint-Malo, France, 2009: 140-151.
  • 4. Das S R, Fujimoto R, Panesar K S, Allison D, Hybinette M. GTW: A time warp system for shared memory multiprocessors//Proceedings of the Winter Simulation Conference. Lake Buena Vista, USA, 1994: 1332-1339.
  • 5. Chen J, Annavaram M, Dubois M. SlackSim: A platform for parallel simulations of CMPs on CMPs. ACM SIGARCH Computer Architecture News, 2009, 37(2): 20-29.
  • 6. Miller J E, Kasture H, Kurian G, et al. Graphite: A distributed parallel simulator for multicores//Proceedings of the 16th IEEE International Symposium on High-Performance Computer Architecture. Bangalore, India, 2010: 1-12.
  • 7. Chiou D, Sunwoo D, Kim J, et al. FPGA-accelerated simulation technologies (FAST): Fast, full-system, cycle accurate simulators//Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture. Porto Alegre, Brazil, 2007: 249-261.
  • 8. Fujimoto R M. Parallel discrete event simulation. Communications of the ACM, 1990, 33(10): 30-53.
  • 9. Mukherjee S S, Reinhardt S, Falsafi B, et al. Wisconsin Wind Tunnel II: A fast, portable parallel architecture simulator. IEEE Concurrency, 2000, 8(4): 12-20.
  • 10. Chandy K, Misra J. Distributed simulation: A case study in design and verification of distributed programs. IEEE Transactions on Software Engineering, 1979, 5(5): 440-452.

Co-citing documents: 708

Co-cited documents: 27

Citing documents: 5

Second-level citing documents: 4
