
On Adaptability of the Runtime Environment for Emerging Multi-Core Programming Models (多核编程模型运行时环境的自适应性研究)

Cited by: 3
Abstract: Runtime environments for multi-core programming models tend to intensify contention for processor cores and scale poorly. To address this, the paper treats resource allocation, runtime control, and task execution as one integrated system and, following a dynamic feedback-control approach, proposes an adaptive and collaborative scheduling model, ACSM. ACSM combines centralized and distributed coordination to dynamically adjust how processor cores are allocated and managed, both among application workloads and within each workload. It preserves the programmability and portability of multi-core programming models, removes the need to specify the number of cores explicitly as traditional multi-core runtimes require, and makes core allocation more efficient and adaptive. Experimental results show that ACSM improves the usability of multi-core programming models while reducing harmful contention for processor cores and raising overall system performance and resource utilization. Compared with scheduling that relies only on the runtime environment of the multi-core programming model, ACSM shortens application run time by about 50%, and the gain grows as the number of applications increases.
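The abstract describes the mechanism only at a high level, so the following is a minimal illustrative sketch of feedback-driven core allocation in general, not ACSM's actual algorithm. Every name (`App`, `desire`, `feedback`, `allocate`) and every threshold is a hypothetical choice made for illustration. It assumes a centralized step that divides a fixed number of cores among applications according to the core counts they request, and a distributed step in which each application revises its own request from the utilization it observed.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    desire: int = 1  # cores the application currently requests (hypothetical field)

    def feedback(self, allotted: int, utilization: float) -> None:
        """Distributed step: the application adjusts its own request."""
        if utilization < 0.5:
            # Cores were mostly idle: ask for fewer next time (assumed threshold).
            self.desire = max(1, self.desire // 2)
        elif allotted >= self.desire:
            # The request was fully met and well used: ask for more.
            self.desire *= 2
        # Otherwise the request was not fully met but well used: keep it.


def allocate(apps, total_cores):
    """Centralized step: equal shares capped by each request, smallest first."""
    allot = {a.name: 0 for a in apps}
    remaining = total_cores
    pending = sorted(apps, key=lambda a: a.desire)
    while pending and remaining > 0:
        share = max(1, remaining // len(pending))
        app = pending.pop(0)
        give = min(app.desire, share, remaining)
        allot[app.name] = give
        remaining -= give
    return allot
```

No application states a fixed core count up front in this sketch. Each allocation round uses the latest requests, which is one plausible reading of how explicit core specification could be avoided.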
Source: Journal of Xi'an Jiaotong University (《西安交通大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2011, No. 6, pp. 130-134 (5 pages)
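Funding: National Natural Science Foundation of China (61073011); National High Technology Research and Development Program of China (2009AA01A135, 2009AA01A13); China-Italy International Cooperation Project (2009DFA12110)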
Keywords: multi-core; programming model; runtime system; collaborative scheduling


