
Programming Framework for Node Heterogeneous GPU Cluster (cited by: 3)
Abstract: The mainstream programming method for heterogeneous GPU clusters is hybrid MPI/CUDA or a simple variant of it. Because this approach does not hide the underlying cluster architecture, programmers who write MPI/CUDA applications for a GPU cluster must account for the hardware computing resources by hand, which makes programs complex and poorly portable. To address this, the paper designs and implements the Distributed Parallel Programming Framework (DISPAR), a new programming framework for node-level heterogeneous GPU clusters based on the data flow model. DISPAR contains two subsystems: (1) StreamCC, a code conversion system that automatically converts DISPAR source code into MPI+CUDA code; and (2) StreamMAP, a task assignment system, implemented as a runtime system that automatically discovers heterogeneous computing resources and maps tasks to appropriate computing units. Experimental results show that the framework simplifies writing GPU cluster applications and makes efficient use of the computing resources of a heterogeneous GPU cluster. Because the code does not depend on the hardware platform, it is also more portable and scalable.
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list, 2015, No. 2, pp. 292-297 (6 pages)
Funding: Fund of the State Key Laboratory of ASIC and System, Fudan University; Huawei Innovation Research Program
Keywords: GPU cluster; heterogeneous; Distributed Parallel Programming Framework (DISPAR); code conversion; task assignment; portability
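
The abstract describes a runtime that discovers heterogeneous computing resources and maps tasks onto them automatically, but it gives no algorithm. The following is only an illustrative sketch of that general idea, not StreamMAP's actual method: it assumes a greedy earliest-finish-time heuristic, and the `Device` class, the `map_tasks` function, the device names, the throughput values, and the task costs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """A compute unit with a hypothetical relative speed (work units per second)."""
    name: str
    throughput: float
    busy_until: float = 0.0
    tasks: list = field(default_factory=list)


def map_tasks(task_costs, devices):
    """Greedy mapping: take tasks largest first and place each one on the
    device that would finish it earliest, given the work already assigned."""
    ordered = sorted(task_costs.items(), key=lambda kv: kv[1], reverse=True)
    for task, cost in ordered:
        best = min(devices, key=lambda d: d.busy_until + cost / d.throughput)
        best.busy_until += cost / best.throughput
        best.tasks.append(task)
    return {d.name: d.tasks for d in devices}


# Hypothetical cluster: one fast GPU, one slower GPU, and one CPU.
devices = [
    Device("node0-gpu", 4.0),
    Device("node1-gpu", 2.0),
    Device("node1-cpu", 1.0),
]
# Hypothetical dataflow tasks with estimated costs in work units.
task_costs = {"filter": 8.0, "fft": 4.0, "reduce": 2.0, "scale": 1.0}

assignment = map_tasks(task_costs, devices)
print(assignment)
```

In this sketch the application code names only tasks and costs, while the hardware description lives in a separate device list; that separation is the property the abstract credits for portability.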


