
Training Framework of Multi-GPU Deep Neural Network Based on Virtualization
(基于虚拟化的多GPU深度神经网络训练框架)

Cited by: 10
Abstract: To address the problem of accelerating deep neural network training on distributed multi-machine, multi-GPU systems, this paper proposes an implementation method for remote multi-GPU calls based on virtualization. A distributed GPU cluster deployed through remote GPU calls improves on the traditional one-to-one virtualization technique, and the location at which parameters are exchanged during distributed multi-GPU training of the deep neural network is changed accordingly, making the two compatible. The method exploits remote GPU resources in a distributed environment to accelerate deep neural network training, and it unifies the CUDA programming model for single-machine multi-GPU and multi-machine multi-GPU training. Taking handwritten digit recognition as an example, experiments on data-parallel multi-machine, multi-GPU training of a deep neural network over a general-purpose network environment verify the effectiveness and feasibility of the method.
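The record does not include the framework's code, and the sketch below is not the authors' implementation: their framework builds data parallelism on virtualized remote GPU calls under the CUDA programming model. As a rough, hypothetical illustration of the data-parallel training pattern the abstract describes (every GPU holds a model replica, trains on its own shard of the handwritten-digit data, and gradients/parameters are synchronized each step), here is a minimal PyTorch DistributedDataParallel sketch; the model, data path, and hyperparameters are invented for illustration.

```python
# Illustrative sketch only: generic data-parallel training with
# torch.distributed + DistributedDataParallel, NOT the paper's
# virtualization-based remote-GPU framework. Uses torchvision's MNIST
# as a stand-in for the handwritten-digit task.
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler
from torchvision import datasets, transforms


class DigitNet(nn.Module):
    """Small CNN for 28x28 handwritten-digit images (hypothetical model)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))


def main():
    # One process per GPU; processes may live on different machines.
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Each process holds a full model replica; DDP keeps replicas in sync.
    model = DDP(DigitNet().cuda(local_rank), device_ids=[local_rank])
    opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()

    if local_rank == 0:                      # one download per node
        datasets.MNIST("data", train=True, download=True)
    dist.barrier()                           # wait until the data is present
    ds = datasets.MNIST("data", train=True, transform=transforms.ToTensor())
    sampler = DistributedSampler(ds)         # each rank gets a distinct shard
    loader = DataLoader(ds, batch_size=64, sampler=sampler)

    for epoch in range(2):
        sampler.set_epoch(epoch)             # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()   # gradients are all-reduced across all GPUs here
            opt.step()        # every replica applies the same averaged update
        if rank == 0:
            print(f"epoch {epoch}: loss {loss.item():.4f}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Such a script could be launched across two machines with, for example, `torchrun --nnodes=2 --nproc_per_node=2 --rdzv_backend=c10d --rdzv_endpoint=<host>:<port> train_digits.py`. The all-reduce performed inside `loss.backward()` plays the role of the parameter-exchange step discussed in the abstract, although the paper's contribution is to relocate that exchange so it remains compatible with remote, virtualized GPUs.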
Source: Computer Engineering (《计算机工程》; indexed in CAS, CSCD, Peking University Core Journal list), 2018, Issue 2, pp. 68-74, 83 (8 pages).
Funding: National Key Research and Development Program of China project "Runtime System for Heterogeneous Fusion Dataflow Accelerators" (No. 2016YFB1000403).
Keywords: virtualization; deep neural network; distributed; multi-machine and multi-GPU; data parallelism; handwritten digit recognition
