
Acceleration Techniques for the Parallel Method of Moments on Heterogeneous Platforms (cited by: 1)

Acceleration for the Parallel MoM Using GPU
Abstract: This paper studies acceleration techniques for the parallel method of moments (MoM) on CPU/GPU heterogeneous clusters. An MPI/CUDA software architecture is designed that solves the problem of running parallel LU factorization across nodes on such clusters. The architecture is based on a two-dimensional block-cyclic distribution of the matrix: MPI handles communication between compute nodes, and the GPU accelerates the matrix update step. To overcome the limit that GPU memory places on the size of the matrix that can be factorized, a "GPU memory to host memory" out-of-core algorithm is further studied. To optimize performance, a design based on CUDA streams and asynchronous communication is proposed; it overlaps GPU communication with computation, effectively hiding the GPU communication time and yielding a clear speedup.
Source: Journal of Microwaves (《微波学报》), indexed in CSCD and the PKU Core Journals list, 2014, Issue S1, pp. 51-54 (4 pages)
Keywords: method of moments (MoM); heterogeneous platform; GPU acceleration; parallel; out-of-core; communication hiding
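The two-dimensional block-cyclic distribution mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the block size, the process grid shape, or the indexing convention, so the code below assumes the common ScaLAPACK-style mapping, in which block (bi, bj) belongs to process (bi mod P_r, bj mod P_c). All function and parameter names are hypothetical and are not taken from the paper.

```python
from collections import Counter


def block_owner(i, j, nb, p_rows, p_cols):
    """Return the (process_row, process_col) that owns matrix entry (i, j).

    Assumes square nb x nb blocks dealt out cyclically over a
    p_rows x p_cols process grid (ScaLAPACK-style convention).
    """
    return (i // nb) % p_rows, (j // nb) % p_cols


def local_block_index(bi, bj, p_rows, p_cols):
    """Return where global block (bi, bj) sits in its owner's local storage."""
    return bi // p_rows, bj // p_cols


def trailing_blocks_per_process(n_blocks, k, p_rows, p_cols):
    """Count, per process, the blocks left in the trailing submatrix.

    After LU step k only blocks with bi > k and bj > k still need the
    matrix update. A cyclic layout keeps this remaining work spread over
    the whole grid instead of concentrating it on a few processes.
    """
    counts = Counter()
    for bi in range(k + 1, n_blocks):
        for bj in range(k + 1, n_blocks):
            counts[(bi % p_rows, bj % p_cols)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical setup: 6 x 6 blocks on a 2 x 3 process grid.
    for proc, n in sorted(trailing_blocks_per_process(6, 1, 2, 3).items()):
        print(proc, n)
```

With a plain contiguous block layout, processes holding the top-left part of the matrix would run out of work after the first few elimination steps. The cyclic mapping avoids this, which is the usual reason it is chosen for distributed LU factorization.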
