

Optimization of FMM’s short range calculation with multi-GPU architecture
Abstract: In recent years, the hybrid GPU+CPU architecture has become the main architecture of many high-performance computers and has been widely adopted. Given the particular nature of this hybrid architecture, the paper analyzes the traditional Amdahl's law and extends it to hybrid architectures. The short-range calculation of the FMM algorithm on a multi-GPU+CPU hybrid architecture suffers from two main problems: load imbalance and communication latency. Guided by the hybrid-architecture form of Amdahl's law, the paper proposes a multi-GPU scheduling model and a two-level pipelining model. The scheduling model balances the workload across the GPUs and mitigates the effects of the non-uniformity of the short-range calculation. The two-level pipelining model lets the CPU and the GPUs work in parallel, overlapping computation with memory access to hide memory-access latency and improve the utilization of the compute units. Experimental validation and data comparisons show that these optimizations are feasible and further accelerate the execution of the algorithm.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2013, No. 8, pp. 37-42, 91 (7 pages)
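Funding: National Natural Science Foundation of China (No. 61001163); Scientific Research Innovation Project of the Shanghai Municipal Education Commission (No. 09YZ09)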
Keywords: hybrid architecture; GPU; Fast Multipole Method (FMM); PetFMM; pipelining










