Highly efficient parallel implementation of the BH algorithm on heterogeneous platforms
Cited by: 1
Abstract: This paper studies the architectural characteristics of heterogeneous platforms that combine multi-core CPUs with many-core accelerators or coprocessors, and parallelizes the Barnes-Hut (BH) algorithm for the N-body problem using a hybrid MPI and OpenMP programming model. Orthogonal recursive bisection (ORB) is used to balance the load across processes, and the program is then optimized for parallel execution and accelerated on MIC. Test results show that, after optimization and acceleration, the program runs more than 3.4 times as fast as the original version; MIC acceleration by itself gives a 1.7x speedup over running on the multi-core CPUs only. The program also scales well: with a particle count on the order of 100 million, it scales to a 32-node cluster with 4,480 cores in total (640 CPU cores and 3,840 MIC cores).
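The load-balancing step named in the abstract, orthogonal recursive bisection, can be illustrated with a minimal serial sketch. It is not the paper's implementation: the function name `orb_partition`, the per-particle `weights` (a stand-in for estimated work, such as interaction counts), and the choice to cut along the axis of largest extent are all assumptions made for illustration, and no MPI communication is shown.

```python
import random
from typing import List, Sequence, Tuple

Position = Tuple[float, float, float]


def orb_partition(indices: Sequence[int],
                  positions: Sequence[Position],
                  weights: Sequence[float],
                  n_parts: int) -> List[List[int]]:
    """Split particles into n_parts spatial groups of roughly equal weight.

    Hypothetical helper: each recursion cuts the current group with a plane
    orthogonal to one coordinate axis, placing the cut so that the weight on
    each side is proportional to the number of parts assigned to that side.
    """
    indices = list(indices)
    if n_parts == 1:
        return [indices]
    if not indices:
        return [[] for _ in range(n_parts)]

    # Assumption: cut along the axis on which the group is most spread out.
    def extent(axis: int) -> float:
        coords = [positions[i][axis] for i in indices]
        return max(coords) - min(coords)

    axis = max(range(3), key=extent)
    ordered = sorted(indices, key=lambda i: positions[i][axis])

    # Parts are divided as evenly as possible between the two sides,
    # which also handles part counts that are not powers of two.
    left_parts = n_parts // 2
    right_parts = n_parts - left_parts
    target = sum(weights[i] for i in ordered) * left_parts / n_parts

    # Advance the cut while the left side stays within its weight target.
    acc = 0.0
    cut = 0
    while cut < len(ordered) and acc + weights[ordered[cut]] <= target:
        acc += weights[ordered[cut]]
        cut += 1

    return (orb_partition(ordered[:cut], positions, weights, left_parts)
            + orb_partition(ordered[cut:], positions, weights, right_parts))


# Usage: 1000 random particles with unit weight, split across 4 "processes".
random.seed(0)
positions = [(random.random(), random.random(), random.random())
             for _ in range(1000)]
weights = [1.0] * 1000
parts = orb_partition(range(1000), positions, weights, 4)
print([len(p) for p in parts])  # prints [250, 250, 250, 250]
```

With unit weights the cut falls at the median, so every group receives the same number of particles. With non-uniform weights the groups differ in size but carry similar total weight, which is the property a distributed BH code needs when the cost per particle varies with local density.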
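Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal (北大核心), 2016, No. 8, pp. 2255-2259 (5 pages)
Funding: Young Scientists Fund of the National Natural Science Foundation of China (11301506)
Keywords: N-body problem; BH algorithm; heterogeneous platforms; parallel computing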

共引文献3

同被引文献3

引证文献1

二级引证文献3

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部