An Optimized Implementation of the N-Body Problem Based on a Maximal Synchronous Parallel Programming Model

Abstract: This paper studies parallel programming models for multi-core cluster systems and designs a hybrid OpenMP/MPI programming model for multi-level parallel architectures. Taking an SMP cluster as the target platform, the model separates inter-node parallelism from intra-node parallelism and is evaluated through experiments and analysis. The performance of the multi-level parallel programming model is examined in depth, and a new hybrid design approach, the maximal synchronous method, is proposed. A maximal synchronous optimized parallel algorithm for the N-body problem is designed, and its performance is compared with that of traditional parallel algorithms on the Dawning TC 5000A cluster. Combining theoretical study with statistical analysis of a large number of experiments, the paper draws a number of conclusions on optimizing the performance of hybrid parallel programming models on multi-core clusters.

Authors: Zhu Yongzhi (祝永志), Wang Xiyan (王喜燕)

Affiliation: School of Information Science and Engineering, Qufu Normal University

Source: Electronic Technology (《电子技术（上海）》), 2015, No. 2, pp. 28-32 (5 pages)

Funding: Natural Science Foundation of Shandong Province (ZR2013FL015); 山东省研究

Keywords: N-body; MPI; OpenMP; maximal synchronous algorithm; performance optimization

Classification: TP391.13 [Automation and Computer Technology: Computer Application Technology]
