
Performance Optimization of a Molecular Dynamics Code Based on Computational Caching (cited by: 3)
Abstract  Molecular dynamics simulation codes often run inefficiently on modern high-performance computers, reaching only a few percent of the system's peak performance. This paper optimizes the parallel molecular dynamics program PMD3D on the Lenovo Shenteng 6800 supercomputer. Performance analysis shows that chains of mutually dependent floating-point operations in the particle interaction force calculation severely limit the processor's instruction-level parallelism. To address this, we apply a computational caching method: the many irregular floating-point computations are buffered and, once the buffer reaches a certain size, evaluated together in vectorized form. After optimization, single-processor performance on the Itanium 2 improves by more than 4 times, reaching 32.3% of the processor's peak performance of 5.2 GFlops. Finally, a parallel test on 256 CPUs in 64 nodes of the Shenteng 6800 reaches 27% of the peak performance of 1.3 TFlops.
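The buffering idea described in the abstract can be sketched as follows. This is a minimal illustration, not the PMD3D implementation: the Lennard-Jones potential, the all-pairs neighbor search, the buffer size `BATCH`, and the names `PairCache`, `flush_pairs`, `lj_forces_direct`, and `lj_forces_cached` are all assumptions made for the example. The direct version evaluates each force inside the irregular pair loop, where the cutoff branch and the long-latency divide sit in the same iteration. The cached version only records qualifying pairs, then evaluates them in a short, branch-free loop over contiguous arrays that a compiler can pipeline or vectorize.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH 64 /* hypothetical buffer size; a real code would tune this */

/* Buffer of pending pair interactions (hypothetical layout). */
typedef struct {
    int    i[BATCH], j[BATCH];              /* particle indices of each pair */
    double dx[BATCH], dy[BATCH], dz[BATCH]; /* separation vector x_i - x_j   */
    double r2[BATCH];                       /* squared distance              */
    int    count;                           /* number of buffered pairs      */
} PairCache;

/* Evaluate all buffered pairs, then scatter the results into f. */
static void flush_pairs(PairCache *c, double *f, double eps, double sigma)
{
    double s[BATCH];
    double sigma2 = sigma * sigma;
    assert(c->count <= BATCH);

    /* Phase 1: independent iterations over contiguous data, no branches. */
    for (int k = 0; k < c->count; ++k) {
        double inv_r2 = 1.0 / c->r2[k];
        double sr2 = sigma2 * inv_r2;
        double sr6 = sr2 * sr2 * sr2;
        s[k] = 24.0 * eps * sr6 * (2.0 * sr6 - 1.0) * inv_r2;
    }
    /* Phase 2: irregular scatter, kept separate from the arithmetic. */
    for (int k = 0; k < c->count; ++k) {
        int i = c->i[k], j = c->j[k];
        f[3*i+0] += s[k] * c->dx[k];  f[3*j+0] -= s[k] * c->dx[k];
        f[3*i+1] += s[k] * c->dy[k];  f[3*j+1] -= s[k] * c->dy[k];
        f[3*i+2] += s[k] * c->dz[k];  f[3*j+2] -= s[k] * c->dz[k];
    }
    c->count = 0;
}

/* Cached version: the pair loop only filters and records work. */
void lj_forces_cached(const double *x, double *f, int n,
                      double eps, double sigma, double rcut)
{
    PairCache c;
    double rc2 = rcut * rcut;
    c.count = 0;
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            double dx = x[3*i+0] - x[3*j+0];
            double dy = x[3*i+1] - x[3*j+1];
            double dz = x[3*i+2] - x[3*j+2];
            double r2 = dx*dx + dy*dy + dz*dz;
            if (r2 < rc2) {
                int k = c.count++;
                c.i[k] = i;   c.j[k] = j;
                c.dx[k] = dx; c.dy[k] = dy; c.dz[k] = dz;
                c.r2[k] = r2;
                if (c.count == BATCH)
                    flush_pairs(&c, f, eps, sigma);
            }
        }
    }
    flush_pairs(&c, f, eps, sigma); /* evaluate the remaining pairs */
}

/* Reference version: force evaluated immediately inside the pair loop. */
void lj_forces_direct(const double *x, double *f, int n,
                      double eps, double sigma, double rcut)
{
    double rc2 = rcut * rcut, sigma2 = sigma * sigma;
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            double dx = x[3*i+0] - x[3*j+0];
            double dy = x[3*i+1] - x[3*j+1];
            double dz = x[3*i+2] - x[3*j+2];
            double r2 = dx*dx + dy*dy + dz*dz;
            if (r2 < rc2) {
                double inv_r2 = 1.0 / r2;
                double sr2 = sigma2 * inv_r2;
                double sr6 = sr2 * sr2 * sr2;
                double s = 24.0 * eps * sr6 * (2.0 * sr6 - 1.0) * inv_r2;
                f[3*i+0] += s * dx;  f[3*j+0] -= s * dx;
                f[3*i+1] += s * dy;  f[3*j+1] -= s * dy;
                f[3*i+2] += s * dz;  f[3*j+2] -= s * dz;
            }
        }
    }
}
```

Both functions accumulate into `f`, so the caller is expected to zero the force array first. Whether the cached form is actually faster depends on the compiler, the processor, and the buffer size; the speedup reported in the abstract applies to the authors' code on the Itanium 2, not to this sketch.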
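Source  Computer Engineering & Science (《计算机工程与科学》), indexed in CSCD and the Peking University Core Journals list, 2009, No. 11, pp. 77-79, 83 (4 pages)
Funding  National Natural Science Foundation of China (grants 60873005, 60603052)
Keywords  molecular dynamics; performance optimization; computational caching; instruction-level parallelism (ILP)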

