
Research on a Parallel LOBPCG Algorithm for the Sunway TaihuLight Architecture (cited by: 1)

OPTIMIZE A PRECONDITIONED BLOCK ITERATIVE EIGENSOLVER ON SUNWAY MACHINE
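Abstract: LOBPCG is a numerical eigenvalue method suited to large-scale sparse symmetric problems. This paper studies a parallel LOBPCG algorithm adapted to the Sunway TaihuLight architecture. It first proposes a hybrid parallel model based on the master and slave cores. It then studies a parallel algorithm for the sparse matrix-vector product, which speeds up the program by hiding communication both between core groups and within a core group, and proposes an algorithm that automatically tunes the amount of data buffered on the slave cores so that it approaches the best achievable communication hiding. It also studies a parallel algorithm for dense matrix multiplication on the Sunway TaihuLight architecture, with different matrix partitioning algorithms for input matrices of different "shapes"; the resulting speed is significantly better than that of other libraries. In eigenvalue tests on matrices of order up to 125 million using 936,000 compute cores, the implementation shows good scalability. We also tested the performance of the application on strongly correlated systems in condensed matter physics.

English abstract: LOBPCG is a numerical method for solving sparse matrix eigenvalue problems. In this paper, optimization methods covering all the main computations of LOBPCG are discussed. A parallel model of data and computation for the Sunway machine is proposed, followed by an effective parallel algorithm for the sparse matrix-vector product, implemented with an automatically optimized data buffer strategy. A parallel algorithm for dense matrix multiplication adapted to the Sunway architecture is then presented, which yields a significant performance improvement. We test this implementation with up to 1 million cores on the Sunway machine.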
Authors: Yu Tianyu (于天禹); Zhao Yonghua (赵永华); Zhao Lian (赵莲) (Computer Network Information Center, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing 100190, China; Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China)
Source: Journal on Numerical Methods and Computer Applications (《数值计算与计算机应用》), 2019, No. 4, pp. 291-309 (19 pages)
Funding: National Key R&D Program of China, "Research on Collaborative Development Tools and Environments for High-Performance Computing Application Software" (2017YFB0202202); National Key R&D Program of China, High-Performance Computing Special Project (2016YFB0201302)
Keywords: LOBPCG; eigenvalue; eigensolver; Sunway TaihuLight; parallel algorithm
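
The sparse matrix-vector product named in the abstract is the kernel that LOBPCG applies in every iteration. As a rough illustration only, the sketch below computes y = A x for a matrix in CSR storage and splits the rows into contiguous blocks, one per worker. It is not the authors' Sunway implementation: the CSR layout, the row-block partition, and the names `dense_to_csr`, `spmv_rows`, and `spmv_blocked` are all assumptions made for this example, and the workers run one after another here rather than in parallel.

```python
from typing import List, Sequence, Tuple


def dense_to_csr(rows: Sequence[Sequence[float]]) -> Tuple[List[float], List[int], List[int]]:
    """Convert a dense row-major matrix to CSR arrays (values, col_idx, row_ptr)."""
    values: List[float] = []
    col_idx: List[int] = []
    row_ptr: List[int] = [0]
    for row in rows:
        for j, a in enumerate(row):
            if a != 0:
                values.append(a)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row's nonzeros
    return values, col_idx, row_ptr


def spmv_rows(values, col_idx, row_ptr, x, start: int, stop: int) -> List[float]:
    """Compute entries start..stop-1 of y = A x (the share of one hypothetical worker)."""
    out = []
    for i in range(start, stop):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        out.append(acc)
    return out


def spmv_blocked(values, col_idx, row_ptr, x, n_workers: int) -> List[float]:
    """Split the rows into contiguous blocks and concatenate the per-block results."""
    n = len(row_ptr) - 1
    chunk = -(-n // n_workers)  # ceiling division: rows per worker
    y: List[float] = []
    for w in range(n_workers):
        start = min(w * chunk, n)
        stop = min(start + chunk, n)
        y.extend(spmv_rows(values, col_idx, row_ptr, x, start, stop))
    return y


# Small symmetric test matrix: the 1-D Laplacian stencil (2 on the diagonal, -1 beside it).
A = [
    [2, -1, 0, 0],
    [-1, 2, -1, 0],
    [0, -1, 2, -1],
    [0, 0, -1, 2],
]
vals, cols, ptr = dense_to_csr(A)
y = spmv_blocked(vals, cols, ptr, [1.0, 2.0, 3.0, 4.0], n_workers=3)
print(y)  # [0.0, 0.0, 0.0, 5.0]
```

Each row block needs only its own slice of the CSR arrays plus the entries of x that its column indices touch, which is why a row partition is a common starting point when communication is to be overlapped with computation.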
