期刊文献+

迭代空间交错条块并行Gauss-Seidel算法 被引量:5

Iterative Space Alternate Tiling Parallel Gauss-Seidel Algorithm
下载PDF
导出
摘要 针对并行GS(Gauss-Seidel)迭代算法中数据局部性差、同步和通信开销大的问题,首先改进传统GS迭代,提出了多层对称GS迭代算法.然后给出了以迭代空间条块序作为执行序的串行执行模型.该模型通过对迭代空间进行"时滞"划分,对迭代空间条块内部多次迭代计算,提高算法的数据局部性.最后提出一种基于迭代空间条块的并行执行模型.该模型改进了迭代空间网格划分,并通过网格条块重排序减少了cache缺失率、通信启动和同步次数.实验结果表明,迭代空间交错条块并行算法比传统的区域分解方法和红黑排序并行算法具有更好的并行效率和可扩展性. In order to optimize data locality, communication and synchronization overhead, this paper proposes a multi-layers symmetric Gauss-Seidel method. Then the serial execution model of this iterative method is given, which introduces the sequence of iterative space tile as the sequence of execution, and divides iteration space by time skewing. In this model, nodes of the tile can be updated many times to improve data locality. The parallel GS execution model based on iteration space tiling is presented, which uses an improved iteration space partition algorithm and reorders the tiles of iteration space to reduce cache misses, communication and synchronization cost. Finally the numerical results are presented to confirm the effectiveness of Gauss-Seidel parallelized with alternate tiling method, specifically compared with owner-computing and red-black Gauss-Seidel methods, and show that the new parallel iterative method has better parallel efficiency as well as scalability.
出处 《软件学报》 EI CSCD 北大核心 2008年第6期1274-1282,共9页 Journal of Software
基金 Supported by the National Natural Science Foundation of China under Grant No.60373008(国家自然科学基金) the National High-Tech Research and Development Plan of China under Grant No.2006AA01Z105(国家高技术研究发展计划(863)) the Key Project of the Ministry of Education of China under Grant No.106019(国家教育部科学技术研究重点项目)
关键词 Gauss-Seidel算法 交错网格条块 数据局部性 通信优化 Gauss-Seidel algorithm alternate tiling data locality communication optimization
  • 相关文献

参考文献15

  • 1Saad Y. Iterative methods for sparse linear systems. 2nd ed., Philadelphia: SIAM, 2003.
  • 2Strout MM, Carter L, Ferrante J, Kreaseck B. Sparse tiling for stationary iterative methods, Int'l Journal of High Performance Computing Applications, 2004,18(1):95-114.
  • 3Zhang C, Lan H, Ye Y, Estrade BD. Parallel SOR iterative algorithms and performance evaluation on a linux cluster. In: Proc. of the Int'l Conf. on Parallel and Distributed Processing Techniques and Applications. Las Vegas: CSREA Press, 2005. 1042-1048,
  • 4Xie D. A new block parallel SOR method and its analysis. SIAM Journal on Scientific Computing, 2006,27:1513-1533.
  • 5Xie D. New block parallel SOR methods by multi-type partitions. In: Proc. of Int'l Conf. on Parallel Processing Workshop. Montreal: IEEE Computer Society Press, 2004. 165-172.
  • 6Tavakoli R, Davami P. New stable group explicit finite difference method for solution of diffusion equation. Applied Mathematics and Computation, 2006,181 (2): 1379-1836.
  • 7Tavakoli R, Davami P. A new parallel Gauss-Seidel method based on alternating group explicit method and domain decomposition method. Applied Mathematics and Computation, 2007,188(1):713-719.
  • 8Wallin D, Lof H, Hagersten E, Holmgren S. Multigrid and gauss-seidel smoothers revisited: parallelization on chip multiprocessors. In: Proc. of the 20th Annual Int'l Conf. on Supercomputing. Cairns: ACM Press, 2006. 145-155.
  • 9McCalpin S, Wonnacott D. Time skewing: A value-based approach to optimizing for memory locality. Technical Report, DCS-TR-379, Department of Computer Science, Rugers University, 1999.
  • 10Wonnacott D. Time skewing for parallel computers. In: Proc. of the 12th Int'l Workshop on Languages and Compilers for Parallel Computing. London: Springer-Verlag, 1999. 477-480.

同被引文献26

引证文献5

二级引证文献3

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部