
THE RESEARCH AND IMPLEMENTATION OF ENABLING CHECKPOINT IN VISUAL PARALLEL DEBUGGING TOOLS
Abstract: Compared with serial programs, debugging parallel programs raises new problems. First, parallel programs often need to run for a long time, which makes debugging them especially time-consuming. Second, an error that appears in one debugging run does not necessarily appear in the next, which makes errors very hard to track. To address these two problems, this paper designs and implements a middleware system that enables the BLCR checkpoint/restart system in the parallel debugging tool XMPI. With this middleware, debugging large MPI parallel programs in XMPI requires less program run time during the debugging phase, errors in parallel programs can be tracked more easily, and parallel program development becomes more efficient.
Source: Computer Applications and Software (《计算机应用与软件》), indexed in CSCD and the Peking University Core Journals list, 2005, No. 10, pp. 17-18, 20 (3 pages in total)
Funding: National Natural Science Foundation of China (No. 604030347); Science and Technology Commission of Shanghai Municipality (Nos. 03dz15026, 03dz15027, 035115029)
Keywords: BLCR; LAM-MPI; XMPI; enabling middleware (enabling component); checkpoint; restart; parallel program; debugging tool; visualization; program run time; middleware system

References (6)

  • 1. Berkeley Lab Checkpoint/Restart (BLCR) page. http://ftg.lbl.gov/twiki/bin/view/P-TG/CheckpointRestart, 2003-11-14.
  • 2. LAM-MPI home page. http://www.lam-mpi.org/, 2004-9-26.
  • 3. XMPI page. http://www.lam-mpi.org/software/xmpi/, 2004-3-14.
  • 4. S. Sankaran, J.M. Squyres, B. Barrett, A. Lumsdaine. The LAM/MPI checkpoint/restart framework: system-initiated checkpointing. In: LACSI Symposium, 2003, 10: 4-9.
  • 5. J.M. Squyres, B. Barrett, A. Lumsdaine. The system services interface (SSI) to LAM/MPI, SSI version 1.0.0. Technical Report TR575, Indiana University, Computer Science Department, 2003, 8(4): 3-6.
  • 6. S. Sankaran, J.M. Squyres, B. Barrett, A. Lumsdaine. Checkpoint/restart system services interface (SSI) modules for LAM/MPI, API version 1.0.0 / SSI version 1.0.0. Technical Report TR578, Indiana University, Computer Science Department, 2003, 8(4): 5-9.
