Abstract
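The execution of parallel programs on cluster systems is nondeterministic, and this nondeterminism makes parallel programs difficult to debug. The nondeterminism is caused by various interfering factors in the runtime environment. This paper studies how interactive debugging perturbs the program being debugged.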
This paper studies the perturbation that a debugger imposes on a parallel program while the user debugs it in interactive mode. The problem is difficult, and its solution is helpful for the design and implementation of a practical debugger for cluster systems. First, several techniques used to reduce perturbation are briefly discussed. Then a message-passing model of parallel programs in cluster systems is presented. The model differs from others in that it introduces Dmax and Dmin, the maximum and minimum message latency in a cluster system, respectively. To describe the execution behavior of parallel programs accurately, the paper defines the terms state-freezing and equivalent execution and analyzes in detail the conditions under which a debugger perturbs a parallel program. The authors give these conditions and prove them formally. Based on these results, two algorithms are designed that inform the user in real time of the perturbation the debugger has imposed on the debugged program. A debugging tool for cluster systems, DENNET, has been developed. The algorithms have been integrated into DENNET, and the corresponding debugging mode is named pure mode. When debugging a parallel program, users can choose whether to use pure mode. Acknowledge time and latency are the two key parameters of the algorithms. Finally, test results for these two parameters are given.
Source
《计算机学报》
EI
CSCD
PKU Core
2002, Issue 2, pp. 122-129 (8 pages)
Chinese Journal of Computers
Funding
Supported by the National Natural Science Foundation of China (69933020)