Abstract
This paper considers distributed systems built on a bus-type local area network in which a single node fails. Checkpoints are set according to the communication relations between processes, a corresponding process communication relation table is built, and a new system recovery algorithm is derived from it. The main idea is as follows: when a node fails, the processes on that node are treated as bad processes, so the rollback point is determined from the communication relation tables of the related processes on the other nodes, which restores the process communication relation tables to a consistent state. In this algorithm, the total number of communications voided during rollback is at most U - uq1 + 1, and the worst-case complexity of the algorithm is O(m^2).
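The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's method. It assumes that communications are numbered 1..U in a single global order, that each table entry records a sender and a receiver, and that the bad process restarts from a checkpoint taken just before communication number `restart_seq` (one possible reading of uq1). The names `rollback`, `comm_table`, and `restart_seq` are hypothetical.

```python
from typing import List, Set, Tuple

# One table entry: (sequence number, sender, receiver). Hypothetical layout.
Comm = Tuple[int, str, str]


def rollback(comm_table: List[Comm], bad: Set[str],
             restart_seq: int) -> Tuple[List[Comm], List[Comm]]:
    """Split the table into (kept, voided) entries.

    Assumption: every communication numbered below restart_seq is safely
    covered by checkpoints, so only later entries can be voided.
    """
    tainted = set(bad)  # processes that must roll back
    kept: List[Comm] = []
    voided: List[Comm] = []
    for entry in sorted(comm_table):
        seq, sender, receiver = entry
        if seq >= restart_seq and (sender in tainted or receiver in tainted):
            # Voiding a communication forces both endpoints to roll back,
            # which may in turn void their later communications.
            tainted.update((sender, receiver))
            voided.append(entry)
        else:
            kept.append(entry)
    return kept, voided


table = [(1, "p", "q"), (2, "q", "r"), (3, "r", "s"), (4, "p", "s"), (5, "a", "b")]
kept, voided = rollback(table, bad={"q"}, restart_seq=2)
```

Under these assumptions at most the entries numbered `restart_seq` through U can be voided, which gives the same form as the bound U - uq1 + 1 quoted in the abstract. The sketch makes no attempt to reproduce the O(m^2) complexity analysis.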
Source
《小型微型计算机系统》
CSCD
Peking University Core Journals
1994, No. 2, pp. 45-50 (6 pages)
Journal of Chinese Computer Systems
Keywords
Distributed system, Checkpoint, Communication relation table, Consistency, Recovery, Distributed computer