Funding: the National Natural Science Foundation of China (No. 60503015).
Abstract: Most fault detection algorithms for distributed systems restrict the fault model to process faults; link failures are either simply masked or modeled as process failures. Both approaches can quickly exhaust system resources and potentially reduce system availability. A fault Detection Protocol based on Heartbeat of multiple Master-nodes (DPHM) is proposed, which detects and locates faulty links promptly and accurately by means of a voting and election mechanism among the master-nodes. DPHM can thus effectively improve system availability. In addition, compared with other detection protocols, DPHM greatly reduces detection cost owing to its master-node structure.
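The abstract does not spell out the voting rule, so the following is only an illustrative sketch of how heartbeat reports from several master-nodes might be combined to tell a faulty link from a faulty node. It is not the DPHM algorithm itself. The function name `classify_faults`, the report format, and the rule used (a node missed by every master is suspected as a process fault; a node missed by only some masters points to the links from those masters) are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def classify_faults(
    reports: Dict[str, Set[str]], nodes: Set[str]
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Combine per-master heartbeat reports (hypothetical rule, not from the paper).

    reports maps each master-node id to the set of nodes whose heartbeat
    that master received during the last period.
    Returns (suspected_nodes, suspected_links), both sorted.
    """
    masters = sorted(reports)
    suspected_nodes: List[str] = []
    suspected_links: List[Tuple[str, str]] = []

    for node in sorted(nodes):
        # Masters that did NOT hear this node's heartbeat.
        missing = [m for m in masters if node not in reports[m]]
        if not missing:
            continue  # every master heard the node: nothing to report
        if len(missing) == len(masters):
            # Assumed rule: unanimous silence suggests the process itself failed.
            suspected_nodes.append(node)
        else:
            # Assumed rule: partial silence suggests only some links failed.
            suspected_links.extend((m, node) for m in missing)

    return suspected_nodes, suspected_links


# Example: three masters monitor nodes a, b, c.
example_reports = {"m1": {"a", "b"}, "m2": {"a"}, "m3": {"a", "b"}}
example_nodes = {"a", "b", "c"}
bad_nodes, bad_links = classify_faults(example_reports, example_nodes)
print(bad_nodes, bad_links)
```

In this example, node c is silent toward all three masters and is suspected as a process fault, while node b is missed only by m2, so only the link between m2 and b is suspected. Separating the two cases is what lets a link failure be located rather than masked or charged to the process.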