Fault-Tolerant Mechanism of the Distributed Cluster Computers (cited by: 1)
Authors: 尚毅梓 (Shang Yizi), 靳洋 (Jin Yang), 吴保生 (Wu Baosheng). Tsinghua Science and Technology (indexed in SCIE, EI, CAS), 2007, Issue S1, pp. 186-191 (6 pages)
Abstract: Distributed systems with high performance and stability are commonly adopted in large-scale scientific and engineering computing. This paper discusses a fault-tolerant mechanism for the Linux environment that improves the fault tolerance of the system, namely a scheme and framework for building a stable computing platform. Based on the structure and function of the distributed system, active-list and file-invocation strategies are employed in task management. Multilevel fault tolerance is achieved by repeated processes on a single node and by task migration across multiple nodes. The manager node agent introduced in the paper administers the nodes using the list and dispatches tasks according to each node's performance, and is thereby able to make full use of the cluster resources. An evaluation method is proposed to appraise the performance. The analysis shows that the proposed scheme is useful, at the cost of some additional memory overhead.
Keywords: distributed system; active list; file invocation; multilevel fault-tolerance
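The abstract names three ideas: an active list kept by a manager node, repeated execution on a single node, and task migration across nodes. The record gives no implementation details, so the following is only a minimal sketch of how those ideas might fit together. Every name in it (`Manager`, `heartbeat`, `prune`, `run`, the `performance` score, the timeout and retry counts) is a hypothetical choice for illustration and is not taken from the paper.

```python
import time
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    performance: float        # hypothetical score; higher means faster
    last_heartbeat: float = 0.0


class Manager:
    """Toy manager-node agent that keeps an active list of worker nodes."""

    def __init__(self, timeout=5.0, local_retries=2):
        self.active = {}                  # active list: name -> Node
        self.timeout = timeout            # seconds of silence before removal
        self.local_retries = local_retries

    def heartbeat(self, name, performance, now=None):
        """Register a node or refresh its entry in the active list."""
        now = time.monotonic() if now is None else now
        self.active[name] = Node(name, performance, now)

    def prune(self, now=None):
        """Remove nodes whose heartbeat is stale; return their names."""
        now = time.monotonic() if now is None else now
        stale = [n.name for n in self.active.values()
                 if now - n.last_heartbeat > self.timeout]
        for name in stale:
            del self.active[name]
        return stale

    def run(self, task, execute):
        """Level 1: repeat the task on the same node.
        Level 2: migrate it to the next-best active node."""
        tried = set()
        while True:
            candidates = [n for n in self.active.values()
                          if n.name not in tried]
            if not candidates:
                raise RuntimeError("no active node could complete the task")
            node = max(candidates, key=lambda n: n.performance)
            for _ in range(1 + self.local_retries):
                try:
                    return node.name, execute(node, task)
                except Exception:
                    continue              # repeat on the same node
            tried.add(node.name)          # give up here and migrate
```

In this sketch `execute` is a caller-supplied function that runs a task on a node and raises on failure. The manager always prefers the highest-scoring node, which is one simple reading of assigning tasks "according to the nodes' performance"; a real scheduler would presumably also account for current load.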