Abstract
Ensuring the correctness of parallel application computations and improving the utilization of dynamic resources are important research problems in distributed computing systems. Building on an existing ProActive-based parallel computing platform, this work designs and implements a fault-tolerant task scheduler that adds a "breathing" (heartbeat-style) communication mechanism, a failed-node discovery mechanism, and a subtask rescheduling mechanism. Experiments show that, even when some nodes fail, the scheduler preserves the correctness of the parallel computation and delivers good performance.
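The three mechanisms named in the abstract (periodic liveness messages, failed-node discovery, and subtask rescheduling) can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the ProActive API. The class `FaultTolerantScheduler`, its method names, the timeout value, and the least-loaded dispatch policy are all hypothetical, and it assumes that the "breathe" mechanism means nodes report periodically and a silent node is treated as failed.

```python
from collections import deque


class FaultTolerantScheduler:
    """Toy scheduler (hypothetical): liveness tracking, failure detection, rescheduling."""

    def __init__(self, nodes, timeout):
        self.timeout = timeout                      # seconds of silence tolerated (assumed policy)
        self.last_seen = {n: 0.0 for n in nodes}    # node -> time of last liveness message
        self.assigned = {n: [] for n in nodes}      # node -> unfinished subtasks
        self.pending = deque()                      # subtasks waiting for a node

    def heartbeat(self, node, now):
        """Record a liveness message from a node that is still registered."""
        if node in self.last_seen:
            self.last_seen[node] = now

    def submit(self, subtask):
        self.pending.append(subtask)

    def dispatch(self):
        """Give each pending subtask to the live node holding the fewest subtasks."""
        while self.pending and self.assigned:
            node = min(self.assigned, key=lambda n: len(self.assigned[n]))
            self.assigned[node].append(self.pending.popleft())

    def complete(self, node, subtask):
        """Mark a subtask as finished so it is never rescheduled."""
        if node in self.assigned and subtask in self.assigned[node]:
            self.assigned[node].remove(subtask)

    def detect_failures(self, now):
        """Drop nodes silent for longer than the timeout and requeue their subtasks."""
        failed = [n for n, t in self.last_seen.items() if now - t > self.timeout]
        for n in failed:
            self.pending.extend(self.assigned.pop(n))
            del self.last_seen[n]
        return failed


# Demo: node "b" stops reporting, so its unfinished subtasks move to node "a".
sched = FaultTolerantScheduler(["a", "b"], timeout=3.0)
for task in ["t1", "t2", "t3", "t4"]:
    sched.submit(task)
sched.dispatch()
sched.heartbeat("a", now=5.0)
failed = sched.detect_failures(now=5.0)
sched.dispatch()
```

In this sketch a subtask stays on a node's list until `complete` is called. A failure therefore requeues only unfinished work, which is one simple way to keep the overall result correct when a node disappears partway through a run.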
Source
《计算机应用》
CSCD
Peking University Core Journal (北大核心)
2008, Issue 2, pp. 371-373 (3 pages)
Journal of Computer Applications
Funding
Scientific Research Project of the Guangxi Education Department (No. [2006]26)
Guangxi University Doctoral Start-up Fund Project