Abstract
For "large tasks" that run for a long time and consume substantial computing resources, such as long-lived computations and long-term data storage, node reliability decays exponentially with execution time. To address this problem, a grid service reliability model that accounts for the failure recovery capability of nodes is presented. A node failure recovery mechanism is introduced into the reliability analysis of grid systems with a star topology, and the influence of node software reliability on grid service reliability is also taken into account. To further improve service reliability, a service is divided into subtasks that are processed in parallel on different resources, with subtask redundancy. Simulation results confirm that node failure recovery has a positive effect on grid service reliability, and the model offers an effective approach to the low reliability of services that consist of large tasks.
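The qualitative effects named in the abstract (exponential decay of node reliability, the benefit of recovery, and subtask redundancy) can be illustrated with a minimal sketch. This is not the model from the paper, whose equations are not given here. It assumes, purely for illustration, that node failures follow a Poisson process with a constant rate, that each failure is recovered independently with a fixed probability `recovery_prob`, that replicas and subtasks fail independently, and that the service succeeds only if every subtask has at least one successful replica. All function and parameter names are hypothetical.

```python
import math


def node_reliability(failure_rate, t, recovery_prob=0.0):
    """Probability that a node suffers no unrecovered failure in [0, t].

    Assumption (not from the paper): failures arrive as a Poisson process
    with rate `failure_rate`, and each one is recovered independently with
    probability `recovery_prob`. Unrecovered failures then form a Poisson
    process with rate failure_rate * (1 - recovery_prob).
    """
    return math.exp(-failure_rate * (1.0 - recovery_prob) * t)


def redundant_subtask(replica_reliabilities):
    """A subtask succeeds if at least one of its independent replicas does."""
    all_fail = 1.0
    for r in replica_reliabilities:
        all_fail *= 1.0 - r
    return 1.0 - all_fail


def service_reliability(subtasks):
    """The service succeeds only if every (independent) subtask succeeds.

    `subtasks` is a list of lists: one list of replica reliabilities
    per subtask.
    """
    result = 1.0
    for replicas in subtasks:
        result *= redundant_subtask(replicas)
    return result


if __name__ == "__main__":
    rate, duration = 0.01, 100.0  # hypothetical failures per hour, hours
    plain = node_reliability(rate, duration)
    recovering = node_reliability(rate, duration, recovery_prob=0.9)
    print("node, no recovery:  ", plain)
    print("node, with recovery:", recovering)
    # Two subtasks, each replicated on two nodes.
    print("service, no recovery:  ", service_reliability([[plain, plain]] * 2))
    print("service, with recovery:", service_reliability([[recovering, recovering]] * 2))
```

Under these assumptions, a long task on a single node without recovery has reliability exp(-rate * duration), which shrinks quickly as the duration grows. Both recovery and replication raise the service-level figure.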
Source
《西安交通大学学报》
EI
CAS
CSCD
Peking University Core Journal (北大核心)
2008, No. 6, pp. 693-697, 790 (6 pages)
Journal of Xi'an Jiaotong University
Funding
National Natural Science Foundation of China (59685003)
Foundation for the Author of National Excellent Doctoral Dissertation of China (200232)
Keywords
grid
service reliability
node recovery