Abstract
To address the low reliability of cloud services, a task scheduling model that accounts for the failure recovery mechanism of nodes is proposed. The model uses the failure recovery mechanism to analyze the behavior of nodes and classifies interaction failures between nodes as either recoverable or unrecoverable. Drawing on the interpersonal trust model from sociology, it quantifies and evaluates the trustworthiness of nodes under the failure recovery mechanism to build a more realistic cloud service reliability model, and it lets resource nodes adjust both the limit on the number of recoveries and the failure recovery rate. Integrating node trustworthiness into the dynamic level scheduling (DLS) algorithm yields a failure-recovery-aware variant named FR-DLS. FR-DLS takes the trustworthiness of service resources into account when computing the scheduling level of each task-resource pair, so that application tasks are assigned to trustworthy resource nodes. To evaluate the algorithm, a CloudSim-based simulation platform was built in the PlanetLab environment. Analysis and simulation results show that FR-DLS effectively improves the success rate of task execution in cloud environments at the cost of a small increase in task completion time and scheduling length. As the numbers of resource nodes and application tasks grow, the gain in reliability far exceeds the added cost in completion time and scheduling length, which demonstrates the algorithm's practicality in large-scale cloud environments.
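The abstract does not give the formulas, so the following is only a minimal sketch of the idea under stated assumptions: trust is taken as the posterior mean of a Beta(1, 1) prior over observed interactions, a failed attempt is assumed to be recovered and retried with a fixed probability up to a fixed limit, and the scheduling level is assumed to be non-negative and simply multiplied by the resulting trust value. All names (`Node`, `bayesian_trust`, `success_with_recovery`, `pick_node`) are hypothetical and are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Node:
    name: str
    successes: int        # observed successful interactions (hypothetical counter)
    failures: int         # observed failed interactions
    recovery_rate: float  # assumed probability that a failure is recoverable
    max_recoveries: int   # assumed limit on recovery attempts


def bayesian_trust(successes: int, failures: int) -> float:
    """Posterior mean of a Beta(1, 1) prior after the observed interactions."""
    return (successes + 1) / (successes + failures + 2)


def success_with_recovery(p: float, recovery_rate: float, max_recoveries: int) -> float:
    """Probability that a task completes when each failed attempt is recovered
    and retried with probability recovery_rate, at most max_recoveries times."""
    q = (1.0 - p) * recovery_rate  # attempt fails, then is recovered and retried
    return p * sum(q ** i for i in range(max_recoveries + 1))


def node_trust(node: Node) -> float:
    """Trust value of a node after accounting for its recovery settings."""
    p = bayesian_trust(node.successes, node.failures)
    return success_with_recovery(p, node.recovery_rate, node.max_recoveries)


def pick_node(levels: dict[str, float], nodes: list[Node]) -> str:
    """Pick the node with the largest trust-weighted level.

    Assumes the levels are non-negative; this weighting is an illustration,
    not the paper's FR-DLS formula.
    """
    return max(nodes, key=lambda n: levels[n.name] * node_trust(n)).name
```

In this toy setting, a node with a higher raw scheduling level can lose to a more trustworthy one: with levels of 20 for an unreliable node and 10 for a reliable node that also allows one recovery, `pick_node` selects the reliable node.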
Source
Journal of Zhejiang University (Engineering Science)
EI
CAS
CSCD
Peking University Core Journals
2015, No. 12, pp. 2305-2315 (11 pages)
Funding
National Natural Science Foundation of China, Young Scientists Fund (61402005)
Keywords
Cloud computing
Bayesian estimation
trust relationship
trustworthiness
failure recovery mechanism