Abstract
Checkpointing, an effective fault-recovery and fault-tolerance technique, is widely used in grid, distributed, and cloud computing systems. This paper proposes a non-blocking coordinated checkpoint algorithm that improves system reliability, allows checkpoints to be placed flexibly, substantially reduces the number of synchronization messages, and shortens the time needed to form a checkpoint. Compared with typical related algorithms, the proposed algorithm uses fewer synchronization control messages and incurs lower overhead: the time complexity introduced by synchronization control messages is reduced from the usual O(n^2) to O(n), and only n-1 synchronization messages are required.
Authors
党红恩
赵尔平
雒伟群
DANG Hong-en, ZHAO Er-ping, LUO Wei-qun (School of Information Engineering, Tibet Nationalities Institute, Xianyang 712082, China)
Source
《电脑知识与技术》
2014, No. 4, pp. 2394-2396 (3 pages)
Computer Knowledge and Technology
Funding
Research Project of the State Ethnic Affairs Commission (12XZZ002)
Natural Science Foundation of the Tibet Autonomous Region (12KJZRYMY07)
Keywords
checkpoint
distributed system
cloud computing system
fault tolerance