Enhancing Reliability via Checkpointing in Cloud Computing Systems (cited by: 4)

Abstract Cloud computing is becoming an important solution for providing scalable computing resources via the Internet. Because a data center contains tens of thousands of nodes, the probability of server failures is nontrivial, so guaranteeing service reliability is a critical challenge. Fault-tolerance strategies such as checkpointing are commonly employed. However, when an edge switch fails, the checkpoint image may become inaccessible, so current checkpoint-based fault-tolerance methods cannot achieve the best effect. In this paper, we propose an optimal checkpoint method that is aware of edge switch failures. The method comprises two algorithms. The first uses the data center topology and communication characteristics to select the server that stores the checkpoint image. The second uses the checkpoint image storage characteristics together with the data center topology to select the recovery server. Simulation experiments are performed to demonstrate the effectiveness of the proposed method.
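The abstract does not give the two algorithms themselves, so the following is only a minimal sketch of the underlying idea, not the paper's method. It assumes a toy topology in which each server sits behind exactly one edge switch, and it ignores the communication characteristics the paper takes into account. The function names `choose_storage_server` and `choose_recovery_server`, and the server and switch labels, are hypothetical.

```python
from typing import Dict, List, Optional, Set


def choose_storage_server(source: str,
                          edge_switch_of: Dict[str, str],
                          candidates: List[str]) -> Optional[str]:
    """Pick a server to hold the checkpoint image of a task running on `source`."""
    others = [s for s in candidates if s != source]
    # Prefer servers behind a different edge switch than the source, so a
    # single edge switch failure cannot cut off both the task and its image.
    safe = [s for s in others if edge_switch_of[s] != edge_switch_of[source]]
    pool = safe or others  # fall back to any other server if none is "safe"
    return min(pool) if pool else None  # deterministic tie-break by name


def choose_recovery_server(image_holder: str,
                           edge_switch_of: Dict[str, str],
                           candidates: List[str],
                           failed_switches: Set[str]) -> Optional[str]:
    """Pick a reachable server on which to restart from the checkpoint image."""
    if edge_switch_of[image_holder] in failed_switches:
        return None  # the image itself is unreachable
    live = [s for s in candidates if edge_switch_of[s] not in failed_switches]
    # Prefer a server behind the same edge switch as the image holder
    # (assumption: a shorter path to the image means a cheaper restore).
    near = [s for s in live if edge_switch_of[s] == edge_switch_of[image_holder]]
    pool = near or live
    return min(pool) if pool else None


# Hypothetical four-server topology: s1, s2 behind e1; s3, s4 behind e2.
topology = {"s1": "e1", "s2": "e1", "s3": "e2", "s4": "e2"}
holder = choose_storage_server("s1", topology, list(topology))
recovery = choose_recovery_server(holder, topology, ["s2", "s3", "s4"], {"e1"})
```

In this toy setup the image for a task on `s1` is placed behind the other edge switch, so it stays reachable when `e1` fails. Had it been stored on `s2`, the same failure would have made it inaccessible, which is the situation the abstract describes.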
Source China Communications (English edition; SCIE, CSCD), 2017, No. 7, pp. 108-117 (10 pages)
Funding Supported by the Beijing Natural Science Foundation (4174100), NSFC (61602054), and the Fundamental Research Funds for the Central Universities.
Keywords high reliability; computing systems; checkpoint; storage server; Internet; switch failure; data center; storage characteristics; cloud computing; cloud service reliability; fault tolerance; data center network