Checkpoint Management with Double Modular Redundancy Based on the Probability of Task Completion


Abstract: This paper proposes a checkpoint rollback strategy for real-time systems with double modular redundancy. Without built-in fault detection or spare processors, the scheme can recover from both transient and permanent faults. Two comparisons are conducted at each checkpoint. First, the states stored in two consecutive checkpoints of one processor are compared to check the integrity of that processor. Second, the states of the two processors are compared to detect faults, and the system rolls back to the previous checkpoint whenever the logic of the proposed scheme requires it. A Markov model induced by the fault recovery scheme is analyzed to obtain the probability that a task completes within its deadline. The number of checkpoints is then selected to maximize this probability.
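The trade-off behind selecting the number of checkpoints can be illustrated with a deliberately simplified stand-in for the paper's Markov model. The sketch below is not the authors' analysis: it assumes only transient faults (permanent faults and the consecutive-checkpoint integrity comparison are ignored), Poisson fault arrivals at a hypothetical rate `lam` per processor, equally spaced checkpoints with a fixed overhead `c`, and a rollback of exactly one segment whenever the two processors disagree. All function and parameter names are illustrative.

```python
import math


def completion_probability(n, T, D, c, lam):
    """Probability that a task of length T finishes by deadline D
    when split into n equal segments, under the simplified model."""
    slot = T / n + c                          # one segment plus its checkpoint overhead
    attempts = math.floor(D / slot + 1e-9)    # segment attempts that fit before the deadline
    if attempts < n:
        return 0.0                            # even a fault-free run misses the deadline
    p = math.exp(-2.0 * lam * slot)           # both processors stay fault-free for one slot
    # The task completes iff at least n of the available attempts succeed
    # (each failed attempt is a rollback and re-execution of one segment).
    return sum(
        math.comb(attempts, k) * p**k * (1.0 - p) ** (attempts - k)
        for k in range(n, attempts + 1)
    )


def best_checkpoint_count(T, D, c, lam, n_max=50):
    """Checkpoint count in 1..n_max with the highest completion probability."""
    return max(
        range(1, n_max + 1),
        key=lambda n: completion_probability(n, T, D, c, lam),
    )
```

Calling `best_checkpoint_count(T=8.0, D=12.0, c=0.1, lam=0.02)` scans the candidate counts and returns the one with the highest completion probability under these assumptions. Few checkpoints make each rollback expensive, while many checkpoints spend the slack before the deadline on overhead, which is the tension the paper's optimization addresses with its full model.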

Authors: Seong Woo Kwak, Kwan-Ho You, Jung-Min Yang

Affiliations: Department of Electronic Engineering; School of Information & Communication Engineering; Department of Electrical Engineering

Source: Journal of Computer Science & Technology (计算机科学技术学报（英文版）), indexed in SCIE, EI, CSCD; 2012, Issue 2, pp. 273-280 (8 pages)

Keywords: checkpoint scheme, double modular redundancy (DMR), real-time task, fault tolerance, Markov model

Classification number: TP332 [Automation and Computer Technology: Computer System Architecture]


