This paper presents an optimal checkpoint strategy for fault-tolerance in real-time systems where transient faults occur in Poisson distribution. In our environment, multiple real-time tasks with different deadlines a...This paper presents an optimal checkpoint strategy for fault-tolerance in real-time systems where transient faults occur in Poisson distribution. In our environment, multiple real-time tasks with different deadlines and harmonic periods are scheduled in the system by rate-monotonic algorithm, and checkpoints are inserted at a constant interval in each task. When a fault is detected, the system carries out rollback to the latest checkpoint and re-executes tasks. The maximum number of re-executable checkpoints and an equation to check schedulability are derived, and the optimal number of checkpoints is selected to maximize the probability of completing all the tasks within their deadlines.展开更多
文摘This paper presents an optimal checkpoint strategy for fault-tolerance in real-time systems where transient faults occur in Poisson distribution. In our environment, multiple real-time tasks with different deadlines and harmonic periods are scheduled in the system by rate-monotonic algorithm, and checkpoints are inserted at a constant interval in each task. When a fault is detected, the system carries out rollback to the latest checkpoint and re-executes tasks. The maximum number of re-executable checkpoints and an equation to check schedulability are derived, and the optimal number of checkpoints is selected to maximize the probability of completing all the tasks within their deadlines.