Abstract
Checkpointing is a widely used technique for protecting computer systems against failures. Earlier work has usually assumed that failures follow a Poisson distribution, under which the optimal solution places checkpoints at a fixed interval. Recently published field failure data, however, show that this assumption is unrealistic. This paper first uses field failure data to analyze the applicability of the fixed-interval checkpoint method. It then proposes two dynamic checkpoint placement methods, which determine the position of the next checkpoint from failure information observed in the system's preceding stage. Simulation experiments show that, under complex failure distributions, the proposed methods match or outperform the optimal fixed-interval checkpoint method.
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
PKU Core Journal (北大核心)
2010, No. 4, pp. 715-721 (7 pages)
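Funding
Supported by the National Basic Research Program of China (973 Program) (2005CB321604)
Supported by the National Natural Science Foundation of China (90207021)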
Keywords
checkpointing
checkpoint interval
failure distribution free
dynamic checkpoint placement