
A New Computational Model of Optimized Checkpoint Interval
Abstract: Many applications, sequential or parallel, take a long time to complete, and a failure during execution can destroy a significant amount of computation. Checkpointing with rollback is a technique for minimizing that loss in an environment subject to failures. In a fault-tolerant high-performance computing environment, however, the checkpoint mechanism itself adds overhead to the system. A checkpoint interval that is too large or too small may degrade system performance, so choosing the interval properly lets the system perform at its best. The difficulty lies in determining the interval at which the checkpointing scheme performs optimally. Vaidya's contribution was a model whose equation for the optimized checkpoint interval is independent of the checkpoint latency (L) and of the checkpoint recovery time (R), the time the application spends rolling back after an error occurs. This paper introduces a new segment-based model, NSBM, built on time segmentation. It replaces the mean overhead ratio of Vaidya's model with the system's mean availability (utilization), a concept that is easier to understand in the fault-tolerance field, and derives an equation that is likewise independent of L and R. Finally, we report a set of computational results from experiments and analyze the relationship between the two models. We conclude that NSBM is more effective than Vaidya's model for computing the checkpoint interval.
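The trade-off described in the abstract (checkpoint too often and the overhead dominates; too rarely and each failure costs more rework) can be illustrated with a small sketch. The abstract does not state the NSBM or Vaidya equations, so the code below instead uses the classic first-order approximation from reference [1], T = sqrt(2 · C · MTBF), where C is the time to take one checkpoint and MTBF is the mean time between failures. The function names and the sample values of C and MTBF are assumptions chosen for illustration only.

```python
import math


def young_interval(checkpoint_cost: float, mtbf: float) -> float:
    """First-order approximation of the optimum checkpoint interval.

    T = sqrt(2 * C * MTBF). This is the approximation from reference [1],
    not the NSBM or Vaidya equation, which the abstract does not give.
    """
    if checkpoint_cost <= 0 or mtbf <= 0:
        raise ValueError("checkpoint_cost and mtbf must be positive")
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


def approx_overhead(interval: float, checkpoint_cost: float, mtbf: float) -> float:
    """Rough fraction of time lost for a given interval.

    First term: time spent writing checkpoints per unit of useful work.
    Second term: expected rework after a failure (half an interval on
    average), amortized over the mean time between failures.
    """
    return checkpoint_cost / interval + interval / (2.0 * mtbf)


if __name__ == "__main__":
    C, MTBF = 50.0, 10_000.0  # seconds; hypothetical values
    t_opt = young_interval(C, MTBF)
    for t in (t_opt / 2, t_opt, t_opt * 2):
        print(f"interval={t:8.1f}s  overhead={approx_overhead(t, C, MTBF):.4f}")
```

Under this approximation, halving or doubling the interval around the optimum raises the estimated overhead, which is the behaviour the abstract attributes to intervals that are too small or too large.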
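Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the PKU Core list, 2003, No. 3, pp. 448-451 (4 pages)
Funding: Supported by the National High Performance Computing Fund (993 13)
Keywords: optimization; checkpoint interval; computational model; fault tolerance; overhead ratio; availability; time-segment-based model; computer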

References (8)

  • [1] Young J W. A first order approximation to the optimum checkpoint interval[J]. Communications of the ACM, Sep. 1974, 17: 530~531
  • [2] Shin K, Lin T H. Optimal checkpointing of real-time tasks[J]. IEEE Trans. Computers, Nov. 1987, 36: 1328~1341
  • [3] Plank J S, Beck M. Libckpt: transparent checkpointing for parallel programs[J]. IEEE Trans. Par. Distr. Syst., Aug. 1994, 5: 874~879
  • [4] Vaidya N H. Impact of checkpoint latency on overhead of a checkpointing scheme[J]. IEEE Transactions on Computers, Aug. 1997, 46(8): 942~947
  • [5] Ziv A, Bruck J. Analysis of checkpointing schemes for multiprocessor systems[R]. Tech. Rep. RJ 9593. IBM Almaden Research Center, Nov. 1993
  • [6] Plank J S. Efficient checkpointing on MIMD architectures[D]. PhD Thesis. Dept. of Computer Science, Princeton University, June 1993
  • [7] Parzen E. Stochastic processes[M]. Holden-Day, San Francisco, CA, 1962
  • [8] http://www.cs.utk.edu/~plank/plank/papers/cs-97-380.html
