Optimizing checkpoint for scientific simulations

Abstract: It is extremely time-consuming to restart a long-running simulation from the beginning when a failure occurs. Checkpointing is a viable solution that enables simulations to be resumed from the point of failure. We study three models to determine the optimal checkpoint interval between contiguous checkpoints so that the total execution time is minimized, and we demonstrate that optimal checkpointing can facilitate self-optimization. This study advances our knowledge of and practice in optimizing long-running scientific simulations.
Source: Journal of Zhejiang University-Science C (Computers and Electronics), indexed in SCIE and EI, 2012, Issue 12, pp. 891-900 (10 pages)
Funding: Project supported by the National Science Foundation of the USA, Information Technology Research (ITR/AP-DEB) (No. 0112820)
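
The abstract does not state the three models it studies, so the following is only an illustrative sketch of what "optimal checkpoint interval" means, using the classic first-order (Young) approximation rather than the paper's own method. It assumes a constant checkpoint cost and failures with a known mean time between failures (MTBF); the function names `young_interval` and `overhead_fraction` are hypothetical.

```python
import math


def young_interval(checkpoint_cost: float, mtbf: float) -> float:
    """Young's first-order approximation of the optimal compute time
    between checkpoints: sqrt(2 * C * M).

    Assumption (not from the paper): constant checkpoint cost C and
    mean time between failures M, in the same time unit.
    """
    if checkpoint_cost <= 0 or mtbf <= 0:
        raise ValueError("checkpoint_cost and mtbf must be positive")
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


def overhead_fraction(interval: float, checkpoint_cost: float, mtbf: float) -> float:
    """Approximate fraction of time lost per unit of useful work:
    checkpoint cost amortized over the interval, plus the expected
    rework after a failure (about half an interval per failure)."""
    return checkpoint_cost / interval + interval / (2.0 * mtbf)


if __name__ == "__main__":
    cost, mtbf = 2.0, 100.0  # illustrative values, e.g. minutes
    best = young_interval(cost, mtbf)
    for t in (best / 2, best, best * 2):
        print(f"interval={t:6.1f}  overhead={overhead_fraction(t, cost, mtbf):.3f}")
```

Checkpointing too often wastes time writing state, and checkpointing too rarely wastes time recomputing lost work; the sketch shows the overhead rising on either side of the estimated interval.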