Propagation Behavior Analysis and Fault Tolerance Optimization of Hardware Fault in Heterogeneous Systems
Cited by 3
Abstract Based on inter-procedural dependence analysis of heterogeneous systems, this paper studies how hardware faults in heterogeneous systems propagate through software. The results are used to guide the optimization of application-level checkpointing (deciding what to save at each checkpoint) on heterogeneous systems. Experiments verify the feasibility and performance of the approach, which is of significance for research on fault tolerance optimization of heterogeneous systems.
Authors JIA Jia, YANG Xuejun
Source Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2011, Issue 12, pp. 2853-2865 (13 pages)
Funding National Natural Science Foundation of China (60921062, 61003087)
Keywords general purpose GPU; heterogeneous system; inter-procedural dependence analysis; propagation behavior; fault tolerance optimization