基于间隔译码的处理器瞬时故障检测 (Cited by: 2)

Detecting Transient Fault in Processors Via Slack-Decode
Abstract: Simultaneously and Redundantly Threaded (SRT) processors fetch almost every instruction twice, which increases pressure on the I-Cache and the fetch unit, and fetch conflicts between the redundant threads can severely degrade overall performance. Based on an in-depth analysis of the fault-detection mechanism and performance bottleneck of SRT, this paper proposes the Slack-Decode Simultaneously and Redundantly Threaded architecture (SD-SRT), which features the following improvements. First, a unified instruction fetch queue and a scheme called slack-decode replace the slack-fetch scheme of SRT to coordinate the progress of the redundant threads. Compared with SRT, SD-SRT explicitly eliminates the redundant instruction fetches while retaining the fault-tolerance capability, which relieves the fetch unit and the I-Cache and thereby improves overall performance. Second, a unified register allocation (renaming) and checking scheme replaces the complex register-check-buffer (RC-Buf) structure. Simulations show that SD-SRT outperforms SRT in IPC by over 20% while reducing the number of instruction fetches (I-Cache accesses) by 43%. In addition, SD-SRT significantly reduces hardware design complexity.
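The slack-decode idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's design: the function name `simulate_slack_decode`, the `slack` parameter, and the one-instruction-per-cycle pipeline are all assumptions made for illustration. It only shows the principle that each instruction is fetched once into a unified queue and decoded twice, with the trailing thread kept behind the leading thread. It does not attempt to reproduce the reported 20% or 43% figures.

```python
from collections import deque


def simulate_slack_decode(program, slack=2):
    """Toy model (hypothetical): fetch once, decode for two redundant threads."""
    queue = deque()      # unified fetch queue shared by both threads
    lead_ptr = 0         # queue index of the next entry for the leading thread
    pc = 0               # next instruction to fetch
    fetches = 0          # number of fetches actually performed
    lead_decoded = []    # decode stream of the leading thread
    trail_decoded = []   # decode stream of the trailing thread

    while len(trail_decoded) < len(program):
        # Fetch stage: at most one instruction per cycle, fetched only once.
        if pc < len(program):
            queue.append(program[pc])
            pc += 1
            fetches += 1

        # Leading thread decodes the oldest entry it has not yet seen.
        if lead_ptr < len(queue):
            lead_decoded.append(queue[lead_ptr])
            lead_ptr += 1

        # Trailing thread decodes the queue head only if the leading thread
        # has already decoded it and is `slack` instructions ahead (or done).
        lead_done = len(lead_decoded) == len(program)
        gap = len(lead_decoded) - len(trail_decoded)
        if queue and lead_ptr > 0 and (gap >= slack or lead_done):
            trail_decoded.append(queue.popleft())  # entry freed after 2nd decode
            lead_ptr -= 1

    return fetches, lead_decoded, trail_decoded


if __name__ == "__main__":
    prog = ["ld", "add", "mul", "st", "br"]
    n_fetch, lead, trail = simulate_slack_decode(prog, slack=2)
    print(n_fetch, len(lead) + len(trail))
```

In this model the fetch count equals the program length while the decode count is twice that, whereas a scheme in which each thread fetches independently would fetch every instruction twice.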
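Source: 《宇航学报》 (Journal of Astronautics); indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2006, No. 6, pp. 1328-1334 (7 pages)
Funding: National "Tenth Five-Year Plan" Pre-Research Project (41316.1.2); National Natural Science Foundation of China (60503015)
Keywords: processor; transient fault; simultaneous and redundant threading (SRT); reliability; high performance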