Abstract
Simultaneously and Redundantly Threaded (SRT) processors fetch almost every instruction twice, which increases pressure on the I-Cache and the fetch unit; fetch conflicts between the redundant threads further degrade overall performance. Based on an in-depth analysis of SRT's fault-detection mechanism and performance bottlenecks, this paper proposes the Slack-Decode Simultaneously and Redundantly Threaded architecture (SD-SRT), which has two main features. First, a unified instruction fetch queue and a slack-decode scheme replace the slack-fetch scheme of SRT to coordinate the progress of the redundant threads. This eliminates the redundant instruction fetches while retaining the fault-tolerance capability, relieving pressure on the fetch unit and the I-Cache and thereby improving overall performance. Second, a unified register renaming and checking scheme replaces the complex register-check-buffer (RC-Buf). Simulations show that SD-SRT improves IPC over SRT by more than 20% while reducing I-Cache accesses by 43%. SD-SRT also significantly reduces hardware design complexity.
Source
《宇航学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2006, No. 6, pp. 1328-1334 (7 pages)
Journal of Astronautics
Funding
National "Tenth Five-Year Plan" Pre-Research Project (41316.1.2)
National Natural Science Foundation of China (60503015)