Abstract
In a simultaneous multithreaded processor, raising the throughput of the instruction fetch unit intensifies cache contention among threads, and that contention in turn limits further gains in fetch throughput. Targeting the new characteristics of current VLIW (very long instruction word) architectures, this paper presents an instruction fetch scheme that improves the throughput of both the fetch unit and the processor as a whole. By canceling invalid addresses in the instruction fetch pipeline as early as possible, the scheme reduces the program cache conflicts caused by invalid instruction fetches and improves the performance of the whole processor. Experimental results show that it raises the throughput of both the processor and the fetch unit by 12% to 23% in relative terms, while the level-1 program cache miss rate increases only slightly or even decreases. It also eliminates 10% to 25% of level-1 program cache read accesses, thereby lowering the processor's power consumption.
Source
《计算机工程与科学》
CSCD
2007, No. 6, pp. 97-101 (5 pages)
Computer Engineering & Science
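Funding
National High Technology Research and Development Program of China (863 Program), grant 2004AA1Z1040
National Natural Science Foundation of China, grant 60473079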