Research on a Branch Target Buffer Based on Taken Traces

Efficient BTB Based on Taken Trace
Abstract: Modern computer architecture is constrained by two competing concerns: performance and energy consumption. To reduce the growing power consumption of embedded processors, this paper proposes a branch target buffer structure based on taken traces (TG-BTB). Unlike a conventional BTB, which must be looked up on every instruction fetch, TG-BTB looks up the BTB only when the executing trace is predicted to be taken. The structure dynamically analyzes taken-trace behavior during program execution so that the BTB is looked up once per taken trace, which reduces BTB lookup energy. During dynamic analysis, TG-BTB first detects and records the instruction interval between two taken branch instructions, then stores that interval in TG-BTB, and finally decides whether a BTB lookup is needed according to the stored interval. The model was validated and its performance measured on benchmark programs; the experimental results show that TG-BTB reduces BTB lookup energy consumption by 81% compared with the conventional BTB scheme.
Authors: XIONG Zhen-ya (熊振亚), LIN Zheng-hao (林正浩), REN Hao-qi (任浩琪) (School of Electronics and Information Engineering, Tongji University, Shanghai 200092, China; Microelectronics Center, Tongji University, Shanghai 200092, China)
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2017, No. 3, pp. 195-201, 214 (8 pages in total)
Keywords: Taken trace, Instruction interval, Branch target buffer (BTB), Energy consumption
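
The interval mechanism summarized in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the class name `TraceGuidedBTB`, the choice to key stored intervals by the first PC of a trace, and the helper `loop_stream` are all hypothetical. The model counts how many fetches would need a BTB lookup once the interval between two taken branches has been learned, and it ignores misprediction recovery and energy modeling entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, Optional, Tuple


@dataclass
class TraceGuidedBTB:
    """Toy model of interval-guided BTB lookups (hypothetical organization)."""
    # Assumption: intervals are keyed by the PC of the first instruction of a trace.
    intervals: Dict[int, int] = field(default_factory=dict)
    fetches: int = 0   # total instructions fetched
    lookups: int = 0   # fetches that performed a BTB lookup

    def run(self, stream: Iterable[Tuple[int, bool]]) -> None:
        """Consume (pc, taken) pairs in fetch order."""
        trace_start: Optional[int] = None  # first PC of the current trace
        count = 0                          # non-taken instructions seen in this trace
        skip = 0                           # fetches that may still skip the lookup
        for pc, taken in stream:
            self.fetches += 1
            if trace_start is None:
                # A new trace begins right after a taken branch (or at start-up).
                trace_start, count = pc, 0
                skip = self.intervals.get(pc, 0)  # unknown interval: skip nothing
            if skip > 0:
                skip -= 1          # inside the learned interval: no lookup
            else:
                self.lookups += 1  # interval exhausted or unknown: look up the BTB
            if taken:
                # Record the instruction interval that preceded the taken branch.
                self.intervals[trace_start] = count
                trace_start = None
            else:
                count += 1


def loop_stream(iterations: int, body_len: int = 4,
                base: int = 0x100) -> Iterator[Tuple[int, bool]]:
    """Hypothetical workload: a loop whose last instruction is a taken branch."""
    for _ in range(iterations):
        for i in range(body_len):
            yield base + 4 * i, i == body_len - 1


btb = TraceGuidedBTB()
btb.run(loop_stream(10))
print(f"{btb.lookups} lookups for {btb.fetches} fetches")  # 13 lookups for 40 fetches
```

In this toy loop the first iteration looks up the BTB on every fetch while the interval is learned, and each later iteration looks it up only at the taken branch. A conventional BTB would perform 40 lookups on the same stream. The ratio here depends entirely on the made-up workload and should not be compared with the 81% figure reported in the paper.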