
Research on Performance Optimization through Instruction-Level Reuse on the Convergence Path (cited by: 1)

Convergence path instruction-level reuse high performance optimization mechanism
Abstract: Targeting a specific one-sided hammock structure, and based on the characteristics of code generated by compiling C programs, this work dynamically detects a special case of control-independent Y-behavior in which a mispredicted instruction stream exactly reconverges with the correct path. Instruction reuse is then applied to reduce the branch misprediction penalty and improve processor performance. Detection of control-independent Y-behavior works across basic blocks, and the relevant information is stored in a convergence point table in the processor front end. Register integration and memory integration preserve correct dependences. To enable instruction-level reuse on the convergence path, instructions are classified as trusted or untrusted: trusted instructions are committed directly, while untrusted instructions are inserted into a recovery buffer and re-executed. This avoids flushing the pipeline and redirecting fetch, which lowers the misprediction penalty. Experiments show a performance improvement on every benchmark tested, and the mechanism becomes more effective as the pipeline deepens.
Source: Journal of Harbin Institute of Technology (《哈尔滨工业大学学报》), 2008, No. 1, pp. 81-84 (4 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Keywords: control-independence Y-behavior; exact convergence; squash reuse (wrong-path instruction reuse)
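
The trusted/untrusted split described in the abstract can be illustrated with a small sketch. This is a toy model under stated assumptions, not the paper's mechanism: the `Instr` type, the function name, and the classification rule (an instruction is untrusted when any of its source registers may hold a wrong value) are hypothetical, and memory dependences, which the paper handles through memory integration, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Toy instruction: one destination register and its source registers."""
    dest: str
    srcs: tuple


def classify_convergence_path(control_dependent, convergence_path):
    """Split convergence-path instructions into (trusted, recovery_buffer).

    Assumed rule (hypothetical, not taken from the paper): registers written
    inside the one-sided hammock body may hold wrong values after a
    misprediction, whether the body was wrongly executed or wrongly skipped.
    An instruction after the convergence point is untrusted if it reads such
    a register, directly or through another untrusted instruction.
    """
    tainted = {ins.dest for ins in control_dependent}
    trusted, recovery_buffer = [], []
    for ins in convergence_path:
        if any(src in tainted for src in ins.srcs):
            # Result may be wrong: queue for re-execution, propagate the taint.
            recovery_buffer.append(ins)
            tainted.add(ins.dest)
        else:
            # Result is reusable; a clean write also clears any old taint.
            trusted.append(ins)
            tainted.discard(ins.dest)
    return trusted, recovery_buffer


if __name__ == "__main__":
    # Hammock body "if (c) r1 = r2 + 1;" followed by the convergence path.
    body = [Instr("r1", ("r2",))]
    path = [
        Instr("r3", ("r4",)),  # independent of the hammock body
        Instr("r5", ("r1",)),  # reads r1, so it must be re-executed
        Instr("r6", ("r5",)),  # depends on an untrusted result
        Instr("r1", ("r4",)),  # clean redefinition of r1
        Instr("r7", ("r1",)),  # reads the clean r1
    ]
    trusted, recovery = classify_convergence_path(body, path)
    print([i.dest for i in trusted])   # ['r3', 'r1', 'r7']
    print([i.dest for i in recovery])  # ['r5', 'r6']
```

In this model only two of the five convergence-path instructions need re-execution; the other three keep their speculative results, which is the source of the saving that the abstract attributes to avoiding a full pipeline flush.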

