A Prefetch and Optimization Strategy for VLIW Loop Instructions (一种VLIW循环指令的预取和优化策略)

Design and Optimization of a VLIW Loop Instruction Prefetch Structure

Abstract: This paper presents an instruction prefetch structure for VLIW processors, together with an optimization strategy that separates loop instructions from the rest of the program. It focuses on how ordinary instructions are prefetched, how loop instructions are handled, and how the design switches between its two prefetch modes, normal prefetch and loop prefetch. The design and its optimization reduce the power consumed by instruction fetch. Experiments show that the reduction ranges from 40% to 90% depending on the application, which improves the performance of the multi-cluster VLIW DSP processor.

Authors: Ju Kui (琚魁), Xie Jing (谢憬), Mao Zhigang (毛志刚)

Affiliation: School of Microelectronics, Shanghai Jiao Tong University (上海交通大学微电子学院)

Source: Microelectronics & Computer (《微电子学与计算机》), indexed in CSCD and the Peking University Core Journals list, 2013, No. 5, pp. 19-22 (4 pages)

Funding: National "863" Program of China (2009AA011705)

Keywords: DSP processor; VLIW; SIMD; instruction prefetch

Classification: TN402 [Electronics and Telecommunications: Microelectronics and Solid-State Electronics]
