A low power design structure for multi-port register file

Abstract: To reduce register file power without sacrificing processor performance, a multi-bank register file (MBRF) based on read and write queues is proposed. Multiple register banks share the access pressure of the multiple ports, and each bank is given its own read queue and write queue. Instruction decomposition buffers each instruction's read and write operations in these queues, which eliminates the access conflicts that a multi-bank structure could otherwise incur. Two dispatch strategies, combining and forwarding (bypassing), shorten the buffer queues and reduce the number of read and write requests sent to the registers. The structure was evaluated on a four-issue superscalar simulator. The results show that total register file power is reduced by 52%, while the processor IPC loss is only 1.6%. Compared with other register files, the queue-based MBRF shows a clear advantage in multi-issue processors.
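
The abstract names the mechanisms but gives no implementation detail, so the following is only a toy sketch of how per-bank read and write queues with combining and forwarding might behave. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the class and method names, the `reg % num_banks` bank mapping, one read and one write access per bank per cycle, and the rule that a write waits while an older read of the same register is pending.

```python
from collections import deque


class BankedRegisterFile:
    """Toy multi-bank register file: one read queue and one write queue per bank."""

    def __init__(self, num_regs=32, num_banks=4):
        self.num_banks = num_banks
        self.regs = [0] * num_regs
        self.read_q = [deque() for _ in range(num_banks)]   # entries: (reg, [tags])
        self.write_q = [deque() for _ in range(num_banks)]  # entries: [reg, value]
        self.results = {}     # tag -> value delivered to the requesting instruction
        self.bank_reads = 0   # reads that actually accessed a bank
        self.bank_writes = 0  # writes that actually accessed a bank
        self.next_tag = 0

    def bank_of(self, reg):
        return reg % self.num_banks  # assumed low-order interleaving

    def request_read(self, reg):
        tag, self.next_tag = self.next_tag, self.next_tag + 1
        b = self.bank_of(reg)
        for entry in self.write_q[b]:
            if entry[0] == reg:       # forwarding: value comes from the write queue
                self.results[tag] = entry[1]
                return tag
        for r, tags in self.read_q[b]:
            if r == reg:              # combining: share the already pending read
                tags.append(tag)
                return tag
        self.read_q[b].append((reg, [tag]))
        return tag

    def request_write(self, reg, value):
        b = self.bank_of(reg)
        for entry in self.write_q[b]:
            if entry[0] == reg:       # combining: keep only the newest value
                entry[1] = value
                return
        self.write_q[b].append([reg, value])

    def step(self):
        """One cycle: each bank serves at most one read and one write."""
        for b in range(self.num_banks):
            if self.read_q[b]:
                reg, tags = self.read_q[b].popleft()
                self.bank_reads += 1
                for t in tags:
                    self.results[t] = self.regs[reg]
            if self.write_q[b]:
                reg, value = self.write_q[b][0]
                # Hold the write while an older read of the same register is pending.
                if all(r != reg for r, _ in self.read_q[b]):
                    self.write_q[b].popleft()
                    self.regs[reg] = value
                    self.bank_writes += 1


rf = BankedRegisterFile()
rf.request_write(4, 7)
a = rf.request_read(4)  # served from the write queue, no bank access
b = rf.request_read(8)  # queued for bank 0
c = rf.request_read(8)  # combined with the pending read of register 8
rf.step()
```

In this usage, three read requests and one write request reach the queues, but only one read and one write reach the bank, which is the kind of saving the combining and forwarding strategies aim for. The power and IPC figures reported in the abstract come from the authors' simulator and cannot be reproduced with a model this simple.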

Authors: XIAO Jianqing (肖建青), LOU Mian (娄冕), ZHANG Xunying (张洵颖), SHEN Xubang (沈绪榜)

Affiliation: Integrated Circuit Design Department, Xi'an Microelectronics Technology Institute

Source: Journal of Central South University (Science and Technology), 2015, No. 8, pp. 2914-2922 (9 pages). Indexed in EI, CAS, CSCD, and the PKU Core Journals list.

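Funding: National High Technology Research and Development Program of China (863 Program) (2011AA120204); "Twelfth Five-Year Plan" Civil Aerospace Pre-research Project of the State Administration of Science, Technology and Industry for National Defence (YY2011-012(D020201))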

Keywords: multi-issue; multi-bank register file; read and write queue; access conflict; instruction decomposition

Classification: TP302.2 [Automation and Computer Technology: Computer Architecture]
