
Instruction Scheduling Algorithm for Register File Connectivity Clustered VLIW Architecture
(Original Chinese title: 寄存器堆互连的VLIW结构及其指令调度算法)
Abstract: Very Long Instruction Word (VLIW) processors are generally implemented as bus-connected clustered architectures, in which the functional units of each cluster share a local register file and clusters exchange data over buses. This avoids the rapid growth in delay, area, and power that a fully connected architecture suffers as the number of functional units increases. However, the copy operations and latency incurred when clusters share data degrade performance. This paper presents a clustered VLIW architecture in which a register file is used to connect the clusters, avoiding both the inter-cluster transfer latency and the extra copy operations. It also presents an instruction scheduling algorithm for this architecture to improve scheduling performance. Experimental results show that, compared with a fully connected VLIW architecture, the register-file-connected architecture with this scheduling algorithm loses only about 13% in performance on average, with little change in code size; on both measures it outperforms the bus-connected clustered VLIW architecture.
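The cost the abstract attributes to inter-cluster copies can be illustrated with a toy model. The sketch below is not the paper's scheduling algorithm: it is a minimal as-soon-as-possible schedule over a dependency graph, assuming unit-latency operations, unlimited functional units, and no register file port limits. The function name `schedule_length`, its parameters, and the four-operation example are all hypothetical, introduced only for illustration. A nonzero `copy_latency` stands in for a bus-connected design, and zero stands in for clusters that read a shared register file directly.

```python
def schedule_length(ops, deps, cluster, copy_latency):
    """Return the ASAP schedule length in cycles.

    ops          -- operation names in topological order
    deps         -- dict: op -> list of producer ops it depends on
    cluster      -- dict: op -> id of the cluster the op is assigned to
    copy_latency -- extra cycles charged when producer and consumer
                    sit in different clusters (assumption of this toy model)
    """
    start = {}
    for op in ops:
        ready = 0
        for producer in deps.get(op, []):
            # Unit-latency producer, plus a penalty if the value crosses clusters.
            extra = copy_latency if cluster[producer] != cluster[op] else 0
            ready = max(ready, start[producer] + 1 + extra)
        start[op] = ready
    return max(start.values()) + 1 if start else 0


# Hypothetical example: c consumes a and b, d consumes c.
ops = ["a", "b", "c", "d"]
deps = {"c": ["a", "b"], "d": ["c"]}
cluster = {"a": 0, "b": 1, "c": 0, "d": 1}

bus_cycles = schedule_length(ops, deps, cluster, copy_latency=1)     # 5 cycles
shared_cycles = schedule_length(ops, deps, cluster, copy_latency=0)  # 3 cycles
```

In this example the two cross-cluster dependences each add one cycle to the critical path under the bus model, which is the kind of overhead the proposed architecture aims to remove. A real scheduler would also have to respect functional unit and port constraints, which this sketch deliberately ignores.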
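Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2008, No. 1, pp. 127-132 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (60236020)
Keywords: VLIW (Very Long Instruction Word); instruction scheduling; register file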

References (9)

  • [1] Fisher J A, Faraboschi P, Young C. Embedded Computing. California: Morgan Kaufmann, 2005
  • [2] Texas Instruments Inc. TMS320C62x/67x CPU and Instruction Set Reference Guide. 1998
  • [3] Fridman J, Greenfield Z. The TigerSHARC DSP architecture. IEEE Micro, 2000, 20(1): 66-76
  • [4] Zhang Yan-Jun, He Hu, Sun Yi-He. A new register file access architecture for software pipelining in VLIW processors//Proceedings of the ASP-DAC. Shanghai, 2005, 1: 627-630
  • [5] Faraboschi P, Fisher J A, Young C. Instruction scheduling for instruction level parallel processors. Proceedings of the IEEE, 2001, 89(11): 1638-1659
  • [6] Ellis J R. Bulldog: A Compiler for VLIW Architectures. London: The MIT Press, 1986
  • [7] Capitanio A, Dutt N, Nicolau A. Partitioned register files for VLIWs: A preliminary analysis of tradeoffs//Proceedings of the 25th International Symposium on Microarchitecture, 1992: 292-300
  • [8] Ozer E, Banerjia S, Conte T M. Unified assign and schedule: A new approach to scheduling for clustered register file microarchitectures//Proceedings of the 31st International Symposium on Microarchitecture. Dallas, TX, 1998: 308-315
  • [9] Rixner S, Dally W J, Khailany B, et al. Register organization for media processing//Proceedings of the 6th International Symposium on High-Performance Computer Architecture. Toulouse, 2000: 375-386
