
MOSI: A SMT Microarchitecture Based on VLIW Processors
Abstract Simultaneous multithreading (SMT) has become a focus of architecture research because it improves processor throughput at relatively low cost. Very long instruction word (VLIW) designs, meanwhile, are widely used in high-performance processors. Applying SMT to VLIW processors is therefore attractive, but characteristics of these processors, notably the lack of a hardware dynamic scheduling mechanism, make it difficult to implement. This paper presents MOSI (MultiOp Splitting Issue), an SMT microarchitecture based on VLIW processors. MOSI dynamically issues the instructions within a single MultiOp and uses a write-back buffer to ensure that results are written to the registers in the order the compiler assumed, which addresses the key problem of SMT on VLIW processors at low cost. The paper describes the structure and run-time algorithm of the write-back buffer, presents hardware models for a single thread and for the overall multithreaded processor, and discusses the number of hardware-supported threads and the organization of the caches. Experimental results show that a dual-thread processor based on the MOSI microarchitecture improves total throughput by 40%.
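Source Chinese Journal of Computers (《计算机学报》); indexed in EI, CSCD, and the Peking University Core Journals list; 2006, No. 3, pp. 378-383 (6 pages)
Funding Supported by the National Natural Science Foundation of China (60473079)
Keywords simultaneous multi-threading; VLIW; MultiOp; instruction issue; write-back buffer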
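
The abstract states that MOSI issues the operations of one MultiOp dynamically and relies on a write-back buffer to keep register updates in the order the compiler assumed. The abstract does not give the buffer's structure, so the following Python sketch is only an illustration of that general idea under stated assumptions, not the paper's design: the class `WriteBackBuffer`, its methods, and the rule "commit a MultiOp's results only once all of its operations have finished, oldest MultiOp first" are all hypothetical.

```python
from collections import deque


class WriteBackBuffer:
    """Toy model (hypothetical): hold results until a whole MultiOp is done."""

    def __init__(self, registers):
        self.registers = registers  # architectural register file (a dict)
        self.pending = deque()      # open MultiOps, oldest first

    def open_multiop(self, op_count):
        # Reserve an entry when a MultiOp starts issuing its operations.
        entry = {"remaining": op_count, "results": []}
        self.pending.append(entry)
        return entry

    def record(self, entry, dest, value):
        # Buffer one operation's result instead of writing the register now.
        entry["results"].append((dest, value))
        entry["remaining"] -= 1
        self._drain()

    def _drain(self):
        # Commit only from the head, so MultiOps retire in program order.
        while self.pending and self.pending[0]["remaining"] == 0:
            for dest, value in self.pending.popleft()["results"]:
                self.registers[dest] = value


def run_swap_demo():
    # One MultiOp that swaps r1 and r2; its two operations issue in
    # different cycles because the other thread took an issue slot.
    regs = {"r1": 10, "r2": 20}
    wbb = WriteBackBuffer(regs)
    multiop = wbb.open_multiop(op_count=2)

    wbb.record(multiop, "r1", regs["r2"])  # cycle 1: r1 <- r2 (buffered)
    snapshot = dict(regs)                  # registers are still unchanged
    wbb.record(multiop, "r2", regs["r1"])  # cycle 2: r2 <- old r1
    return snapshot, regs
```

In this sketch the second operation still reads the old value of `r1`, because the first result stays in the buffer until the whole MultiOp has completed. That is the behaviour a compiler scheduling a VLIW MultiOp would expect even when the hardware splits the MultiOp across cycles.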

