Design and implementation of scheduling mechanism for high performance machine learning SIMT processor
Cited by: 2
Abstract: This paper proposes a simple and efficient scheduling mechanism for a high-performance Single Instruction Multiple Threads (SIMT) processor aimed at machine learning. The mechanism supports parallel operation of 4 blocks, 8 warps, and 64 threads, and uses a dynamic scheduling method that combines two configurable scheduling modes. The hardware circuit is implemented in synthesizable Verilog HDL, and an FPGA-based verification platform is built to verify the function of the whole circuit. The results show that the scheduling mechanism meets the requirements of the SIMT processor and improves the overall performance of the processor by 82.17%. When synthesized for the Xilinx FPGA chip xcvu440-flga-2892-2-e, the design reaches a maximum clock frequency of 181 MHz.
Authors: JIA Rui (贾蕊); LI Tao (李涛); FENG Zhen-fu (冯臻夫); ZHANG Hong-wei (张宏伟) (School of Electronic Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China; School of Computer, Xi'an University of Posts and Telecommunications, Xi'an 710121, China)
Source: Microelectronics & Computer (《微电子学与计算机》), Peking University Core Journal, 2019, Issue 9, pp. 67-72 (6 pages)
Funding: Shaanxi Province Key Research and Development Program (2017ZDXM-GY-005); Xi'an Science and Technology Bureau Project (201805040YD18CG24(5))
Keywords: machine learning; SIMT processor; SIMT scheduling mechanism; multi-thread parallel processing; dynamic scheduling
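
The abstract describes a dynamic scheduler that switches between two configurable modes across 8 warps, but it does not name the modes or give their selection logic. The Python sketch below is therefore only an illustration under stated assumptions, not the authors' design: it assumes one mode is round-robin and the other is a "greedy" mode that keeps issuing from the most recently issued warp while that warp stays ready. The names `WarpScheduler`, `mode`, `pick`, and `ready` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

NUM_WARPS = 8  # warp count taken from the abstract


@dataclass
class WarpScheduler:
    # Hypothetical mode names; the abstract does not say what the two modes are.
    mode: str = "round_robin"      # "round_robin" or "greedy"
    last: int = NUM_WARPS - 1      # most recently issued warp (so the first search starts at warp 0)

    def pick(self, ready: List[bool]) -> Optional[int]:
        """Return the warp to issue this cycle, or None if every warp is stalled."""
        # Assumed greedy mode: stay on the last issued warp while it is ready.
        if self.mode == "greedy" and ready[self.last]:
            return self.last
        # Round-robin search, starting just after the last issued warp.
        for offset in range(1, NUM_WARPS + 1):
            w = (self.last + offset) % NUM_WARPS
            if ready[w]:
                self.last = w
                return w
        return None
```

In this sketch the mode is a plain field, so software could reconfigure it between kernels; how the real hardware selects or mixes its two modes is not stated in the abstract.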
