
Research on Acceleration of Matrix Multiplication Based on Parallel Scheduling on MPSoC

Cited by: 4
Abstract: Matrix multiplication underlies numerical analysis and graphics and image processing algorithms, and the design of general-purpose matrix multiplication accelerators has long been a research focus in embedded system design. Because of its high computational complexity and low processing efficiency, however, matrix multiplication often becomes the bottleneck for computation speed in embedded systems. To make matrix multiplication more practical in the embedded field, this paper proposes a hardware/software co-acceleration architecture based on MPSoC (MultiProcessor System-on-Chip). Within this architecture, a matrix partitioning method that accounts for hardware constraints is designed, yielding a general-purpose matrix multiplication accelerator system. In addition, task partitioning and load-balancing scheduling algorithms are proposed that exploit the multi-core MPSoC architecture, improving parallel efficiency and the overall system speed-up ratio. Experimental results show that the proposed architecture and algorithms support general matrix multiplication, and that the multi-core parallel scheduling algorithm realized through hardware/software co-design is significantly more computationally efficient than a traditional single-core design.
Authors: YANG Fei (杨飞), MA Yu-chun (马昱春), HOU Jin (侯金), XU Ning (徐宁)
Affiliations: Hubei Key Laboratory of Intelligent Wireless Communications, South-Central University for Nationalities, Wuhan 430074, China; Department of Computer Science & Technology, Tsinghua University, Beijing 100084, China; Hubei Key Laboratory of Transportation Internet of Things, Wuhan University of Technology, Wuhan 430074, China
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2017, Issue 8, pp. 36-41 (6 pages)
Funding: European Union Seventh Framework Programme (318521); National Natural Science Foundation of China, General Program (61076035)
Keywords: matrix multiplication; MPSoC; parallel computation; load balance
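
The abstract names two ideas without giving their details: partitioning matrices into blocks that fit a hardware constraint, and balancing the resulting tasks across cores. The following is a minimal sketch of those two ideas in plain Python, not the paper's algorithm. It assumes a hypothetical fixed tile size standing in for the accelerator's buffer limit, and it substitutes a standard greedy longest-task-first heuristic for the paper's scheduler. All function names here (`tile_ranges`, `balance`, `blocked_matmul`) are illustrative.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Span = Tuple[int, int]          # half-open index range [start, stop)
Task = Tuple[Span, Span]        # one output tile: (row span, column span)


def tile_ranges(n: int, tile: int) -> List[Span]:
    """Split range(n) into spans of at most `tile` elements."""
    return [(s, min(s + tile, n)) for s in range(0, n, tile)]


def task_cost(task: Task, k: int) -> int:
    """Multiply-accumulate count needed to produce one output tile."""
    (i0, i1), (j0, j1) = task
    return (i1 - i0) * (j1 - j0) * k


def balance(tasks: List[Task], k: int, cores: int):
    """Greedy heuristic (an assumption, not the paper's scheduler):
    hand the most expensive remaining task to the least loaded core."""
    loads = [0] * cores
    plan: List[List[Task]] = [[] for _ in range(cores)]
    for t in sorted(tasks, key=lambda t: task_cost(t, k), reverse=True):
        c = loads.index(min(loads))
        plan[c].append(t)
        loads[c] += task_cost(t, k)
    return plan, loads


def multiply_tile(a: Matrix, b: Matrix, c: Matrix, task: Task, tile: int) -> None:
    """Accumulate one output tile, walking the inner dimension in tile-sized
    steps so each partial product touches at most tile x tile operands."""
    (i0, i1), (j0, j1) = task
    for k0, k1 in tile_ranges(len(b), tile):
        for i in range(i0, i1):
            for j in range(j0, j1):
                acc = 0
                for p in range(k0, k1):
                    acc += a[i][p] * b[p][j]
                c[i][j] += acc


def blocked_matmul(a: Matrix, b: Matrix, tile: int = 2, cores: int = 2):
    """Return (A x B, per-core load). Cores are simulated sequentially here;
    output tiles are disjoint, so they could run in parallel without locks."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    tasks = [(ri, rj) for ri in tile_ranges(m, tile) for rj in tile_ranges(n, tile)]
    plan, loads = balance(tasks, k, cores)
    for core_tasks in plan:
        for t in core_tasks:
            multiply_tile(a, b, c, t, tile)
    return c, loads
```

Calling `blocked_matmul(a, b, tile=2, cores=2)` returns the product together with the multiply-accumulate count assigned to each simulated core, which makes any imbalance caused by ragged edge tiles visible. Giving each task a whole output tile keeps the writes of different cores disjoint, at the cost of coarser tasks than a finer split along the inner dimension would give.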
