
A Matrix Multiplication Mapping Technology Based on an NoC Multi-Core System
Cited by: 1
Abstract: Matrix multiplication is a basic operation in modern signal processing, and improving parallel data processing capability is of practical importance for raising its performance. This paper performs task scheduling and resource allocation for the compute-intensive workload of matrix multiplication of different dimensions on an NoC-based multi-core system, and implements several mapping schemes suited to different matrix multiplications, with peak performance reaching 5078 MFLOPS. The processing units in the design are relatively independent and reconfigurable, giving good scalability and generality for matrix multiplication of arbitrary dimensions. This addresses the low computational efficiency and poor scalability that general-purpose matrix multipliers with a fixed structure suffer under limited I/O bandwidth and computing resources. Analysis of experimental results for matrix multiplication of different dimensions confirms the performance and correctness of the design.
Authors: WANG Yang (汪杨); WANG Xiaolei (王晓蕾); YUAN Ziang (袁子昂); YUAN Ruming (袁儒明) (School of Electronic Science and Applied Physics, Hefei University of Technology, Hefei 230009, China)
Source: Electronic Science and Technology (《电子科技》), 2021, No. 5, pp. 54-60 (7 pages)
Funding: National Natural Science Foundation of China (61874156).
Keywords: matrix multiplication; parallel computing; NoC multi-core; compute-intensive; task scheduling; resource allocation; generality; I/O bandwidth
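
The abstract does not spell out the mapping schemes themselves, so the following is only a minimal sketch of one common way to map a matrix product onto a grid of cores: each core of a hypothetical rows x cols mesh is assigned one tile of the output matrix. The mesh shape, the function names (`split_ranges`, `map_tiles`, `core_compute`, `mapped_matmul`, `mflops`) and the 2*m*n*k operation count used for throughput are all assumptions for illustration, not details taken from the paper.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]
Range = Tuple[int, int]  # half-open index range [start, stop)


def split_ranges(n: int, parts: int) -> List[Range]:
    """Split range(n) into `parts` contiguous ranges of near-equal size."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def map_tiles(m: int, n: int, rows: int, cols: int) -> Dict[Tuple[int, int], Tuple[Range, Range]]:
    """Assign each core (r, c) of a hypothetical rows x cols mesh one tile of C (m x n)."""
    row_ranges = split_ranges(m, rows)
    col_ranges = split_ranges(n, cols)
    return {(r, c): (row_ranges[r], col_ranges[c])
            for r in range(rows) for c in range(cols)}


def core_compute(a: Matrix, b: Matrix, row_range: Range, col_range: Range) -> Matrix:
    """Work done by one core: the tile of C for its row and column ranges."""
    k = len(b)
    return [[sum(a[i][t] * b[t][j] for t in range(k))
             for j in range(*col_range)]
            for i in range(*row_range)]


def mapped_matmul(a: Matrix, b: Matrix, rows: int, cols: int) -> Matrix:
    """Compute C = A * B tile by tile (sequentially here; one tile per core in hardware)."""
    m, n = len(a), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for (r0, r1), (c0, c1) in map_tiles(m, n, rows, cols).values():
        tile = core_compute(a, b, (r0, r1), (c0, c1))
        for i in range(r0, r1):
            c[i][c0:c1] = tile[i - r0]
    return c


def mflops(m: int, n: int, k: int, seconds: float) -> float:
    """Throughput of an (m x k) by (k x n) product, assuming 2*m*n*k operations."""
    return 2.0 * m * n * k / seconds / 1e6
```

Because the tiles are disjoint, the cores need no synchronization on the output; the cost in a real NoC system lies in delivering the required rows of A and columns of B to each core, which this sketch does not model.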
