Research on a General-Purpose Convolutional Neural Network Accelerator Architecture (cited by: 2)

A general-purpose accelerator for convolutional neural networks
Abstract: To address the complex design and memory bottlenecks of current AI-specific accelerators, a general-purpose convolutional neural network (CNN) accelerator architecture is proposed. Its RISC (Reduced Instruction Set Computer) instruction set supports efficient mapping of different types of CNNs onto the hardware accelerator. Its general convolution module is a reconfigurable 3D systolic array composed of multiple basic operation units, which supports two-dimensional convolutions of different sizes; the scale of the array can be configured as needed to suit different parallel acceleration requirements. To ease the memory bottleneck and improve computing power, the input module introduces a multi-level cache structure that enables high-speed reading of off-chip data, and the output module uses a multi-level data accumulation structure based on a "ping-pong" architecture to provide high-speed buffered output of convolution results. The proposed architecture is implemented on an FPGA chip, and the experimental results show that it achieves performance close to that of current state-of-the-art accelerators with fewer computing resources and lower power consumption, while being more general-purpose.
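The abstract names two ideas that a small software model can make concrete: two-dimensional convolution with kernels of different sizes, and accumulation of partial results through a "ping-pong" pair of buffers. The Python sketch below illustrates only those general ideas. It is not the authors' hardware design, and every name in it (`conv2d_single`, `accumulate_ping_pong`) is hypothetical. It assumes stride 1, no padding, and the cross-correlation convention commonly used in CNNs.

```python
def conv2d_single(channel, kernel):
    """Valid 2D convolution (CNN-style cross-correlation) of one channel.

    Works for any kernel size that fits inside the input (stride 1, no padding).
    """
    kh, kw = len(kernel), len(kernel[0])
    oh = len(channel) - kh + 1
    ow = len(channel[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            acc = 0
            # Each (i, j) term is one multiply-accumulate operation.
            for i in range(kh):
                for j in range(kw):
                    acc += channel[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out


def accumulate_ping_pong(channels, kernels):
    """Sum per-channel convolution results using two alternating buffers.

    One buffer holds the running sum (read side) while the other receives
    the updated sum (write side); the roles swap after every channel.
    Assumes all channels share one shape and all kernels share one shape.
    """
    oh = len(channels[0]) - len(kernels[0]) + 1
    ow = len(channels[0][0]) - len(kernels[0][0]) + 1
    buffers = [
        [[0] * ow for _ in range(oh)],
        [[0] * ow for _ in range(oh)],
    ]
    read = 0
    for channel, kernel in zip(channels, kernels):
        partial = conv2d_single(channel, kernel)
        write = 1 - read
        for r in range(oh):
            for c in range(ow):
                buffers[write][r][c] = buffers[read][r][c] + partial[r][c]
        read = write  # swap roles: the freshly written buffer is read next
    return buffers[read]


if __name__ == "__main__":
    channels = [
        [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
        [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
    ]
    kernels = [
        [[1, 0], [0, 1]],
        [[1, 1], [1, 1]],
    ]
    print(accumulate_ping_pong(channels, kernels))
```

In hardware, the point of alternating two buffers is that one can be written while the other is read, so accumulation and output need not stall each other. This sequential model shows only the role swap, not the overlap in time.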
Authors: DONG Gang (董刚); HU Kekun (胡克坤); YANG Hongbin (杨宏斌); ZHAO Yaqian (赵雅倩); LI Rengang (李仁刚); ZHAO Kun (赵坤); CAO Qichun (曹其春); LU Lu (鲁璐). Affiliations: Inspur Electronic Information Industry Co., Ltd, Jinan 250013, China; Guangdong Inspur Big Data Research Co., Ltd, Guangzhou 510632, China.
Source: Microelectronics & Computer (《微电子学与计算机》), 2023, No. 5, pp. 97-103 (7 pages)
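Funding: Shandong Province Key Research and Development Program (2019TSLH0201); Shandong Provincial Natural Science Foundation Innovation and Development Joint Fund (ZR2021LZH004).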
Keywords: AI-specific accelerators; CNN; multi-scale convolution kernel; 3D systolic array; multi-level data accumulator