
Design and Implementation of Convolutional Neural Network Accelerator Based on Heterogeneous SoC (Cited by: 1)
Abstract To address the slow execution of convolutional neural networks on mobile edge devices with limited hardware resources, this paper proposes a differential quantization acceleration system for convolutional neural networks based on a heterogeneous SoC. First, a differential quantization method is proposed that quantizes different layers of the deep convolutional neural network ResNet-50 to different degrees. Second, the FPGA hardware acceleration module is implemented with an HLS (high-level synthesis) tool. Finally, the accuracy and speedup of different quantization schemes are tested for ResNet-50 on the ImageNet dataset. The experimental results show that, compared with inference time on the ARM processor, the heterogeneous acceleration system achieves a minimum speedup of 2.86 and a maximum speedup of 11.43. Between these two cases, the maximum speedup is 3.99 times the minimum speedup, the Top-1 accuracy ratio is 1.044, and the relative accuracy loss is 4.22%.
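The abstract does not give the quantization scheme or the bit widths used per layer. As a minimal sketch of the general idea of quantizing different layers to different degrees, the following assumes uniform symmetric quantization and uses hypothetical layer names, toy weights, and a hypothetical bit plan (`layers`, `bit_plan`); none of these come from the paper.

```python
from typing import Dict, List


def quantize_symmetric(weights: List[float], bits: int) -> List[float]:
    """Uniform symmetric quantization to `bits` bits (assumed scheme).

    Returns the dequantized values so the rounding error can be inspected.
    """
    qmax = 2 ** (bits - 1) - 1              # e.g. 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return [0.0 for _ in weights]
    scale = max_abs / qmax                  # real value represented by one integer step
    out = []
    for w in weights:
        q = round(w / scale)                # nearest integer level
        q = max(-qmax, min(qmax, q))        # clamp to the representable range
        out.append(q * scale)
    return out


def quantize_model(layers: Dict[str, List[float]],
                   bit_plan: Dict[str, int]) -> Dict[str, List[float]]:
    """Apply a per-layer bit width: each layer may use a different precision."""
    return {name: quantize_symmetric(w, bit_plan[name])
            for name, w in layers.items()}


def max_abs_error(a: List[float], b: List[float]) -> float:
    """Largest absolute difference between original and quantized weights."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical toy data: identical weights, different bit widths per layer.
layers = {
    "conv1": [0.50, -0.25, 0.10, -0.75],
    "fc": [0.50, -0.25, 0.10, -0.75],
}
bit_plan = {"conv1": 8, "fc": 4}

quantized = quantize_model(layers, bit_plan)
errors = {name: max_abs_error(layers[name], quantized[name]) for name in layers}

for name in layers:
    print(name, bit_plan[name], "bits, max abs error:", errors[name])
```

With identical weights, the layer assigned fewer bits shows the larger rounding error. This is the trade-off a per-layer scheme navigates: lower precision where accuracy tolerates it, in exchange for cheaper hardware arithmetic.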
Author ZENG Chunming (曾春明) (College of Computer Science, Sichuan University, Chengdu 610065)
Source Modern Computer (《现代计算机》), 2021, No. 9, pp. 3-7 (5 pages)
Funding National Natural Science Foundation of China (No. 61332001).
Keywords Convolutional Neural Network; Heterogeneous SoC; Differential Quantization; Field Programmable Gate Array