
Design of an FPGA-Based SqueezeNet Inference Accelerator (基于FPGA的SqueezeNet推断加速器设计)
Abstract: The lightweight deep neural network SqueezeNet moves a large volume of intermediate data and takes many computation cycles. To accelerate it, this paper partitions the whole network into process blocks, each consisting of an Expand layer and a Squeeze layer. Because every process block ends with a Squeeze layer, less intermediate data flows between the computing module and memory, which lowers read and write overhead. Exploiting the properties of the activation function, the core computing module adopts early termination of the convolution calculation, supported by three purpose-built units: an effective index survival unit, an effective-index-controlled value fetch unit, and a convolution judgment unit. Together these skip the computation and the calculation cycles that invalid values would otherwise occupy. Experimental results show that the accelerator reduces data flow by 55.38% and reduces the computation and calculation cycles occupied by invalid values by 14.68%.
Authors: CHU Ping (储萍); NI Wei (倪伟) (School of Electronic Science and Applied Physics, Hefei University of Technology, Hefei 230009, China)
Source: Electronic Science and Technology (《电子科技》), 2022, No. 2, pp. 20-26 (7 pages)
Funding: Anhui University Collaborative Innovation Project (PA2019AGXC0127).
Keywords: lightweight deep neural network; SqueezeNet; process block; activation function; early termination of the convolution calculation; effective index; invalid value; calculation cycle
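
Illustrative sketch (not from the paper): the abstract does not say how early termination or the effective index is implemented in hardware. The Python below is only a software analogy under stated assumptions: the activation function is ReLU, input activations are therefore non-negative, zero activations are the "invalid values", and weights are visited positive first and negative last. All function names are hypothetical.

```python
def effective_indices(activations):
    """Positions of non-zero activations (a hypothetical 'effective index' list)."""
    return [i for i, a in enumerate(activations) if a != 0]


def relu_dot_dense(activations, weights, bias=0.0):
    """Reference: ReLU of a full dot product, touching every element."""
    acc = bias + sum(a * w for a, w in zip(activations, weights))
    return max(acc, 0.0)


def relu_dot_early_stop(activations, weights, bias=0.0):
    """ReLU(dot + bias) with zero skipping and early termination.

    Assumes every activation is >= 0 (e.g. it came out of a previous ReLU).
    Returns (output, number of multiply-accumulates actually performed).
    """
    idx = effective_indices(activations)          # skip invalid (zero) inputs
    positive = [i for i in idx if weights[i] > 0]
    negative = [i for i in idx if weights[i] < 0]

    acc, macs = bias, 0
    for i in positive:                            # sum can only grow here
        acc += activations[i] * weights[i]
        macs += 1
    for i in negative:                            # sum can only shrink here
        if acc <= 0:                              # ReLU output is already fixed at 0
            break
        acc += activations[i] * weights[i]
        macs += 1
    return max(acc, 0.0), macs


if __name__ == "__main__":
    acts = [0, 2, 0, 1, 3]
    wts = [5, 1, -4, -3, -2]
    print(relu_dot_dense(acts, wts))       # dense result over 5 products
    print(relu_dot_early_stop(acts, wts))  # same result, fewer products
```

In this toy input the early-stop version performs 2 multiply-accumulates instead of 5 and returns the same ReLU output. The saving depends entirely on the data and says nothing about the 14.68% figure reported for the accelerator.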