期刊文献+
共找到1篇文章
< 1 >
每页显示 20 50 100
FPGA Optimized Accelerator of DCNN with Fast Data Readout and Multiplier Sharing Strategy 被引量:1
1
作者 Tuo Ma Zhiwei Li +3 位作者 Qingjiang Li Haijun Liu zhongjin zhao Yinan Wang 《Computers, Materials & Continua》 SCIE EI 2023年第12期3237-3263,共27页
With the continuous development of deep learning,Deep Convolutional Neural Network(DCNN)has attracted wide attention in the industry due to its high accuracy in image classification.Compared with other DCNN hard-ware ... With the continuous development of deep learning,Deep Convolutional Neural Network(DCNN)has attracted wide attention in the industry due to its high accuracy in image classification.Compared with other DCNN hard-ware deployment platforms,Field Programmable Gate Array(FPGA)has the advantages of being programmable,low power consumption,parallelism,and low cost.However,the enormous amount of calculation of DCNN and the limited logic capacity of FPGA restrict the energy efficiency of the DCNN accelerator.The traditional sequential sliding window method can improve the throughput of the DCNN accelerator by data multiplexing,but this method’s data multiplexing rate is low because it repeatedly reads the data between rows.This paper proposes a fast data readout strategy via the circular sliding window data reading method,it can improve the multiplexing rate of data between rows by optimizing the memory access order of input data.In addition,the multiplication bit width of the DCNN accelerator is much smaller than that of the Digital Signal Processing(DSP)on the FPGA,which means that there will be a waste of resources if a multiplication uses a single DSP.A multiplier sharing strategy is proposed,the multiplier of the accelerator is customized so that a single DSP block can complete multiple groups of 4,6,and 8-bit signed multiplication in parallel.Finally,based on two strategies of appeal,an FPGA optimized accelerator is proposed.The accelerator is customized by Verilog language and deployed on Xilinx VCU118.When the accelerator recognizes the CIRFAR-10 dataset,its energy efficiency is 39.98 GOPS/W,which provides 1.73×speedup energy efficiency over previous DCNN FPGA accelerators.When the accelerator recognizes the IMAGENET dataset,its energy efficiency is 41.12 GOPS/W,which shows 1.28×−3.14×energy efficiency compared with others. 展开更多
关键词 FPGA ACCELERATOR DCNN fast data readout strategy multiplier sharing strategy network quantization energy efficient
下载PDF
上一页 1 下一页 到第
使用帮助 返回顶部