
Design of a Deep Convolutional Neural Network Accelerator Based on Low-Cost FPGA
Abstract: Existing deep convolutional neural networks (DCNNs) generate a large amount of inter-layer feature data during inference. To maintain real-time processing on embedded systems, a significant amount of on-chip storage is required to cache inter-layer feature maps. This paper proposes an inter-layer feature compression technique that significantly reduces off-chip memory access bandwidth. In addition, a generic convolution computation scheme tailored to the characteristics of BRAM in FPGAs is proposed, with circuit-level optimizations that both reduce the number of memory accesses and improve DSP computational efficiency, thereby greatly increasing computation speed. Compared to running MobileNetV2 on a CPU, the proposed DCNN accelerator achieves a 6.3x performance improvement; compared to other DCNN accelerators of the same type, it achieves DSP performance-efficiency improvements of 17% and 156%, respectively.
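The abstract names an inter-layer feature compression technique but gives no detail. One common approach for this problem is zero run-length encoding of ReLU activations, which are typically sparse, so that only (zero-run, value) pairs cross the off-chip memory interface. The sketch below is an illustrative assumption of that general idea, not the paper's actual scheme; the function names and the `max_run` parameter are hypothetical.

```python
import numpy as np

def zero_rle_encode(fmap, max_run=255):
    """Encode a quantized feature map as (zero_run, value) pairs.

    ReLU leaves many zeros in inter-layer feature maps, so run-length
    encoding the zero gaps shrinks the data written to off-chip DRAM.
    max_run caps the run length so it fits a fixed-width hardware field.
    """
    pairs = []
    run = 0
    for v in fmap.ravel():
        if v == 0 and run < max_run:
            run += 1
        else:
            pairs.append((run, int(v)))  # run zeros, then one value
            run = 0
    if run:
        pairs.append((run, 0))  # flush trailing zeros with a 0 sentinel value
    return pairs

def zero_rle_decode(pairs, shape):
    """Expand (zero_run, value) pairs back into a feature map of `shape`."""
    out = []
    for run, v in pairs:
        out.extend([0] * run)
        out.append(v)
    # Drop the possible extra sentinel element, then restore the shape.
    flat = np.array(out[: int(np.prod(shape))], dtype=np.int32)
    return flat.reshape(shape)
```

For a feature map with high zero density, the pair stream is much shorter than the raw map; a decoder on the FPGA side would expand it on the fly before the next layer's computation.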
Authors: Yang Tong; Xiao Hao (School of Microelectronics, Hefei University of Technology, Hefei 230601, China)
Source: Electronic Measurement Technology (《电子测量技术》), Peking University Core Journal, 2024, No. 10, pp. 184-190 (7 pages)
Funding: Supported by the National Natural Science Foundation of China (61974039).
Keywords: deep convolutional neural network; field programmable gate array; deep learning