Abstract
Rapidly developing neural networks have achieved great success in fields such as target detection. Efficiently and automatically deploying network models on various edge devices through a neural network inference framework is currently an important research direction. To address this problem, this paper designs NN-EdgeBuilder, a neural network inference framework for edge FPGAs. It uses a design space exploration algorithm based on multi-objective Bayesian optimization to fully explore the parallelism factors and quantization bit widths of each network layer, and then invokes high-performance, general-purpose hardware acceleration operators to generate low-latency, low-power neural network accelerators. NN-EdgeBuilder is used to deploy the UltraNet and VGG networks on an Ultra96-V2 FPGA. Compared with the state-of-the-art custom UltraNet accelerator, the generated UltraNet-P1 accelerator improves power consumption and energy efficiency by 17.71% and 21.54%, respectively. Compared with mainstream inference frameworks, the VGG accelerator generated by NN-EdgeBuilder improves energy efficiency by 4.40 times and Digital Signal Processor (DSP) computing efficiency by 50.65%.
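The abstract describes exploring per-layer parallelism factors and quantization bit widths to trade off latency against power. As a minimal illustration of what such a design space and its Pareto trade-off look like, the sketch below runs a toy random search over a hypothetical 4-layer design space and extracts the non-dominated (latency, power) points. The cost models, layer count, and candidate values are illustrative assumptions, not NN-EdgeBuilder's actual models, and the paper's method is multi-objective Bayesian optimization, not random sampling:

```python
import random

# Hypothetical design space (assumptions for illustration only).
LAYERS = 4                      # number of network layers explored
PARALLELISM = [1, 2, 4, 8]      # candidate parallelism factors per layer
BITWIDTHS = [4, 8, 16]          # candidate quantization bit widths per layer

def evaluate(cfg):
    """Return (latency, power) for a per-layer configuration.

    cfg is a list of (parallelism, bitwidth) pairs, one per layer.
    The analytical cost models below are toy placeholders:
    more parallelism lowers latency but raises power; wider
    bit widths raise both.
    """
    latency = sum(100.0 * bw / p for p, bw in cfg)   # work / parallelism
    power = sum(0.5 * p * bw for p, bw in cfg)       # more hardware, more power
    return latency, power

def dominates(a, b):
    """True if point a is no worse than b in both objectives and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the non-dominated (latency, power) points."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

def random_search(n_samples, seed=0):
    """Sample random configurations and return their Pareto front."""
    rng = random.Random(seed)
    evaluated = []
    for _ in range(n_samples):
        cfg = [(rng.choice(PARALLELISM), rng.choice(BITWIDTHS))
               for _ in range(LAYERS)]
        evaluated.append(evaluate(cfg))
    return pareto_front(evaluated)

front = random_search(200)
```

A Bayesian-optimization-based explorer would replace the random sampler with a surrogate model and an acquisition function (e.g. expected hypervolume improvement) so that far fewer hardware evaluations are needed to approximate the same front.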
Authors
ZHANG Meng; ZHANG Yu; ZHANG Jingwei; CAO Xinye; LI He (School of Electronic Science and Engineering, Southeast University, Nanjing 210096, China)
Source
Journal of Electronics & Information Technology
EI
CSCD
Peking University Core Journals
2023, No. 9, pp. 3132-3140 (9 pages)
Funding
Key-Area Research and Development Program of Guangdong Province (2021B1101270006); Natural Science Foundation of Jiangsu Province (BK20201145).
Keywords
Neural network inference framework
Design space exploration
Multi-objective Bayesian optimization
Hardware acceleration operators