Abstract
This paper presents a network accelerator built on a Field Programmable Gate Array (FPGA) platform for lightweight convolutional networks represented by MobileNet. By optimizing the DW (depthwise) and PW (pointwise) lightweight modules and implementing common functional modules such as convolution and ReLU, the accelerator meets the low-power and low-latency requirements of neural network acceleration. An instruction-based design also lets the accelerator support MobileNet and its variants. Object detection experiments were run with the host computer configuring YoloV3 tiny instructions (without lightweight modules) and YoloV3&MobileNet instructions (with lightweight modules). The results show that the accelerator achieves fast inference: 85 frame/s for the YoloV3 tiny structure and 62 frame/s for the YoloV3&MobileNet structure.
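For readers unfamiliar with the DW and PW modules named in the abstract, the following is a minimal software sketch of a depthwise convolution followed by a pointwise (1x1) convolution and ReLU, as used in MobileNet-style networks. It is a plain Python reference model for illustration only, not the paper's FPGA design; the function names, the tensor layout (`[channel][row][column]`), stride 1, and the absence of padding are all assumptions made here.

```python
def depthwise_conv(x, kernels):
    """Depthwise (DW) convolution: one K x K kernel per input channel.

    x:       nested list shaped [C][H][W]
    kernels: nested list shaped [C][K][K]
    Assumes stride 1 and no padding (illustrative choices).
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    k = len(kernels[0])
    out_h, out_w = h - k + 1, w - k + 1
    out = [[[0.0] * out_w for _ in range(out_h)] for _ in range(c)]
    for ch in range(c):
        for i in range(out_h):
            for j in range(out_w):
                acc = 0.0
                for di in range(k):
                    for dj in range(k):
                        acc += x[ch][i + di][j + dj] * kernels[ch][di][dj]
                out[ch][i][j] = acc
    return out


def pointwise_conv(x, weights):
    """Pointwise (PW) convolution: a 1 x 1 convolution that mixes channels.

    x:       nested list shaped [C_in][H][W]
    weights: nested list shaped [C_out][C_in]
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    return [
        [[sum(row_w[ch] * x[ch][i][j] for ch in range(c_in)) for j in range(w)]
         for i in range(h)]
        for row_w in weights
    ]


def relu(x):
    """Element-wise ReLU over a [C][H][W] nested list."""
    return [[[max(0.0, v) for v in row] for row in plane] for plane in x]


def mac_counts(c_in, c_out, k, out_h, out_w):
    """Multiply-accumulate counts: standard conv vs. DW followed by PW."""
    standard = c_in * c_out * k * k * out_h * out_w
    separable = c_in * k * k * out_h * out_w + c_in * c_out * out_h * out_w
    return standard, separable


if __name__ == "__main__":
    # Two 3x3 input channels of ones, 3x3 all-ones depthwise kernels.
    x = [[[1, 1, 1] for _ in range(3)] for _ in range(2)]
    dw_kernels = [[[1, 1, 1] for _ in range(3)] for _ in range(2)]
    pw_weights = [[1.0, -1.0], [0.5, 0.5]]
    y = relu(pointwise_conv(depthwise_conv(x, dw_kernels), pw_weights))
    print(y)
    print(mac_counts(32, 64, 3, 16, 16))
```

The `mac_counts` helper shows why this factorization suits low-power hardware: for a K x K kernel the DW+PW pair needs roughly 1/C_out + 1/K² of the multiply-accumulates of a standard convolution with the same input and output shapes.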
Authors
黄瑞
金光浩
李磊
姜文超
宋庆增
HUANG Rui; JIN Guanghao; LI Lei; JIANG Wenchao; SONG Qingzeng (School of Computer Science and Technology, Tianjin Polytechnic University, Tianjin 300387, China; Faculty of Computer, Guangdong University of Technology, Guangzhou 510006, China)
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
Peking University Core Journal
2021, No. 9, pp. 185-190, 196 (7 pages)
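Funding
Natural Science Foundation of Guangdong Province (2018A030313061)
Science and Technology Planning Project of Guangdong Province (2017B010124001, 201902020016, 2019B010139001).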
Keywords
hardware acceleration
model compression
lightweight neural network
Field Programmable Gate Array (FPGA)
parallel computing