Abstract
Object detection algorithms based on convolutional neural networks are developing rapidly, and as their computational complexity grows, so do their demands on device performance and power consumption. To enable such algorithms to be deployed on embedded devices, this study adopts a hardware/software co-design approach, uses an FPGA to accelerate the algorithm in hardware, and proposes a Yolo v3-SPP object detection system for the ZYNQ platform. The system is deployed on the XCZU15EG chip, and its power consumption, hardware resource usage, and performance are analyzed. The network model to be deployed is first optimized and trained on the Pascal VOC 2007 dataset; the trained model is then quantized and compiled with the Vitis AI tool to make it suitable for deployment on the ZYNQ side. To select the best configuration, the impact of each configuration on hardware resources and system performance is explored, and the system is evaluated in terms of power consumption (W), detection speed (FPS), mean average precision (mAP, the mean of the per-class average precision), and output error. The results show that, at a clock frequency of 300 MHz and an input image size of (416, 416), the Yolo v3-SPP and Yolo v3-Tiny network structures reach detection speeds of 38.44 FPS and 177 FPS and mAPs of 80.35% and 68.55%, respectively, with an on-chip power consumption of 21.583 W and a whole-board power consumption of 23.02 W. The proposed system thus meets the requirements of low power consumption, real-time operation, and high detection accuracy for deploying neural network models on embedded devices.
Authors
张丽丽
陈真
刘雨轩
屈乐乐
ZHANG Lili; CHEN Zhen; LIU Yuxuan; QU Lele (College of Electronic and Information Engineering, Shenyang Aerospace University, Shenyang 110000, China)
Source
《光学精密工程》
EI
CAS
CSCD
PKU Core (Peking University core journal list)
2023, No. 4, pp. 543-551 (9 pages)
Optics and Precision Engineering
Funding
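National Natural Science Foundation of China (No. 61671310)
Liaoning Revitalization Talents Program (No. XLYC1907134)
Project of the Education Department of Liaoning Province (No. LJKZ0174)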