High Performance YOLOv5: Research on High Performance Target Detection Algorithm for Embedded Platform (cited by: 2)
Abstract: To address the unbalanced overall performance of current deep learning single-stage detection algorithms and the difficulty of deploying them on embedded devices, this paper proposes a high-performance object detection algorithm for embedded platforms. Building on the You Only Look Once v5 (YOLOv5) network, the improved algorithm first lightens the backbone: a purpose-designed space stem block replaces the original Focus module, an improved ShuffleNetv2 replaces the original Cross Stage Partial Darknet, and the kernel size of Spatial Pyramid Pooling (SPP) is reduced. Second, the neck adopts an Enhanced Path Aggregation Network (EPAN), designed on the basis of the Path Aggregation Network (PAN), and adds a P6 output layer for large objects, which improves the network's feature extraction ability. Third, the original detection head is replaced by an Adaptive-Atrous Spatial Feature Fusion (A-ASFF) head, designed on the basis of Adaptive Spatial Feature Fusion (ASFF), which addresses object scale variation and substantially improves detection accuracy at a small additional cost. Finally, the Efficient Intersection over Union (EIoU) loss replaces the Complete Intersection over Union (CIoU) loss, and the Sigmoid-weighted Linear Unit (SiLU) replaces the HardSwish activation function, improving the model's overall performance. Experimental results show that, compared with YOLOv5-S, the same-size version of the proposed algorithm improves mAP@.5 and mAP@.5:.95 by 4.6% and 6.3%, respectively, while reducing the number of parameters by 43.5% and the computational complexity by 12.0%. In speed evaluations on the Jetson Nano platform using the original model and the TensorRT-accelerated model, inference latency is reduced by 8.1% and 9.8%, respectively. On overall metrics the proposed algorithm outperforms many strong object detection networks, is better suited to embedded platforms, and is of practical value.
Authors: LIU Qiaoshou (刘乔寿); ZHAO Zhiyuan (赵志源); WANG Juncheng (王均成); PI Shengwen (皮胜文) (School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; Advanced Network and Intelligent Connection Technology Key Laboratory of Chongqing Education Commission of China, Chongqing 400065, China; Chongqing Key Laboratory of Ubiquitous Sensing and Networking, Chongqing 400065, China)
Source: Journal of Electronics & Information Technology (《电子与信息学报》; indexed in EI, CSCD, and the Peking University Core Journals list), 2023, No. 6, pp. 2205-2215 (11 pages)
Keywords: Object detection; YOLOv5; ShuffleNetv2; Adaptive Spatial Feature Fusion (ASFF); Embedded device; TensorRT acceleration
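
The abstract names two function-level substitutions: EIoU in place of CIoU as the box regression loss, and SiLU in place of HardSwish as the activation. The sketch below illustrates both using the commonly published definitions; it is not the authors' implementation. It works on plain Python scalars for readability, and the function names and the `(x1, y1, x2, y2)` box layout are assumptions made for this example.

```python
import math


def silu(x: float) -> float:
    """SiLU: x * sigmoid(x), written to avoid overflow for very negative x."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def hardswish(x: float) -> float:
    """HardSwish: x * ReLU6(x + 3) / 6, a piecewise-linear stand-in for SiLU."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def eiou_loss(box, gt, eps: float = 1e-9) -> float:
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed form: 1 - IoU + center term + width term + height term,
    each normalised by the smallest enclosing box.
    """
    w, h = box[2] - box[0], box[3] - box[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]

    # Intersection over union.
    iw = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    ih = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both inputs.
    cw = max(box[2], gt[2]) - min(box[0], gt[0])
    ch = max(box[3], gt[3]) - min(box[1], gt[1])

    # Squared distance between box centers over the squared enclosing diagonal.
    dx = ((box[0] + box[2]) - (gt[0] + gt[2])) / 2.0
    dy = ((box[1] + box[3]) - (gt[1] + gt[3])) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # EIoU penalises width and height differences directly,
    # where CIoU uses a single aspect-ratio term instead.
    width_term = (w - gw) ** 2 / (cw * cw + eps)
    height_term = (h - gh) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term
```

For identical boxes every term vanishes and the loss is approximately zero. For non-overlapping boxes the IoU term alone contributes 1, and the center, width, and height terms still provide a signal that shrinks as the predicted box approaches the ground truth.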