Abstract
To address the high miss rate in UAV imagery caused by drastic variation in target scale, complex backgrounds, and small, densely packed targets, an improved real-time object detection model based on YOLOv5s is proposed. First, a new hybrid attention mechanism is embedded in the backbone network to strengthen the extraction of key target information. Second, a new dense residual pyramid pooling module is constructed to improve the network's information fusion capability while reducing computation. Third, a C3-BoT module based on multi-head self-attention is designed to capture the global contextual information of UAV images. Finally, in view of the characteristics of UAV images, a detection layer for very small targets is added to the YOLOv5s network to reduce the miss rate for small targets. In experiments on the VisDrone2019 dataset, the improved model reaches an mAP@0.5 of 40.6%, 8.1 percentage points higher than the YOLOv5s baseline, achieving better detection performance on UAV aerial image detection tasks.
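As a rough illustration of why an extra detection layer can help with very small targets, the sketch below compares the prediction grids of a three-head and a four-head detector. The stock YOLOv5 heads use strides 8, 16, and 32; the stride-4 head, the 640-pixel input size, and the 10-pixel example object are assumptions made for illustration and are not stated in the abstract. The last lines simply derive the baseline mAP@0.5 implied by the reported figures.

```python
# Illustrative sketch only: the stride-4 head, the 640 px input and the
# 10 px object are assumptions, not details taken from the paper.

def grid_sizes(input_size, strides):
    """Map each stride to (cells per side, total cells) for a square input."""
    return {s: (input_size // s, (input_size // s) ** 2) for s in strides}

BASELINE_STRIDES = (8, 16, 32)     # stock YOLOv5 detection heads (P3, P4, P5)
EXTENDED_STRIDES = (4, 8, 16, 32)  # assumed: one extra head at stride 4 (P2)
INPUT_SIZE = 640                   # assumed input resolution in pixels
OBJECT_SIZE = 10                   # hypothetical small object width in pixels

baseline = grid_sizes(INPUT_SIZE, BASELINE_STRIDES)
extended = grid_sizes(INPUT_SIZE, EXTENDED_STRIDES)

for stride, (side, cells) in extended.items():
    # How many grid cells the hypothetical object spans along one axis.
    span = OBJECT_SIZE / stride
    print(f"stride {stride:2d}: {side}x{side} grid, {cells} cells, "
          f"object spans {span:.2f} cells")

# Baseline mAP@0.5 implied by the abstract: 40.6% minus 8.1 percentage points.
improved_map50 = 40.6
gain_points = 8.1
baseline_map50 = round(improved_map50 - gain_points, 1)
print(f"implied YOLOv5s baseline mAP@0.5: {baseline_map50}%")
```

Under these assumptions the finest grid quadruples in cell count, so a 10-pixel object spans more than one cell per axis rather than barely one. The added computation at that resolution is the usual cost of such a head.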
Authors
NING Tao (宁涛); FU Shimo (付世沫); CHANG Qing (常青); WANG Yaoli (王耀力)
School of Information and Computer, Taiyuan University of Technology, Taiyuan 030000, China; Taiyuan Water Supply Design and Research Institute Co., Ltd., Taiyuan 030000, China
Source
Electronics Optics & Control (《电光与控制》)
CSCD
Peking University Core Journal (北大核心)
2024, No. 12, pp. 41-47, 63 (8 pages in total)
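Funding
Natural Science Foundation of Shanxi Province (201801D121141)
Shanxi Province Research and Development Project (201903D321003)
Taiyuan Water Supply Design and Research Institute Co., Ltd. Project (RH2000005391)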