Abstract
Object detection in aerial images is of great practical importance for efficiently interpreting aerial imagery in applications such as mapping, resource surveys, and urban and rural planning. To address the large variation in object scale, the susceptibility to background interference, and the frequent false and missed detections of tiny objects in UAV aerial images, this paper proposes AirYOLOv7, an aerial image object detection algorithm based on an improved YOLOv7. AirYOLOv7 incorporates a three-dimensional attention mechanism into the feature extraction stage of the original network and a channel attention mechanism into the feature fusion stage, helping the model focus on the key information in an image. Because aerial images contain many tiny objects, the algorithm adds an extra prediction head dedicated to tiny objects and introduces C3STB before each prediction head to strengthen detection across object scales. Because the IoU loss is highly sensitive to positional deviations of tiny objects, the Wasserstein distance is introduced into the original bounding box regression loss to measure the difference between tiny objects, improving the detection of tiny objects. Experimental results show that AirYOLOv7 achieves mAP of 78.65% and 51.79% on the two public optical aerial datasets DOTA and VisDrone, respectively, which is 1.92 and 2.28 percentage points higher than the original YOLOv7, demonstrating the effectiveness of the proposed improvements on optical aerial images.
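The abstract does not give the exact form of the Wasserstein term, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes each box (cx, cy, w, h) is modeled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4), for which the squared 2-Wasserstein distance has a closed form. The normalizing constant `c`, the mixing weight `alpha`, the function names, and the use of plain IoU as a stand-in for the regression loss in YOLOv7 are all illustrative assumptions.

```python
import math

def iou(a, b):
    """Plain IoU for boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union

def wasserstein2_sq(a, b):
    """Squared 2-Wasserstein distance between the Gaussians fitted to two boxes."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
            + ((a[2] - b[2]) / 2) ** 2 + ((a[3] - b[3]) / 2) ** 2)

def nwd(a, b, c=12.8):
    """Normalized Wasserstein distance in (0, 1]; c is a hypothetical scale constant."""
    return math.exp(-math.sqrt(wasserstein2_sq(a, b)) / c)

def blended_box_loss(a, b, alpha=0.5, c=12.8):
    """Hypothetical blend of an IoU-style loss and an NWD loss (alpha is assumed)."""
    return (1 - alpha) * (1 - iou(a, b)) + alpha * (1 - nwd(a, b, c))

if __name__ == "__main__":
    target = (10.0, 10.0, 6.0, 6.0)  # a 6x6 pixel object
    for shift in (1.0, 3.0, 6.0):
        pred = (10.0 + shift, 10.0, 6.0, 6.0)
        print(shift, round(iou(target, pred), 3), round(nwd(target, pred), 3))
```

For a 6x6 box, a horizontal shift of 6 pixels already drives IoU to zero, whereas the Wasserstein-based similarity decays smoothly and stays positive, which is the sensitivity issue the abstract refers to.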
Authors
ZOU Zhentao (邹振涛); LI Zeping (李泽平)
State Key Laboratory of Public Big Data, Guiyang 550025, China; School of Computer Science and Technology, Guizhou University, Guiyang 550025, China
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal (北大核心)
2024, Issue 8, pp. 173-181 (9 pages)
Funding
National Natural Science Foundation of China (61462014).