Abstract
Against the complex backgrounds seen from the perspective of an unmanned aerial vehicle (UAV), most targets to be identified are small and distant, which easily leads to missed and false detections. To achieve high-precision recognition of pedestrians and vehicles from the UAV perspective, the ST-YOLOv7 algorithm, built on the YOLOv7 network model, is proposed. A Swin Transformer module is integrated into the backbone network to model the global relationship between the complex background and small targets, and the SENet channel attention mechanism is incorporated to assign different weights to different channel features, strengthening the capture of small-target features. In the head network, the C3 module from YOLOv5 is introduced to increase network depth and receptive field and improve feature-extraction ability, and an additional small-target detection layer is added to further raise small-target recognition accuracy. Experiments show that on a self-built aerial-photography dataset, the ST-YOLOv7 model reaches a recognition accuracy of 83.4% for pedestrians and 89.3% for vehicles, outperforming both the YOLOv5 and YOLOv7 detectors and achieving higher accuracy at a small cost in efficiency.
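The SENet channel attention mentioned in the abstract is a squeeze-and-excitation block: global average pooling summarizes each channel, a small bottleneck network produces a per-channel weight in (0, 1), and the feature map is rescaled channel-wise. The following NumPy sketch is an illustration only; the reduction ratio `r = 4` and the random weights are assumptions, not the paper's settings.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(feat, w1, w2):
    """Squeeze-and-Excitation: reweight channels of a (C, H, W) feature map.

    feat : (C, H, W) input feature map
    w1   : (C//r, C) weights of the reduction (squeeze) FC layer
    w2   : (C, C//r) weights of the expansion (excitation) FC layer
    """
    # Squeeze: global average pooling collapses each channel to one scalar
    z = feat.mean(axis=(1, 2))                  # shape (C,)
    # Excitation: bottleneck FC -> ReLU -> FC -> sigmoid gives per-channel gates
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))   # shape (C,), values in (0, 1)
    # Scale: multiply every channel by its learned attention weight
    return feat * s[:, None, None]

# Toy usage with random weights (hypothetical shapes, for illustration only)
rng = np.random.default_rng(0)
c, r = 8, 4
feat = rng.standard_normal((c, 16, 16))
w1 = rng.standard_normal((c // r, c))
w2 = rng.standard_normal((c, c // r))
out = se_block(feat, w1, w2)
print(out.shape)  # (8, 16, 16)
```

The gating leaves the spatial layout untouched and only amplifies or suppresses whole channels, which is why it helps emphasize the channels that respond to small targets.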
Authors
HAO Bo; GU Jiming; LIU Liwei
(School of Mechanical Engineering and Automation, Northeastern University, Shenyang 110819, China; School of Control Engineering, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China)
Source
Journal of Ordnance Equipment Engineering (《兵器装备工程学报》)
Indexed in: CAS; CSCD; PKU Core (北大核心)
2024, No. 3, pp. 293-298 (6 pages)
Funding
Key Project of the Equipment Advance Research Field Foundation (61409230125); Equipment Advance Research Field Foundation Project (80923020104).