Abstract
To address the limited ability of the YOLOv8 model to detect occluded and weak, small infrared targets under low signal-to-noise ratios and in complex task scenarios, an improved detection method, DCS-YOLOv8 (DCN_C2f-CA-SIoU-YOLOv8), is proposed. Building on the YOLOv8 framework, the backbone network incorporates a lightweight DCN_C2f module based on deformable convolution (Deformable Convolution Network, DCN), which adaptively adjusts the network's receptive field and strengthens the multi-scale feature representation of targets. The feature fusion network introduces a module based on the coordinate attention (CA) mechanism, which captures spatial position dependencies among multiple targets and thereby improves localization accuracy. The position regression loss function is improved on the basis of SIoU (Scylla IoU) so that the relative displacement direction between the predicted box and the ground-truth box is matched, which accelerates model convergence and improves detection and localization accuracy. Experimental results show that, compared with the YOLOv8-n/s/m/l/x series of models, DCS-YOLOv8 raises the mean average precision mAP@0.5 on the FLIR, OTCBVS, and VEDAI test sets by an average of 6.8%, 0.6%, and 4.0%, reaching 86.5%, 99.0%, and 75.6%, respectively. The model's inference speed also meets the real-time requirements of infrared target detection tasks.
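The direction-matching idea behind the SIoU regression loss can be illustrated with a minimal sketch for a single pair of boxes. It follows the commonly published SIoU formulation (angle, distance, and shape costs added to 1 - IoU) and is not the authors' modified loss, whose details are not given in this abstract. The function name `siou_loss`, the `(x1, y1, x2, y2)` box layout, the shape exponent `theta=4.0`, and the `eps` guard are all assumptions made for illustration.

```python
import math


def siou_loss(pred, target, theta=4.0, eps=1e-9):
    """Sketch of an SIoU-style loss for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper: follows the generic SIoU formulation, not the
    paper's own variant. theta and eps are illustrative assumptions.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU of the two boxes.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    iou = inter / (pw * ph + tw * th - inter + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Offset between box centres and its direction.
    dx = (tx1 + tx2 - px1 - px2) * 0.5
    dy = (ty1 + ty2 - py1 - py2) * 0.5
    sigma = math.hypot(dx, dy) + eps
    sin_alpha = min(1.0, abs(dy) / sigma)

    # Angle cost: sin(2*alpha), zero when the offset lies along an axis
    # and largest when it points along the diagonal.
    angle_cost = math.sin(2.0 * math.asin(sin_alpha))

    # Distance cost, modulated by the angle cost through gamma.
    gamma = 2.0 - angle_cost
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1.0 - math.exp(-gamma * rho_x)) + (1.0 - math.exp(-gamma * rho_y))

    # Shape cost: penalises mismatch in width and height.
    omega_w = abs(pw - tw) / (max(pw, tw) + eps)
    omega_h = abs(ph - th) / (max(ph, th) + eps)
    shape_cost = (1.0 - math.exp(-omega_w)) ** theta + (1.0 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + 0.5 * (distance_cost + shape_cost)


if __name__ == "__main__":
    gt = (0.0, 0.0, 10.0, 10.0)
    print(siou_loss((1.0, 1.0, 11.0, 11.0), gt))    # slightly shifted prediction
    print(siou_loss((20.0, 20.0, 30.0, 30.0), gt))  # distant, non-overlapping prediction
```

In this sketch a prediction that coincides with the ground-truth box gives a loss close to zero, and the loss grows as the centre offset or the size mismatch increases.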
Authors
SHEN Lingyun (沈凌云); LANG Baihe (郎百和); SONG Zhengxun (宋正勋); WEN Zhitao (温智滔)
(Department of Electronic Engineering, Taiyuan Institute of Technology, Taiyuan 030008, China; Sch. of Elec. and Info. Engineering, Changchun University of Science and Technology, Changchun 130022, China; Overseas Expertise Introduction Project for Discipline Innovation D17017, Changchun 130022, China)
Source
Infrared Technology (《红外技术》)
CSCD
Peking University Core Journals (北大核心)
2024, Issue 5, pp. 565-575 (11 pages)
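Funding
Shanxi Province Science and Technology Innovation Start-up Fund for Introduced Talents (21010123)
Shanxi Province Undergraduate Innovation Project for Higher Education Institutions (S202314101195)
Jilin Province Science and Technology Development Plan Project (YDZJ202102CXJD007)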
Keywords
infrared images
object detection
attention mechanism
deformable convolution
multi-scale features