Abstract
Small object detection remains a difficult problem in object detection, because small objects carry few features and those features are further lost as the network deepens. To address this, a model that optimizes small object detection through multiple feature enhancements built into the network structure was proposed. Firstly, Spatial Pyramid Pooling (SPP) in the backbone network was replaced to optimize gradient computation. Secondly, multi-level bidirectional fusion that distinguishes feature levels was applied in the network neck, and an Adaptive Feature Fusion (AFF) module was added to the output head, to achieve multi-level feature enhancement. Experimental results show that on the COCO2017-val dataset, at an Intersection over Union (IoU) threshold of 0.5, the mean Average Precision (mAP) of the proposed model reaches 61.4%, which is 4.7 percentage points higher than that of the currently popular YOLOv7 model. Meanwhile, the detection frame rate of the proposed model on a single GPU is 78.2 frame/s, which meets the speed requirements of industrial detection.
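The IoU threshold of 0.5 used in the reported metric can be illustrated with a minimal sketch. This is not the authors' evaluation code; the function names `iou` and `is_match` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration. It shows how a predicted box is compared with a ground-truth box, and why a shift of a few pixels matters more for a small object than for a large one.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed layout).
    """
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, gt, threshold=0.5):
    """A prediction counts as a match when IoU reaches the threshold."""
    return iou(pred, gt) >= threshold


# The same 2-pixel diagonal shift applied to a small and a large box.
small_gt, small_pred = (0, 0, 8, 8), (2, 2, 10, 10)
large_gt, large_pred = (0, 0, 80, 80), (2, 2, 82, 82)

print(is_match(small_pred, small_gt))  # small box: IoU = 36/92, below 0.5
print(is_match(large_pred, large_gt))  # large box: IoU = 6084/6716, above 0.5
```

In this sketch the 8x8 box fails the 0.5 threshold after a 2-pixel shift while the 80x80 box passes easily, which is one reason localization errors penalize small objects more heavily.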
Authors
PAN Yexin (潘烨新)
YANG Zhe (杨哲)
School of Computer Science & Technology, Soochow University, Suzhou, Jiangsu 215006, China; Jiangsu Provincial Key Laboratory for Computer Information Processing Technology (Soochow University), Suzhou, Jiangsu 215006, China
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journal
2024, Issue 9, pp. 2871-2877 (7 pages)
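Funding
National Natural Science Foundation of China (62002253)
Ministry of Education Industry-University Cooperative Education Project (220606363154256)
National Undergraduate Innovation and Entrepreneurship Training Program (202210285042Z)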
Keywords
deep learning
small object
object detection
computer vision
feature fusion