Abstract
The performance advantage of deep object detection models stems largely from the feature representation ability of the backbone network, in which down-sampling is a key step for semantic integration. However, existing down-sampling methods rely on small receptive fields, so the sampled features often lack global structural information. To address this, this paper proposes a plug-and-play dual-path down-sampling method (DPDM), which improves the backbone network's support for subsequent detection by attaching an additional large-receptive-field sampling branch. While retaining the conventional small-receptive-field down-sampling operation, DPDM builds an efficiency-conscious large-receptive-field branch that adds structural information to the sampled features. Drawing on the space-to-depth operation, this branch achieves large-receptive-field sampling under a conventional small-kernel setting. The two branches increase sampling diversity but do not by themselves account for coordination between the two types of features; DPDM therefore fuses them using channel concatenation and point-wise convolution. With the currently high-performing YOLO series as the baseline, comparative experiments on three models (YOLOX, YOLOv5, YOLOv6) and multiple datasets verify the method's effectiveness in improving detection accuracy.
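The data flow described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the 2x2 average pooling used as a stand-in for the conventional small-receptive-field branch, the channel ordering in `space_to_depth`, and the hand-set `weights` are all assumptions for illustration. The learned convolutions that a real backbone would apply in each branch are omitted.

```python
def space_to_depth(x, block=2):
    """Move each block x block spatial patch into the channel axis.

    x is a list of C channels, each an H x W nested list.
    Returns C * block * block channels of size (H / block) x (W / block).
    The offset-major channel ordering is one possible convention (assumption).
    """
    h, w = len(x[0]), len(x[0][0])
    assert h % block == 0 and w % block == 0, "H and W must be divisible by block"
    out = []
    for dy in range(block):
        for dx in range(block):
            for ch in x:
                out.append([[ch[i * block + dy][j * block + dx]
                             for j in range(w // block)]
                            for i in range(h // block)])
    return out


def avg_pool_2x2(x):
    """Stand-in for the small-receptive-field branch (assumption)."""
    return [[[(ch[2 * i][2 * j] + ch[2 * i][2 * j + 1]
               + ch[2 * i + 1][2 * j] + ch[2 * i + 1][2 * j + 1]) / 4.0
              for j in range(len(ch[0]) // 2)]
             for i in range(len(ch) // 2)]
            for ch in x]


def pointwise_conv(x, weights):
    """1x1 convolution: each output channel is a per-pixel weighted sum of inputs."""
    h, w = len(x[0]), len(x[0][0])
    assert all(len(row) == len(x) for row in weights), "one weight per input channel"
    return [[[sum(wk * ch[i][j] for wk, ch in zip(row, x))
              for j in range(w)]
             for i in range(h)]
            for row in weights]


def dual_path_downsample(x, weights):
    """Halve the resolution along two paths, concatenate channels, then fuse."""
    small = avg_pool_2x2(x)        # C channels, local view
    large = space_to_depth(x)      # 4C channels, keeps every input pixel
    fused = small + large          # channel concatenation: 5C channels
    return pointwise_conv(fused, weights)


# Hypothetical 1-channel 4x4 input and two hand-set output channels.
feature = [[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]]
fused = dual_path_downsample(feature, [[0.2] * 5, [1, 0, 0, 0, 0]])
print(len(fused), len(fused[0]), len(fused[0][0]))
```

Because space-to-depth folds each 2x2 patch into channels without discarding pixels, a small kernel applied afterwards covers twice the spatial extent per side in the original feature map, which is the sense in which a conventional kernel size yields a larger receptive field.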
Authors
顾正华
刘嘎琼
邵长斌
于化龙
GU Zhenghua; LIU Gaqiong; SHAO Changbin; YU Hualong (College of Computer, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212100, China; Jiangsu Key Laboratory of Media Design and Software Technology (Jiangnan University), Wuxi, Jiangsu 214122, China)
Source
《计算机科学与探索》
CSCD
Peking University Core Journal
2024, Issue 10, pp. 2727-2737 (11 pages)
Journal of Frontiers of Computer Science and Technology
Funding
National Natural Science Foundation of China (62176107).
Keywords
deep learning
deep object detection
multi-scale object detection
down-sampling strategy