Abstract
To address the poor small-object detection performance caused by insufficient semantic information in the shallow features of the Single-Shot Detection (SSD) feature pyramid, an SSD object detection algorithm based on residual learning and cyclic attention is proposed. First, the backbone network adopts ResNet-101, which has stronger learning capacity, to extract effective feature information. Then, a lightweight one-way feature fusion block fuses the deep and shallow feature layers of the original feature pyramid and generates a new feature pyramid, enriching the semantic information of the effective feature layers used for prediction. Finally, a new spatial pooling strategy is proposed and combined with the skip connections of residual networks to form a cyclic attention module, which introduces global contextual information and establishes global dependencies for local features. To address the imbalance between hard and easy samples, Focal Loss is used as the regression loss function. Experimental results show that the algorithm achieves a mean average precision (mAP) of 79.7% on the PASCAL VOC public dataset, 2.5 percentage points higher than SSD, and an mAP of 30.0% on the MS COCO public dataset, 4.9 percentage points higher than SSD.
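The abstract does not give the loss formulation, so the following is only a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), to illustrate how it down-weights easy samples. The function names, the per-sample scalar form, and the defaults gamma = 2 and alpha = 0.25 are assumptions taken from the common formulation, not from the paper. The abstract describes Focal Loss as the regression loss, while the standard form shown here acts on classification probabilities; how the authors adapt it is not stated.

```python
import math


def cross_entropy(p, y):
    """Plain binary cross-entropy for one prediction (baseline for comparison).

    p: predicted probability of the positive class, strictly between 0 and 1
    y: ground-truth label, 1 (positive) or 0 (negative)
    """
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Standard binary focal loss for one prediction (hypothetical helper).

    gamma: focusing parameter; larger values shrink the loss of easy samples more
    alpha: class-balancing weight for the positive class (assumed default)
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma is close to 0 for well-classified samples,
    # so easy samples contribute very little to the total loss.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (confident, correct) versus a hard positive (confident, wrong).
easy = focal_loss(0.95, 1)
hard = focal_loss(0.10, 1)
```

With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a convenient sanity check for any implementation.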
Authors
贾天豪
彭力
JIA Tianhao; PENG Li (Engineering Research Center of Internet of Things Technology Applications, School of IoT Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China)
Source
《计算机科学》
CSCD
Peking University Core Journal
2023, No. 5, pp. 170-176 (7 pages)
Computer Science
Funding
National Natural Science Foundation of China (61873112, 61802107)
Taizhou Development and Reform Commission Fund Project (2106-331000-04-04-295510).
Keywords
Object detection
Residual learning
Deep learning
Attention mechanism
Feature fusion