Abstract
Feature pyramid networks are widely used in image understanding tasks built on multi-scale feature learning. Recent work on multi-scale feature learning focuses on the interactive fusion of semantic features and detail features. A feature pyramid network supplements multi-scale semantic and detail information by interpolating and summing the features of adjacent layers. However, because of nonlinear operations and convolution layers with different output dimensions, the relationships among levels are complex, and pixel-wise summation is a suboptimal way to fuse them. This paper therefore proposes an object detection method based on a semantic-consistency-supervised feature pyramid network. The model consists of an asymmetric convolution lateral connection module and a multi-scale semantic feature augmentation module. The asymmetric convolution lateral connection module learns feature maps with different receptive fields, which improves how well the features generalize to objects in various poses. The multi-scale semantic feature augmentation module supplements high-level feature maps with low-level information, which improves the detail representation of high-level features. The method also achieves a better trade-off between accuracy and detection performance. Experiments on the MSCOCO benchmark show that the proposed method improves detection average precision by 2.6% without increasing FLOPs.
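The baseline fusion that the abstract calls suboptimal, interpolating a coarser feature map and summing it with the adjacent finer one, can be illustrated with a minimal sketch. This is not the authors' method or code. It assumes single-channel 2D maps stored as nested lists, nearest-neighbor 2x interpolation, and no 1x1 lateral convolution; the function names `upsample_nearest_2x` and `top_down_merge` are hypothetical.

```python
def upsample_nearest_2x(fmap):
    """Nearest-neighbor 2x upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def top_down_merge(coarse, lateral):
    """Upsample the coarser map and add it element-wise to the finer map."""
    up = upsample_nearest_2x(coarse)
    if len(up) != len(lateral) or len(up[0]) != len(lateral[0]):
        raise ValueError("lateral map must be exactly 2x the coarse map")
    return [[u + l for u, l in zip(up_row, lat_row)]
            for up_row, lat_row in zip(up, lateral)]


# Illustrative values only: a 2x2 high-level map and a 4x4 low-level map.
coarse = [[1.0, 2.0],
          [3.0, 4.0]]
lateral = [[0.5] * 4 for _ in range(4)]
merged = top_down_merge(coarse, lateral)
```

Every output pixel here is a plain sum of one upsampled value and one lateral value, with no learned weighting between levels, which is the limitation the proposed modules aim to address.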
Authors
DAI Rui (代睿)
XU Pengyue (徐鹏越)
LI Jie (李洁)
HE Lihuo (何立火)
School of Electronic Engineering, Xidian University, Xi'an 710071, China
Source
Journal of Northwestern Polytechnical University (《西北工业大学学报》)
EI
CAS
CSCD
Peking University Core Journal (北大核心)
2024, No. 5, pp. 959-968 (10 pages)
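Funding
Supported by the National Natural Science Foundation of China (U21A20514, 62276203, 62036007)
and the Key Industry Innovation Chain of Shaanxi Province (2022ZDLGY01-14).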
Keywords
object detection
semantic consistency
feature pyramid network