SIS: A new multi-scale convolutional operator


Abstract: Visual features that generalize well are critical for computer vision tasks. Approaches based on deep neural networks typically obtain multi-scale feature maps by stacking features layer by layer, which significantly increases computational overhead. To address this issue, we present a lightweight and efficient scale-in-scale (SIS) convolution operator that embeds a progressive multi-scale architecture into a standard convolution operator. More precisely, the operator uses a channel transform-divide-and-conquer technique to optimize conventional channel-wise computation, lowering the computational cost while expanding the receptive field within a single convolution layer. The SIS operator further incorporates weight sharing with split-and-interact and recur-and-fuse mechanisms to enable variant designs. The resulting SIS series can be plugged into convolutional backbones such as the well-known ResNet and Res2Net. We incorporated the SIS operator series into 29-layer, 50-layer, and 101-layer ResNet and Res2Net variants and evaluated the modified models on the widely used CIFAR, PASCAL VOC, and COCO2017 benchmark datasets, where they consistently outperformed state-of-the-art models on major vision tasks, including image classification, key point estimation, semantic segmentation, and object detection.
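The abstract's claim that splitting channels can lower cost while enlarging the receptive field inside one layer can be illustrated with simple arithmetic. The sketch below is not the SIS operator itself, whose exact structure the abstract does not specify. It assumes a Res2Net-style hierarchy in which the channels are divided into groups, the first group is passed through, and each later group applies one stride-1 convolution to the previous group's output. The function names and the `shared` flag (one set of weights reused by every group) are illustrative assumptions.

```python
def receptive_fields(num_splits, kernel=3):
    """Receptive field seen by each channel group when group i's conv
    consumes the output of group i-1 (hierarchical, Res2Net-style)."""
    # Group 0 is passed through unchanged (receptive field 1);
    # each further group stacks one more kernel x kernel, stride-1 conv.
    return [1 + i * (kernel - 1) for i in range(num_splits)]


def conv_params(channels, kernel=3):
    """Weights in one dense kernel x kernel conv, channels -> channels (no bias)."""
    return channels * channels * kernel * kernel


def split_conv_params(channels, num_splits, kernel=3, shared=False):
    """Weights when channels are divided into num_splits groups and every
    group except the first gets its own (or one shared) small conv."""
    assert channels % num_splits == 0, "channels must divide evenly"
    width = channels // num_splits
    # With weight sharing (an assumption here), one small conv is reused.
    num_convs = 1 if shared else num_splits - 1
    return num_convs * conv_params(width, kernel)


print(receptive_fields(4))
print(conv_params(64), split_conv_params(64, 4), split_conv_params(64, 4, shared=True))
```

Under these assumptions, a 64-channel layer split four ways covers receptive fields of 1, 3, 5, and 7 pixels within a single layer while using fewer weights than one dense 3x3 convolution.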

Authors: Man Zhou; Xueyang Fu; Aiping Liu (School of Information Science and Technology, University of Science and Technology of China, Hefei 230027, China)

Affiliation: School of Information Science and Technology, University of Science and Technology of China

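Source: Journal of University of Science and Technology of China (JUSTC); indexed in CAS, CSCD, and the Peking University Core Journals list; 2022, No. 4, pp. 56-65, I0003 (11 pages in total)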

Funding: Supported in part by the USTC Research Funds of the Double First-Class Initiative (YD2100002003, YD2100002004).

Keywords: multi-scale convolutional operator; image classification; key point estimation; semantic segmentation; object detection

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

