
Dual cross-attention Transformer network for few-shot image semantic segmentation
Abstract: Few-shot semantic segmentation can segment novel classes with only a few annotated examples. To address the problem of insufficient semantic-information mining in existing methods, a few-shot image semantic segmentation method based on a dual cross-attention network is proposed. The method adopts a Transformer structure and uses dual cross-attention modules to learn the long-range dependencies between multi-scale query and support features from both the channel and spatial dimensions. First, a channel cross-attention module is proposed and combined with a position cross-attention module to form the dual cross-attention module. The channel cross-attention module learns the channel-wise semantic interrelationships between the query and support features, while the position cross-attention module captures the long-range contextual correlations between them. Then, stacking multiple dual cross-attention modules provides the query image with multi-scale interaction features containing rich semantic information. Finally, an auxiliary supervision loss is introduced, and the multi-scale interaction features are connected to the decoder via upsampling and residual connections to obtain accurate segmentation results for novel classes. The proposed method achieves an mIoU of 69.9% (1-shot) and 72.4% (5-shot) on PASCAL-5i, and 48.9% (1-shot) and 54.6% (5-shot) on COCO-20i, attaining state-of-the-art segmentation performance compared with mainstream methods.
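The paper's implementation is not reproduced here; as an illustration only, below is a minimal NumPy sketch of the two attention branches the abstract describes. The function names, the scaled dot-product form, and the additive residual fusion are assumptions for exposition, not the authors' actual design:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def position_cross_attention(q_feat, s_feat):
    """Spatial branch: each query location attends over all support
    locations, capturing long-range contextual correlations.
    q_feat, s_feat: (C, H, W) feature maps."""
    C, H, W = q_feat.shape
    Q = q_feat.reshape(C, H * W).T            # (HW, C) queries
    K = s_feat.reshape(C, H * W).T            # (HW, C) keys from support
    attn = softmax(Q @ K.T / np.sqrt(C))      # (HW, HW) spatial affinity
    return (attn @ K).T.reshape(C, H, W)      # aggregate support values

def channel_cross_attention(q_feat, s_feat):
    """Channel branch: query channels attend over support channels,
    modeling channel-wise semantic interrelationships."""
    C, H, W = q_feat.shape
    Q = q_feat.reshape(C, H * W)              # (C, HW)
    K = s_feat.reshape(C, H * W)
    attn = softmax(Q @ K.T / np.sqrt(H * W))  # (C, C) channel affinity
    return (attn @ K).reshape(C, H, W)

def dual_cross_attention(q_feat, s_feat):
    # Assumed fusion: sum both branches with a residual connection.
    return (q_feat
            + position_cross_attention(q_feat, s_feat)
            + channel_cross_attention(q_feat, s_feat))

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 4))  # toy query features
s = rng.standard_normal((8, 4, 4))  # toy support features
out = dual_cross_attention(q, s)
print(out.shape)  # (8, 4, 4)
```

In the paper this interaction is applied at multiple feature scales, and the resulting interaction features are passed to the decoder through upsampling and residual connections.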
Authors: 刘玉 (LIU Yu), 郭迎春 (GUO Yingchun), 朱叶 (ZHU Ye), 于明 (YU Ming) (School of Electronic and Information Engineering, Hebei University of Technology, Tianjin 300401, China; School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401, China)
Source: Chinese Journal of Liquid Crystals and Displays (《液晶与显示》; CAS, CSCD, Peking University core journal), 2024, No. 11, pp. 1494-1505 (12 pages)
Funding: National Natural Science Foundation of China Youth Program (No.62102129); National Natural Science Foundation of China General Program (No.62276088); Natural Science Foundation of Hebei Province (No.F2021202030, No.F2019202381, No.F2019202464)
Keywords: few-shot semantic segmentation; Transformer architecture; channel cross-attention; dual cross-attention; auxiliary loss