Abstract
Aiming at the problem of RGB-D salient object detection, an RGB-D salient object detection method based on pyramid spatial-constrained self-mutual attention is proposed. First, a spatial-constrained self-mutual attention module is introduced to learn multi-modal feature representations with spatial context awareness by exploiting the complementarity of multi-modal features. Meanwhile, the pairwise relationships between query positions and their surrounding areas are computed for both modalities to integrate self-attention and mutual attention, and the contextual features of the two modalities are thereby aggregated. Then, to obtain more complementary information, a pyramid structure is applied to a set of spatial-constrained self-mutual attention modules, adapting to features with different receptive fields under different spatial constraints and learning both local and global feature representations. Finally, the multi-modal fusion module is embedded into a two-branch encoder-decoder network to solve the RGB-D salient object detection problem. Experiments on four benchmark datasets demonstrate the strong competitiveness of the proposed method on the RGB-D salient object detection task.
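The core mechanism described above can be illustrated with a minimal sketch: affinities between a query position and its neighbours are computed both within the RGB modality (self-attention) and across to the depth modality (mutual attention), summed, restricted to a local window (the spatial constraint), and used to aggregate context from both modalities. This is a simplified numpy illustration under stated assumptions, not the authors' implementation: learned query/key/value projections are omitted, the window is a Chebyshev-radius mask, and the function name and parameters are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_constrained_self_mutual_attention(x_rgb, x_d, h, w, radius):
    """Simplified sketch (hypothetical interface).

    x_rgb, x_d : (n, c) flattened feature maps of the two modalities, n = h * w
    radius     : spatial constraint; only positions within this Chebyshev
                 distance of the query may attend (the local window)
    """
    n, c = x_rgb.shape
    # pairwise affinities: same-modality (self) plus cross-modality (mutual)
    a_self = x_rgb @ x_rgb.T / np.sqrt(c)
    a_mutual = x_rgb @ x_d.T / np.sqrt(c)
    # spatial constraint: mask out positions outside the (2*radius+1)^2 window
    ys, xs = np.divmod(np.arange(n), w)
    mask = (np.abs(ys[:, None] - ys[None, :]) <= radius) & \
           (np.abs(xs[:, None] - xs[None, :]) <= radius)
    logits = np.where(mask, a_self + a_mutual, -np.inf)
    attn = softmax(logits, axis=-1)
    # aggregate contextual features from both modalities
    return attn @ (x_rgb + x_d)
```

A pyramid of such modules, as the abstract describes, would apply this operation with a set of increasing radii (e.g. local windows up to the full map) and combine the outputs, so that small radii capture local context and large radii capture global context.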
Authors
YUAN Xiao, XIAO Yun, JIANG Bo, TANG Jin
(Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei 230601; School of Artificial Intelligence, Anhui University, Hefei 230601; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei 230088)
Source
《模式识别与人工智能》
EI
CSCD
Peking University Core Journal
2022, No. 6, pp. 526-535 (10 pages)
Pattern Recognition and Artificial Intelligence
Funding
Supported by National Natural Science Foundation of China (No. 62076004, 62006002), Youth Program of Natural Science Foundation of Anhui Province (No. 1908085QF264), and University Collaborative Innovation Program of Anhui Province (No. GXXT-2020-013).