

RGB-D Salient Object Detection Based on Cross-Modal Fusion and Boundary Deformable Convolution Guidance
Abstract: RGB-Depth (RGB-D) salient object detection (SOD) is a meaningful and challenging task. Existing methods based on convolutional neural networks achieve good detection performance in simple scenes, but cannot effectively handle scenes with cluttered background information, low-quality depth maps, and complex object contours. To address these problems, this paper proposes an RGB-D SOD model based on cross-modal fusion and boundary deformable convolution guidance. First, the Swin Transformer is used as the feature extractor for the RGB modality and the depth modality, respectively, and the two modalities are fused by a cross-modal attention enhancement feature (CMAEF) module to explore the common and complementary features of salient objects. Then, the proposed adjacent multi-scale feature enhancement (AMFE) module is embedded into the deep layers of the encoder to obtain rich global contextual feature information, which locates salient objects more accurately. Next, a boundary feature extraction decoder with a U-Net architecture is constructed to generate boundary cue maps of salient objects, and the cross-modal fusion features are reused to ensure the integrity of the generated boundaries. Finally, a boundary deformable convolution guidance (BDCG) module is designed, which uses the boundary cue maps together with deformable convolution to guide the decoding of the cross-modal fusion features and obtain more accurate saliency maps. Comprehensive experiments on six popular benchmark datasets against 25 mainstream methods show clear improvements on multiple metrics, demonstrating the effectiveness of the proposed model.
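The cross-modal fusion step described above (re-weighting RGB features by an attention signal derived from the depth modality, then mixing the two) can be sketched in miniature as follows. This is a toy illustration only: the function name `cmaef_fuse`, the sigmoid gate, and the additive mixing are assumptions for exposition, not the paper's actual CMAEF design.

```python
import math


def sigmoid(x):
    """Logistic gate mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def cmaef_fuse(rgb_feat, depth_feat):
    """Toy cross-modal attention fusion (hypothetical sketch).

    rgb_feat, depth_feat: H x W single-channel feature maps as nested
    lists. A spatial attention map is derived from the depth features
    via a sigmoid gate; the RGB features are re-weighted by it, and the
    depth features are added back so that common and complementary cues
    from both modalities survive in the fused map.
    """
    h, w = len(rgb_feat), len(rgb_feat[0])
    fused = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            attn = sigmoid(depth_feat[i][j])  # depth-derived gate in (0, 1)
            fused[i][j] = attn * rgb_feat[i][j] + depth_feat[i][j]
    return fused


rgb = [[1.0, 2.0], [3.0, 4.0]]
depth = [[0.0, 0.0], [0.0, 0.0]]
print(cmaef_fuse(rgb, depth))  # gate is 0.5 everywhere when depth is 0
```

The gating form makes the failure mode the abstract mentions visible: when the depth map is uninformative (all zeros here), the gate collapses to a constant 0.5 and the fusion degrades toward a plain weighted sum, which is why a learned attention module rather than fixed mixing is needed for low-quality depth inputs.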
Authors: MENG Ling-bing; YUAN Meng-ya; SHI Xue-han; ZHANG Le; WU Jin-hua; CHENG Fei (School of Computer and Software Engineering, Anhui Institute of Information Technology, Wuhu, Anhui 241000, China; School of Electrical and Electronic Engineering, Anhui Institute of Information Technology, Wuhu, Anhui 241000, China; School of Management, Hangzhou Dianzi University, Hangzhou, Zhejiang 310000, China)
Source: Acta Electronica Sinica (indexed in EI, CAS, CSCD, Peking University Core), 2023, No. 11, pp. 3155-3166 (12 pages)
Funding: Natural Science Foundation of Anhui Province (No. 2008085MF201); Key Natural Science Research Projects of Anhui Provincial Department of Education (No. 2022AH051894, No. 2022AH051887); Program for Outstanding Young Talents in Universities of Anhui Province (No. gxyq2022147).
Keywords: salient object detection; cross-modal fusion; boundary features; deformable convolution; saliency map