Abstract
In complex indoor scenes, existing RGB semantic segmentation networks are sensitive to factors such as color and lighting, and existing RGB-D semantic segmentation networks struggle to fuse the features of the two modalities effectively. To address these problems, this paper proposes AMBFNet (attention mechanism bimodal fusion network), an attention-based RGB-D bimodal feature fusion network for semantic segmentation that adopts an encoder-decoder structure. First, a bimodal feature fusion structure (AMBF) is built to reasonably allocate the position and channel information of the features at each stage of the encoding branches. Next, a dual-attention-aware context (DA-context) module is designed to merge context information. Finally, the decoder fuses the multi-scale feature maps across layers to reduce inter-class misrecognition and the loss of small-scale targets in the prediction results. Test results on two public datasets, SUN RGB-D and NYU Depth v2 (NYUDV2), show that, under the same hardware conditions, the proposed network achieves better segmentation performance than current advanced RGB-D semantic segmentation networks such as the residual encoder-decoder network (RedNet), the attention complementary network (ACNet), and the efficient scene analysis network (ESANet), with a mean intersection over union (MIoU) of 47.9% and 50.0% on the two datasets, respectively.
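The MIoU figures quoted in the abstract use the standard mean intersection over union metric. As a rough illustration only, the sketch below computes it from a confusion matrix over flattened pixel labels. The function name `mean_iou` and the toy labels are hypothetical, and the paper's actual evaluation protocol (for example, how unlabeled pixels or absent classes are handled) is not stated in the abstract, so the handling here is an assumption.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection over union for flat sequences of class labels."""
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                                   # missed pixels of class c
        fp = sum(conf[r][c] for r in range(num_classes)) - tp    # pixels wrongly labeled c
        union = tp + fp + fn
        if union > 0:  # assumption: skip classes absent from both truth and prediction
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: 6 pixels, 3 classes (hypothetical labels, not data from the paper).
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(truth, pred, 3)
print(f"MIoU = {score:.3f}")
```

Per-class IoU is TP / (TP + FP + FN), and MIoU averages it over classes, so a class that is rarely predicted correctly lowers the score as much as a frequent one.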
Authors
罗盆琳
方艳红
李鑫
李雪
LUO Penlin; FANG Yanhong; LI Xin; LI Xue (School of Information Engineering, Southwest University of Science and Technology, Mianyang, Sichuan 621010, China; Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, Southwest University of Science and Technology, Mianyang, Sichuan 621010, China)
Source
《计算机工程与应用》
CSCD
PKU Core Journals (北大核心)
2023, No. 7, pp. 222-231 (10 pages)
Computer Engineering and Applications
Funding
Open Fund of the State Key Laboratory (SKLA20200203).