Abstract
To address inaccurate semantic segmentation results and coarse saliency maps for indoor scene images, this paper proposes FG-Net (feature regulator and dual-path guidance), a network architecture based on optimized multi-modal feature extraction and dual-path guided decoding. Specifically, the feature regulator sequentially applies noise filtering, re-weighted representation, differential complementation, and interactive fusion to the multi-modal features at each stage, optimizing the multi-modal feature representation during feature extraction by strengthening the aggregation of RGB and depth features. The rich cross-modal cues obtained after interactive feature fusion are then introduced into the decoding stage to further exploit the advantages of multi-modal features. Combined with a dual-path cooperative guidance structure, the decoder fuses multi-scale and multi-level feature information and outputs a more detailed saliency map. Experiments on the public datasets NYUD-v2 and SUN RGB-D show that the method reaches 48.5% on the main evaluation metric, mIoU, outperforming other state-of-the-art algorithms. The results indicate that the algorithm achieves finer semantic segmentation of indoor scene images and exhibits good generalization and robustness.
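The abstract reports results in mIoU (mean intersection over union). As a reminder of what that metric measures, here is a minimal pure-Python sketch. The function name `mean_iou`, its parameters, and the choice to skip classes absent from both prediction and ground truth are illustrative assumptions, not the paper's evaluation code. Benchmark evaluations typically accumulate the counts over the whole test set before averaging, which this sketch does only for the label sequences it is given.

```python
def mean_iou(pred, target, num_classes, ignore_label=None):
    """Mean IoU over flattened label sequences (hypothetical helper).

    pred, target: equal-length sequences of integer class labels in
    [0, num_classes). Pixels whose target equals ignore_label are skipped.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        if p == t:
            # Pixel counts once toward both intersection and union of class t.
            intersection[t] += 1
            union[t] += 1
        else:
            # Pixel is a false positive for class p and a false negative for class t.
            union[p] += 1
            union[t] += 1
    # Average only over classes that occur in the prediction or the ground truth.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoU values are 1/3, 2/3, and 1/2, so `score` is 0.5.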
Authors
Zhang Shuai (张帅)
Lei Jingsheng (雷景生)
Jin Wuyin (靳伍银)
Yu Yunxiang (俞云祥)
Yang Shengying (杨胜英)
Affiliations: School of Information & Electronic Engineering, Zhejiang University of Science & Technology, Hangzhou 310023, China; College of Mechanical & Electronical Engineering, Lanzhou University of Technology, Lanzhou 730050, China; Zhejiang Dingli Industry Co., Lishui, Zhejiang 321400, China
Source
Application Research of Computers (《计算机应用研究》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2024, No. 5, pp. 1594-1600 (7 pages)
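Funding
National Natural Science Foundation of China (12062009)
Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C349)
Research and Application of an Intelligent Management System for Power Grid Information Projects Based on Tagging Technology (066700KK52180021)
Zhejiang Provincial Basic Public Welfare Research Program (LGF19F020003)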
Keywords
indoor semantic segmentation
feature regulator
dual-path cooperative guidance
RGB-D features