Abstract
Indoor spatial layout estimation is an active research topic in computer vision and plays an important role in tasks such as object detection, augmented reality, and robot navigation. To perceive the layout relationships of indoor scenes more effectively, this paper proposes an indoor spatial layout estimation method based on multi-task supervised learning that extracts the spatial segmentation map of an indoor scene end to end. Based on the segmentation characteristics of indoor images, an encoder-decoder network is designed and multi-task supervised learning is introduced to infer both the indoor spatial layout and the semantic edges of each region. A joint loss function is defined to continuously optimize the segmentation results during training. To better express the layout relationships between regions, the edge predictions for each region are used to locally refine the network output and infer the final spatial layout of the scene. In experiments on the public LSUN and Hedau datasets, the proposed method achieves pixel errors of 7.54% and 7.08%, respectively, and generally outperforms the comparison methods.
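The pixel error reported above can be illustrated with a minimal sketch. It assumes that pixel error means the fraction of pixels whose predicted layout label differs from the ground-truth label, and that the two label maps already use the same label numbering. The abstract does not describe the evaluation protocol, so the function name `pixel_error` and the toy label maps are hypothetical, not the authors' code.

```python
from typing import Sequence


def pixel_error(pred: Sequence[Sequence[int]],
                truth: Sequence[Sequence[int]]) -> float:
    """Return the fraction of pixels whose predicted label differs from the truth.

    Assumption: both maps use the same label numbering, so no label
    matching step is needed before comparing them.
    """
    if len(pred) != len(truth):
        raise ValueError("label maps must have the same height")
    total = 0
    wrong = 0
    for pred_row, truth_row in zip(pred, truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("label maps must have the same width")
        for p, t in zip(pred_row, truth_row):
            total += 1
            if p != t:
                wrong += 1
    if total == 0:
        raise ValueError("label maps must not be empty")
    return wrong / total


# Toy 2x4 label maps (hypothetical data): one of eight pixels is mislabeled.
pred = [[0, 0, 1, 1],
        [2, 2, 1, 1]]
truth = [[0, 0, 1, 1],
         [2, 1, 1, 1]]

error = pixel_error(pred, truth)
print(f"{error:.2%}")  # prints "12.50%"
```

Under this reading, a 7.54% pixel error would mean that about 7.5 of every 100 pixels receive the wrong layout label, averaged over the test images.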
Authors
ZOU Yibo; LI Tao; CHEN Ming; GE Yan; ZHAO Linlin (College of Information Technology, Shanghai Ocean University, Shanghai 201306, China; Key Laboratory of Fisheries Information, Ministry of Agriculture and Rural Affairs, Shanghai 201306, China)
Source
Journal of Beijing University of Aeronautics and Astronautics
EI
CAS
CSCD
Peking University Core Journal
2024, No. 11, pp. 3327-3337 (11 pages)
Funding
Shanghai Science and Technology Innovation Plan (20dz1203800).
Keywords
layout estimation
indoor scene
multi-task supervised learning
end-to-end
semantic edge