Abstract
Monocular depth estimation infers the spatial position of every pixel from an image taken at a single viewpoint, which is of great significance for scene understanding and 3D reconstruction. To increase the amount of information captured in the predicted depth map while keeping key details from being lost, this paper designs a fully convolutional network with a symmetric encoder-decoder structure, called ResUNet, for the depth estimation task. The network inherits the classical U-Net architecture: an improved ResNet performs feature encoding, and the U-Net decoder is retained to decode the feature maps into a depth map. This design combines the characteristics of ResNet and U-Net and, through collaborative optimization, exploits the strengths of both, preserving spatial structure and fine detail as much as possible during depth estimation and thereby improving the fidelity and reliability of the predicted depth map. Building on this network, the ResDepth algorithm is further proposed. To address the object distortion and loss of fine details that commonly arise in depth prediction, it introduces a joint loss function that comprehensively improves the quality of the predicted depth map without additional computational overhead. Finally, comparative experiments on three public datasets, NYU-Depth V2, SUN RGB-D, and KITTI, evaluate the performance of the algorithm. The results show that the proposed ResDepth algorithm and joint loss function better preserve spatial structure and geometric detail, improving the accuracy of the depth estimation results.
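The abstract does not spell out the terms of the joint loss function. As an illustrative sketch only (the function names, weighting, and choice of terms here are assumptions, not the paper's definition), a common way to penalize object distortion and detail loss without extra network computation is to combine a pixel-wise depth error with an image-gradient term that encourages sharp depth discontinuities at object boundaries:

```python
import numpy as np

def gradient_loss(pred, gt):
    # Compare horizontal and vertical depth gradients of the prediction
    # and the ground truth; mismatched gradients blur object boundaries.
    dx_p, dy_p = np.diff(pred, axis=1), np.diff(pred, axis=0)
    dx_g, dy_g = np.diff(gt, axis=1), np.diff(gt, axis=0)
    return np.abs(dx_p - dx_g).mean() + np.abs(dy_p - dy_g).mean()

def joint_loss(pred, gt, lam=0.5):
    # Pixel-wise L1 depth term plus a weighted gradient term. Both are
    # computed from the same forward pass, so the combination adds no
    # extra network computation at training time.
    depth_term = np.abs(pred - gt).mean()
    return depth_term + lam * gradient_loss(pred, gt)
```

A perfect prediction gives zero loss, and the `lam` weight trades off global depth accuracy against edge sharpness; the paper's actual formulation may differ.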
Authors
JIANG Zhongze, CHEN Zhong, XU Xueru, WU Liang
School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074; School of Foreign Languages, Huazhong University of Science and Technology, Wuhan 430074
Source
Computer & Digital Engineering (《计算机与数字工程》)
2024, No. 5, pp. 1488-1494 (7 pages)
Funding
Civil Aerospace "13th Five-Year Plan" Pre-research Project (No. D040401-w05)
Key Technologies of Domestic Satellite Emergency Observation and Information Support (No. B0302)
Keywords
monocular depth estimation
fully convolutional network
dilated convolution
joint loss function