Abstract
Monocular depth estimation is a fundamental problem in computer vision and has attracted wide attention. Most current methods rely on the image-level information of deep convolutional neural networks; they converge slowly during training and lose accuracy, especially when an image contains multiple objects of different sizes. To address this, we propose DCDN (Deep Convolution DenseASPP Network), a new convolutional neural network architecture built on an encoder-decoder framework, and apply it to depth estimation. Objects at different scales require different convolution kernels to capture their features, so multi-object images should be processed with kernels of several sizes. We therefore use groups of densely connected dilated convolutions with different dilation rates to strengthen feature learning for multi-scale objects. Experimental results show that the method achieves an accuracy of 0.823 (threshold < 1.25) on the NYU-Depth-v2 dataset, outperforming state-of-the-art methods.
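The abstract does not define its accuracy figure. The sketch below assumes that "threshold < 1.25" refers to the threshold accuracy commonly reported on depth benchmarks: the fraction of valid pixels for which max(pred/gt, gt/pred) is below 1.25. The function name, the flat-list input, and the rule of skipping non-positive depths are illustrative assumptions, not details taken from the paper.

```python
def threshold_accuracy(pred, gt, threshold=1.25):
    """Fraction of valid pixels where max(pred/gt, gt/pred) < threshold.

    pred and gt are flat sequences of depth values in the same unit.
    Pixels with a non-positive depth in either sequence are skipped
    (an assumption; benchmarks differ in how they mask invalid pixels).
    """
    hits = 0
    valid = 0
    for p, g in zip(pred, gt):
        if p <= 0 or g <= 0:
            continue  # skip invalid or missing depth
        valid += 1
        if max(p / g, g / p) < threshold:
            hits += 1
    if valid == 0:
        raise ValueError("no valid pixels to evaluate")
    return hits / valid


if __name__ == "__main__":
    predicted = [1.0, 2.0, 3.0, 4.0]
    ground_truth = [1.0, 2.0, 3.0, 8.0]
    print(threshold_accuracy(predicted, ground_truth))
```

Under this reading, a score of 0.823 would mean that about 82.3% of evaluated pixels have a predicted depth within a factor of 1.25 of the ground truth.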
Authors
张顺然
吴克伟
洪炎
ZHANG Shunran; WU Kewei; HONG Yan (School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China)
Source
《智能计算机与应用》
2020, No. 6, pp. 42-47, 50 (7 pages in total)
Intelligent Computer and Applications
Keywords
Depth Prediction
Convolutional Neural Network
Dilated Convolution
Multi-scale