Abstract
A light field camera records both the spatial and angular information of a scene in a single shot; the resulting images contain more information than conventional two-dimensional images and are therefore better suited to depth estimation. To obtain high-quality scene depth from light field images, a feature fusion network with an efficient multi-channel fusion structure was proposed, based on the multi-view representation of the light field. On the basis of manually selected specific views, convolution kernels of different sizes were used to cope with different baselines. Meanwhile, a feature fusion module was built for the multi-stream input of light field data, and a dual-channel network structure was used to integrate information from earlier and later layers, improving the network's learning efficiency and reducing information loss. Experimental results on the new HCI dataset show that the network converges quickly on the training set, achieves accurate depth estimation in non-Lambertian scenes, and outperforms the other state-of-the-art methods compared in terms of average MSE.
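The multi-branch idea described in the abstract (a convolution kernel size chosen per view group to match its baseline, followed by channel-wise fusion of the branch outputs) can be sketched as follows. This is a minimal illustrative sketch only: the function names, kernel sizes, placeholder weights, and tensor shapes are assumptions, not the paper's actual network.

```python
import numpy as np

def conv2d_same(img, kernel):
    """Naive same-padded 2-D cross-correlation (illustrative, odd kernels only)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.empty(img.shape, dtype=float)
    H, W = img.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def multi_branch_fuse(view_stacks, kernel_sizes=(3, 5, 7)):
    """Hypothetical multi-branch stage: each selected view stack gets a branch
    whose kernel size matches its baseline; branch outputs are then
    concatenated along the channel axis as a simple fusion step."""
    branches = []
    for stack, k in zip(view_stacks, kernel_sizes):
        kernel = np.full((k, k), 1.0 / (k * k))  # placeholder weights
        branches.append(np.stack([conv2d_same(v, kernel) for v in stack]))
    return np.concatenate(branches, axis=0)  # channel-wise fusion

# Three groups of selected views, two 16x16 views per group.
views = [np.random.rand(2, 16, 16) for _ in range(3)]
fused = multi_branch_fuse(views)
print(fused.shape)  # -> (6, 16, 16)
```

In a trained network the averaging kernels would of course be learned weights, and the concatenated features would feed the fusion module; this sketch only shows the data flow of per-baseline branches merging into one feature tensor.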
Authors
HE Ye; ZHANG Xu-dong; WU Di (School of Computer and Information, Hefei University of Technology, Hefei, Anhui 230009, China)
Source
Journal of Graphics (《图学学报》)
CSCD
Peking University Core Journal (北大核心)
2020, No. 6, pp. 922-929 (8 pages)
Funding
General Program of the National Natural Science Foundation of China (61876057, 61971177).
Keywords
light field
depth estimation
convolutional neural network
feature fusion
attention
multi-view