Abstract
A depth estimation method for monocular infrared video is proposed, based on a bi-recursive convolutional neural network (BrCNN). It is motivated by the observation that depth features in infrared video are distinctive within each frame yet continuous across the whole video. BrCNN adds the sequence information transfer mechanism of a recurrent neural network (RNN) to the single-frame feature extraction of a convolutional neural network (CNN). It therefore has the feature extraction ability of a CNN, automatically extracting the local features of each frame, and the sequence modeling ability of an RNN, automatically extracting the sequence information contained in each frame and recursively passing it on. Because this sequence information is passed in both directions when estimating depth, the features extracted for each frame contain context from both the preceding and the following frames. Experimental results show that BrCNN extracts more expressive features and estimates depth from infrared video more accurately than a traditional CNN, which estimates depth from the features of a single frame.
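The bidirectional passing of sequence information described in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's network: the per-frame feature vectors stand in for CNN outputs, the recurrence is a plain tanh RNN, the weights are random, and every name (`recurrent_pass`, `bidirectional_features`, `random_matrix`) is hypothetical. No depth regression layer is included.

```python
import math
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def recurrent_pass(frame_feats, w_in, w_rec):
    """Run a plain tanh recurrence over the frames in the order given."""
    hidden = [0.0] * len(w_rec)
    states = []
    for feat in frame_feats:
        pre = [a + b for a, b in zip(matvec(w_in, feat), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        states.append(hidden)
    return states


def bidirectional_features(frame_feats, fwd, bwd):
    """Concatenate, per frame, the forward-in-time and backward-in-time states."""
    forward = recurrent_pass(frame_feats, *fwd)
    # Run over the reversed video, then flip back so index t matches frame t.
    backward = recurrent_pass(frame_feats[::-1], *bwd)[::-1]
    return [f + b for f, b in zip(forward, backward)]


def random_matrix(rows, cols, rng):
    """Assumed initialisation: small uniform random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
feat_dim, hidden_dim, num_frames = 4, 3, 5

# Stand-in for the per-frame features a CNN would produce (assumption).
frames = [[rng.random() for _ in range(feat_dim)] for _ in range(num_frames)]

fwd = (random_matrix(hidden_dim, feat_dim, rng),
       random_matrix(hidden_dim, hidden_dim, rng))
bwd = (random_matrix(hidden_dim, feat_dim, rng),
       random_matrix(hidden_dim, hidden_dim, rng))

context = bidirectional_features(frames, fwd, bwd)
print(len(context), len(context[0]))
```

Each entry of `context` has `2 * hidden_dim` values: the first half depends only on the current and earlier frames, the second half only on the current and later frames. A per-frame depth estimator built on such features would therefore see context from both directions, which is the property the abstract attributes to BrCNN.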
Source
Acta Optica Sinica (《光学学报》)
EI
CAS
CSCD
PKU Core (北大核心)
2017, Issue 12, pp. 246-254 (9 pages)
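Funding
National Natural Science Foundation of China (61375007)
Basic Research Project of the Shanghai Science and Technology Commission (15JC1400600)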
Keywords
machine vision
bi-recursive convolution
depth estimation
monocular infrared video
deep neural network