
Depth Estimation from Monocular Infrared Video Based on Bi-Recursive Convolutional Neural Network

Cited by: 11
Abstract: For depth estimation from monocular infrared video, a method based on a bi-recursive convolutional neural network (BrCNN) is proposed, motivated by the observation that depth features in infrared video are both specific to each single frame and continuous across the whole video. BrCNN adds the sequence-information transfer mechanism of a recurrent neural network (RNN) to the single-frame feature extraction of a convolutional neural network (CNN). BrCNN therefore has both the image feature extraction ability of a CNN, automatically extracting the local features of each frame in the infrared video, and the sequence feature extraction ability of an RNN, automatically extracting the sequence information contained in each frame and passing it on recursively. Because a bi-recursive sequence-information transfer mechanism is used to estimate the depth of the monocular infrared video, the features extracted from each frame contain context information from both the preceding and the following frames. The experimental results show that BrCNN extracts more expressive features and estimates depth from infrared video more precisely than a traditional CNN, which estimates depth from the features of a single frame.
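The bi-recursive transfer of sequence information described in the abstract can be illustrated with a minimal sketch. It is not the authors' network: the abstract gives no layer sizes, weights, or update equations, so the element-wise update `h_t = tanh(x_t + weight * h_prev)`, the scalar `weight`, the function names, and the toy feature values below are all assumptions chosen only to show how per-frame features can pick up context from both earlier and later frames.

```python
import math


def recurrent_pass(frame_features, weight, reverse=False):
    """Carry a hidden state across frames in one direction.

    Assumed toy update (not from the paper): h_t = tanh(x_t + weight * h_prev),
    applied element-wise to each per-frame feature vector.
    """
    n_frames = len(frame_features)
    order = range(n_frames - 1, -1, -1) if reverse else range(n_frames)
    hidden = [0.0] * len(frame_features[0])
    states = [None] * n_frames
    for t in order:
        hidden = [math.tanh(x + weight * h)
                  for x, h in zip(frame_features[t], hidden)]
        states[t] = hidden
    return states


def bidirectional_features(frame_features, weight=0.5):
    """Concatenate forward-in-time and backward-in-time states per frame."""
    forward = recurrent_pass(frame_features, weight)
    backward = recurrent_pass(frame_features, weight, reverse=True)
    return [f + b for f, b in zip(forward, backward)]


# Hypothetical per-frame feature vectors standing in for CNN outputs.
frames = [[0.1, 0.2], [0.0, 0.4], [0.3, -0.1]]
fused = bidirectional_features(frames)
```

Each entry of `fused` has twice the length of the input feature vector: the first half summarizes the current and earlier frames, the second half the current and later frames. In a real model a depth regressor would consume these fused features, and the recurrence would use learned convolutional weights rather than a fixed scalar.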
Source: Acta Optica Sinica (《光学学报》), 2017, No. 12, pp. 246-254 (9 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.
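Funding: National Natural Science Foundation of China (61375007); Basic Research Project of the Science and Technology Commission of Shanghai Municipality (15JC1400600)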
Keywords: machine vision; bi-recursive convolution; depth estimation; monocular infrared video; deep neural network
