Research and Implementation of Dynamic Scene 3D Perception Technology Based on Binocular Estimation (基于双目估计的动态场景三维感知技术研究与实现)
Abstract Binocular stereo vision has long been an important topic in computer vision research. Compared with monocular or multi-view approaches, it recovers image depth accurately while remaining low-cost, broadly applicable, and easy to use. Three-dimensional perception based on binocular vision can greatly improve a computer's ability to understand and interact with the real world and to adapt to complex, changing scenes, and it plays an important role in autonomous driving, robot navigation, industrial inspection, aerospace, and other fields. This paper focuses on 3D reconstruction and object perception in dynamic scenes. In most cases the dynamic objects in the field of view are the ones that deserve attention, whereas static objects, especially the background and stationary objects that usually occupy most of the scene, can often be ignored, yet they consume a large share of the computation. Spending computing resources on objects of no interest is clearly wasteful and inefficient. To address this problem, and building on a study of current mainstream stereo matching and image segmentation methods, this paper proposes a dynamic-scene 3D perception technique based on binocular estimation. The main contributions are as follows. First, to address the low efficiency of per-pixel matching-cost computation and aggregation in traditional binocular stereo matching, a stereo matching method based on 2D instance segmentation is proposed: stereo matching is performed on the mask-segmented target images, which improves matching performance and reduces the difficulty of matching dynamic targets. To address insufficient segmentation accuracy, a mask-edge filtering optimization based on the RGB image is introduced, which improves both efficiency and the reconstruction accuracy of the field-of-view point cloud. Second, target point clouds are generated in real time with a deep-learning network for binocular estimation, and a GPU-accelerated real-time dynamic-target perception algorithm based on the point clouds of adjacent frames is proposed. Finally, an integrated 2D-3D real-time dynamic-target perception technique is proposed that reconstructs the target scene in 3D in real time while quickly identifying and detecting dynamic objects in the environment.
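The abstract does not give algorithmic details, so the following is only a minimal sketch of the two ideas it names: restricting 3D reconstruction to mask-segmented pixels, and comparing point clouds of adjacent frames to flag dynamic points. It assumes a rectified stereo pair, where depth follows the standard relation Z = f·B/d. The function names, parameters, and the brute-force CPU nearest-neighbor test are all hypothetical illustrations and stand in for the paper's deep-learning stereo network and GPU-accelerated comparison.

```python
from math import dist


def masked_disparity_to_points(disparity, mask, focal_px, baseline_m, cx, cy):
    """Back-project only masked pixels with valid disparity to 3D camera coordinates.

    Hypothetical helper: assumes a rectified stereo pair and a pinhole model.
    """
    points = []
    for v, (disp_row, mask_row) in enumerate(zip(disparity, mask)):
        for u, (d, keep) in enumerate(zip(disp_row, mask_row)):
            if not keep or d <= 0:
                continue  # skip background pixels and invalid disparities
            z = focal_px * baseline_m / d  # depth from Z = f * B / d
            x = (u - cx) * z / focal_px
            y = (v - cy) * z / focal_px
            points.append((x, y, z))
    return points


def flag_dynamic_points(prev_points, curr_points, threshold_m):
    """Flag a current-frame point as dynamic if no previous-frame point
    lies within threshold_m of it (brute force; an assumption, not the
    paper's GPU algorithm)."""
    return [
        all(dist(p, q) > threshold_m for q in prev_points)
        for p in curr_points
    ]
```

Skipping unmasked pixels before back-projection is what keeps the background out of the point cloud in this sketch. A practical version would replace the quadratic nearest-neighbor loop with a spatial index or a parallel kernel.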
Authors HE Weilong (何维龙); SU Lingli (苏玲莉); GUO Bingxuan (郭丙轩); LI Maosen (李茂森); HAO Yan (郝岩) (Jiuquan Vocational and Technical College, Jiuquan, Gansu 735000, China; State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430072, China; Nuclear Industry Aerial Surveying and Remote Sensing Center, Baoding, Hebei 071799, China)
Source Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2024, Issue S02, pp. 506-513 (8 pages)
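Funding National Key Research and Development Program of China (2019YFE0108300); National Natural Science Foundation of China (62001058); 2023 Gansu Provincial Higher Education Innovation Fund Project (2023B-449); 2023 Jiuquan Science and Technology Support Project (2060499); College-level Research Project (2022XJYXM06).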
Keywords Binocular vision; Stereo matching; Image segmentation; 3D reconstruction; Deep learning; GPU parallel computing