Research and Implementation of Dynamic Scene 3D Perception Technology Based on Binocular Estimation (基于双目估计的动态场景三维感知技术研究与实现)

Abstract: Binocular stereo vision has long been an important topic in computer vision research. Compared with monocular or multi-view approaches, it recovers image depth accurately while remaining low-cost, broadly applicable, and simple to use. Three-dimensional perception based on binocular vision can greatly improve a computer's ability to understand and interact with the real world and strengthens the adaptability of computer vision in complex, changing scenes; it plays an important role in autonomous driving, robot navigation, industrial inspection, aerospace, and other fields. This paper focuses on 3D reconstruction and object perception in dynamic scenes. In most cases the dynamic objects in the field of view are the ones that deserve attention, whereas static content, especially the background and static objects that occupy most of the scene most of the time, can usually be ignored, yet it consumes a large share of the computation. Spending computing resources on targets of no interest is wasteful and inefficient. To address this problem, and building on a study of current mainstream binocular stereo matching and image segmentation methods, this paper proposes a dynamic scene 3D perception technique based on binocular estimation. The main contributions are as follows. First, to address the low efficiency of per-pixel cost computation and aggregation in traditional binocular stereo matching, a stereo matching method based on 2D scene instance segmentation is proposed: stereo matching is performed on the target image regions obtained by mask segmentation, which improves matching performance and reduces the difficulty of matching dynamic targets. To compensate for insufficient segmentation accuracy, a mask edge filtering optimization based on the RGB image is introduced, improving both efficiency and the reconstruction accuracy of the field-of-view point cloud. Second, target point clouds are generated in real time with a deep learning network for binocular depth estimation, and a GPU-accelerated real-time dynamic target perception algorithm based on point clouds from neighboring frames is proposed. Finally, an integrated 2D and 3D real-time dynamic target perception technique is proposed that reconstructs the target scene in 3D in real time while quickly detecting and recognizing dynamic objects in the environment.
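The pipeline outlined in the abstract (back-projecting only the masked target pixels, then comparing point clouds from neighboring frames) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `masked_point_cloud` and `dynamic_points`, the pinhole model with a rectified stereo pair and equal focal lengths, the distance threshold, and the brute-force nearest-neighbor search (standing in for the GPU-accelerated search the paper describes) are all assumptions made for illustration. The mask is assumed to come from some instance segmentation step that is not shown.

```python
from math import dist


def masked_point_cloud(disparity, mask, fx, baseline, cx, cy):
    """Back-project only the masked pixels of a disparity map to 3D points.

    Assumes a rectified stereo pair and a pinhole camera with fx == fy:
        Z = fx * baseline / d
        X = (u - cx) * Z / fx
        Y = (v - cy) * Z / fx
    Pixels outside the mask, or with non-positive disparity, are skipped,
    so no work is spent on the background.
    """
    points = []
    for v, (d_row, m_row) in enumerate(zip(disparity, mask)):
        for u, (d, m) in enumerate(zip(d_row, m_row)):
            if m and d > 0:
                z = fx * baseline / d
                points.append(((u - cx) * z / fx, (v - cy) * z / fx, z))
    return points


def dynamic_points(prev_cloud, curr_cloud, threshold):
    """Return the points of curr_cloud that have no neighbor in prev_cloud
    within `threshold` (brute force; hypothetical stand-in for a GPU search)."""
    if not prev_cloud:
        return list(curr_cloud)
    return [
        p for p in curr_cloud
        if min(dist(p, q) for q in prev_cloud) > threshold
    ]


# Toy 2x3 frames: one masked pixel changes disparity between frames.
mask = [[0, 1, 1],
        [0, 1, 0]]
disp_prev = [[10.0, 10.0, 10.0],
             [10.0, 10.0, 10.0]]
disp_curr = [[10.0, 10.0, 5.0],
             [10.0, 10.0, 10.0]]

prev_cloud = masked_point_cloud(disp_prev, mask, fx=100.0, baseline=0.1, cx=1.0, cy=0.0)
curr_cloud = masked_point_cloud(disp_curr, mask, fx=100.0, baseline=0.1, cx=1.0, cy=0.0)
moving = dynamic_points(prev_cloud, curr_cloud, threshold=0.05)
print(moving)
```

In this toy case only three of the six pixels are back-projected, and the single pixel whose disparity changed between frames is the only point reported as dynamic. A practical system would replace the nested loops and the quadratic search with array operations or a spatial index running on the GPU.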

Authors: HE Weilong (何维龙); SU Lingli (苏玲莉); GUO Bingxuan (郭丙轩); LI Maosen (李茂森); HAO Yan (郝岩) (Jiuquan Vocational and Technical College, Jiuquan, Gansu 735000, China; State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430072, China; Nuclear Industry Aerial Surveying and Remote Sensing Center, Baoding, Hebei 071799, China)

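Affiliations: Jiuquan Vocational and Technical College; State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University; Nuclear Industry Aerial Surveying and Remote Sensing Center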

Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2024, Issue S02, pp. 506-513 (8 pages)

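Funding: National Key Research and Development Program of China (2019YFE0108300); National Natural Science Foundation of China (62001058); 2023 Gansu Provincial Higher Education Institutions Innovation Fund Project (2023B-449); 2023 Jiuquan Science and Technology Support Project (2060499); College-level Research Project (2022XJYXM06).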

Keywords: binocular vision; stereo matching; image segmentation; 3D reconstruction; deep learning; GPU parallel computing

Classification number: P231 [Astronomy and Earth Sciences: Photogrammetry and Remote Sensing]
