Survey on the fusion of point clouds and images for environmental object detection

Abstract: In applications of digital simulation technology, and particularly in the development of autonomous driving, object detection is a crucial step: it concerns the perception of objects in the surrounding environment and provides key information for the decision-making and planning of intelligent equipment. In recent years, with advances in sensor technology, images and point clouds have become the two main sources of perception data, each offering distinct advantages for research on deep-learning-based object detection. To study existing point cloud- and image-based object detection methods more comprehensively, this paper systematically reviews and summarizes three classes of object detection algorithms: those based on images, those based on point clouds, and those that combine the two. It aims to explore how the two data sources can be fused to improve the accuracy, stability, and robustness of object detection, and it offers an outlook on future directions for environmental object detection that fuses point clouds and images.

Extended abstract: In the field of digital simulation technology applications, especially in the development of autonomous driving, object detection is a crucial component. It involves the perception of objects in the surrounding environment, which provides essential information for the decision-making and planning of intelligent systems. Traditional object detection methods typically involve steps such as feature extraction, object classification, and position regression on images. However, these methods are limited by manually designed features and by the performance of classifiers, which restricts their effectiveness in complex scenes and for objects with significant variations. The advent of deep learning has led to the widespread adoption of object detection methods based on deep neural networks. Notably, the convolutional neural network (CNN) has emerged as one of the most prominent approaches in this field. By stacking multiple layers of convolution and pooling operations, CNNs can automatically extract meaningful feature representations from image data. In addition to image data, light detection and ranging (LiDAR) data play a crucial role in object detection tasks, particularly in 3D object detection. LiDAR data represent objects through a set of unordered, discrete points on their surfaces. Accurately detecting the point cloud clusters that represent objects, and estimating their pose from these unordered points, is a challenging task. LiDAR data, with their unique characteristics, offer high-precision obstacle detection and distance measurement, which contributes to the perception of surrounding roadways, vehicles, and pedestrians. In real-world autonomous driving and related environmental perception scenarios, relying on a single modality often presents numerous challenges. For instance, while image data provide a wide variety of high-resolution visual information such as color, texture, and shape, they are susceptible to lighting conditions. In addition, models may struggle to handle occlusions, in which objects obstruct the view, because of the inherent limitations of camera perspectives. LiDAR, by contrast, performs exceptionally well in challenging lighting conditions and excels at accurately locating objects in space across diverse and harsh weather scenarios. However, LiDAR has limitations of its own. Specifically, the low resolution of LiDAR input data results in sparse point clouds when detecting distant targets. Extracting semantic information from LiDAR data is also more challenging than extracting it from image data. Thus, an increasing number of researchers are emphasizing multimodal environmental object detection. A robust multimodal perception algorithm can offer richer feature information, enhanced adaptability to diverse environments, and improved detection accuracy. Such capabilities enable the perception system to deliver reliable results across various environmental conditions. Multimodal object detection algorithms nonetheless face limitations and pressing challenges. One challenge is the difficulty of data annotation. Annotating point cloud and image data is relatively complex and time-consuming, particularly for large-scale datasets. Moreover, accurately labeling point cloud data is difficult because of their sparsity and the presence of noisy points. Addressing these issues is crucial for further advances in multimodal object detection. In addition, point clouds and images, as two distinct perception modalities, differ significantly in data structure and feature representation. Current research focuses on effectively integrating the information from the two modalities and extracting accurate, comprehensive features that can be used effectively. Furthermore, processing large-scale point cloud data is equally challenging. Point cloud data typically comprise a substantial number of 3D coordinates, which places greater demands on computing resources and algorithmic efficiency than pure image data do. This study aims to summarize and refine existing approaches to help researchers gain a deeper and more efficient understanding of object detection algorithms that integrate images and point clouds. It classifies object detection algorithms into three categories: those based on images, those based on point clouds, and those based on the multimodal fusion of both. Furthermore, we analyze the strengths and weaknesses of the various methods and discuss potential solutions. We also provide a comprehensive review of the development of object detection algorithms that fuse point clouds and images, considering aspects such as data collection, representation, and model design. Finally, we offer a perspective on future directions for environmental object detection, with the goal of enhancing the overall capabilities of autonomous systems.

Authors: Jia Mingda; Yang Jinming; Meng Weiliang; Guo Jianwei; Zhang Jiguang; Zhang Xiaopeng (State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China)

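Affiliations: State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences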

Source: Journal of Image and Graphics (中国图象图形学报); indexed in CSCD and the Peking University Core Journals list; 2024, No. 6, pp. 1765-1784 (20 pages)

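Funding: Beijing Natural Science Foundation and Fengtai Rail Transit Frontier Research Joint Fund (L231013); National Natural Science Foundation of China (U21A20515, 62376271, 62172416, 52175493, U22B2034, 62365014); Open Project of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (VRLAB2023B01).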

Keywords: point cloud; autonomous driving; multimodal; object detection; fusion

Classification number: TP391.7 [Automation and Computer Technology: Computer Application Technology]
