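Research Progress in Three-Dimensional Object Detection Technology Based on Point Cloud Data
(English title as published: Three-Dimensional Object Detection Technology Based on Point Cloud Data)

Cited by: 5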
Abstract: In recent years, with the spread of depth sensors and 3D laser scanning equipment, point cloud data have attracted wide attention. Compared with two-dimensional images, point cloud data not only contain the depth information of a scene but are also unaffected by environmental factors such as illumination, which enables more accurate object recognition and 3D localization. Point cloud-based 3D object detection has therefore become a key technology for intelligent spatial perception and scene understanding. This paper first introduces the characteristics of point cloud data and discusses different types of point cloud feature extraction methods. It then describes in detail the principles and development of point cloud object detection methods based on voxels, points, graphs, and hybrid voxel-point representations. Next, it introduces common indoor and outdoor point cloud object detection datasets and evaluation metrics, and compares and analyzes the performance of the various methods on the KITTI and Waymo datasets. Finally, it summarizes research progress in point cloud object detection and discusses future prospects.

Significance: In recent years, self-driving technology has garnered considerable attention from both academia and industry. Autonomous perception, which covers both the vehicle's own state and the surrounding environment, is a critical component of self-driving technology and guides the decision-making and planning modules. To perceive the environment accurately, objects must be detected in three-dimensional (3D) scenes. However, traditional 3D object detection techniques are typically based on image data, which lack depth information, and this makes image-based detection difficult to apply to 3D scene tasks. 3D object detection therefore relies predominantly on point cloud data obtained from devices such as lidar and 3D scanners. Point cloud data consist of a collection of points, each containing coordinate information and additional attributes such as color, normal vector, and intensity. Point cloud data are rich in depth information. In contrast to two-dimensional images, however, they are sparse and unordered and have a complex, irregular structure, which makes feature extraction challenging. Traditional methods use local point cloud information such as curvature, normal vector, and density, combined with methods such as the Gaussian model, to design descriptors by hand. These methods rely heavily on a priori knowledge and do not account for the relationships between neighboring points, resulting in low robustness and susceptibility to noise. Deep learning methods have recently gained significant attention from researchers because of their robust feature representation and generalization capabilities. Their effectiveness depends heavily on high-quality datasets. To advance the field of point cloud object detection, numerous companies such as Waymo and Baidu, as well as research institutes, have produced large-scale point cloud datasets. With the help of such datasets, point cloud object detection combined with deep learning has developed rapidly and demonstrated strong performance. Despite this progress, challenges in accuracy and real-time performance remain. This paper therefore reviews the research on point cloud object detection and looks ahead to future developments in order to promote the advancement of the field.

Progress: The development of point cloud object detection has been significantly promoted by the recent emergence of large-scale open-source datasets. Several standard datasets have been released for outdoor scenes, including KITTI, Waymo, and nuScenes, and for indoor scenes, including NYU-Depth, SUN RGB-D, and ScanNet, and these have greatly facilitated research in this field. The relevant properties of these datasets are summarized in Table 1. Point cloud data are characterized by sparsity, non-uniformity, and disorder, which distinguish them from image data. To address these properties, researchers have developed a range of object detection algorithms designed specifically for this type of data. Based on how features are extracted, point cloud-based single-modal methods fall into four groups: voxel-based, point-based, graph-based, and point+voxel-based methods. Voxel-based methods divide the point cloud into regular voxel grids and aggregate the point features within each voxel to generate regular four-dimensional feature maps. VoxelNet, SECOND, and PointPillars are classic architectures of this kind. Point-based methods process the point cloud directly and use symmetric functions to aggregate point features while retaining the geometric information of the point cloud to the greatest extent. PointNet, PointNet++, and PointRCNN are classic architectures of this kind. Graph-based methods convert the point cloud into a graph representation and process it with a graph neural network. Point-GNN and Graph R-CNN are classic architectures of this approach. Point+voxel-based methods combine the point-based and voxel-based approaches, with STD and PV-RCNN as classic architectures. In addition, to enrich the semantic information of point cloud data, researchers have designed multi-modal methods that use image data as supplementary information. MV3D, AVOD, and MMF are classic multi-modal architectures. A chronological summary of classical methods for object detection from point clouds is presented in Fig. 4.

Conclusions and Prospects: 3D object detection from point clouds is a significant research area in computer vision that is attracting increasing attention from scholars. The foundational branch of the field has flourished, and future research may focus on several areas: multi-branch and multi-modal fusion, the integration of two-dimensional detection methods, weakly supervised and self-supervised learning, and the creation and use of complex datasets.
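The voxel-based and point-based aggregation ideas summarized in the abstract can be illustrated with a minimal sketch. It is not taken from the paper. It assumes plain (x, y, z) tuples, a hypothetical `voxelize_mean` helper that uses the per-voxel centroid as a stand-in for a learned voxel feature encoder, and a hypothetical `max_pool` helper that applies the per-coordinate maximum as the symmetric function. Real detectors apply learned per-point networks before pooling.

```python
from collections import defaultdict


def voxelize_mean(points, voxel_size):
    """Group (x, y, z) points into cubic voxels and average the points in each voxel.

    Assumption: the centroid is only a simple stand-in for a learned voxel feature.
    """
    buckets = defaultdict(list)
    for x, y, z in points:
        # Floor division gives a consistent integer index for negative coordinates too.
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        buckets[key].append((x, y, z))
    # One feature per occupied voxel; empty voxels are never stored (sparsity).
    return {
        key: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for key, pts in buckets.items()
    }


def max_pool(points):
    """Symmetric aggregation: per-coordinate maximum, independent of point order."""
    return tuple(max(axis) for axis in zip(*points))


# Hypothetical toy point cloud used only for illustration.
cloud = [(0.1, 0.2, 0.3), (0.4, 0.1, 0.2), (1.5, 0.2, 0.1), (-0.2, 0.3, 0.9)]
voxels = voxelize_mean(cloud, voxel_size=1.0)
global_feature = max_pool(cloud)
```

Because `max_pool` is symmetric, reordering `cloud` leaves `global_feature` unchanged, which is the property that lets point-based methods handle unordered input. `voxelize_mean` stores only occupied voxels, reflecting the sparsity of point cloud data.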
Authors: Li Jianan; Wang Ze; Xu Tingfa (School of Optoelectronics, Beijing Institute of Technology, Beijing 100081, China; Key Laboratory of Photoelectronic Imaging Technology and System, Ministry of Education, Beijing Institute of Technology, Beijing 100081, China; Chongqing Innovation Center, Beijing Institute of Technology, Chongqing 401135, China)
Source: Acta Optica Sinica (indexed in EI, CAS, CSCD, PKU Core), 2023, Issue 15, pp. 286-302 (17 pages)
Funding: National Natural Science Foundation of China, Young Scientists Fund (62101032).
Keywords: point cloud; 3D object detection; single modality; multi-modality