Abstract
Environment perception is one of the key technologies for deploying autonomous vehicles, and it is crucial to their safety and reliability. Three-dimensional (3D) object detection is a core task in environment perception: it identifies and localizes objects in 3D space and provides important information for the vehicle's subsequent decision-making and actions. Point clouds and images are the most commonly used inputs for this task. However, a point cloud consists of irregularly distributed points in 3D space, whereas an image consists of regularly distributed pixels in 2D space, so the two are difficult to map and fuse effectively. In recent years, image pseudo point clouds, which represent image information in point cloud form, have received widespread attention from researchers in this field. Existing 3D object detection methods based on image pseudo point clouds still have two problems: feature extraction for pseudo point clouds is coarse, and the representation ability of the corresponding Region of Interest (RoI) features is poor. To address these two problems, this paper proposes fine-grained attention convolution and multi-scale group sparse convolution, respectively. Fine-grained attention convolution introduces depth-wise separable convolution, commonly used in regular image processing, into the processing of irregular point clouds, and embeds channel attention and group attention mechanisms on top of it to extract fine-grained features and enhance pseudo point cloud features. Multi-scale group sparse convolution groups the pseudo point cloud RoI features after grid pooling and performs differentiated feature learning on each group to obtain RoI features at different scales, which enhances the representation ability of the pseudo point cloud RoI grid features. On this basis, the paper constructs the SFD++ multi-modal 3D object detection network by introducing fine-grained attention convolution into the pseudo point cloud feature extraction process of the SFD (Sparse Fuse Dense) network and multi-scale group sparse convolution into its pseudo point cloud RoI feature learning process. Experiments on the authoritative KITTI autonomous driving dataset show that SFD++ processes 8.33 frames per second and achieves average precision of 95.74%, 88.80%, and 86.04% on the easy, moderate, and hard 3D car detection subsets, respectively, which is 0.15%, 0.84%, and 0.58% higher than the second-best network, SFD. In addition, a series of ablation and supplementary experiments on KITTI verify the effectiveness of the two proposed convolutions and the rationality of the related parameter settings.
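The abstract names the ingredients of fine-grained attention convolution (a depth-wise separable convolution plus channel and group attention) but not their exact arrangement. The following is only a toy sketch of those ingredients on a small 1D feature map, written with plain Python lists. The function names, the ordering of the steps, and the use of a sigmoid gate on mean activations are all assumptions made for illustration; they are not taken from the paper and should not be read as the SFD++ implementation.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def depthwise_conv1d(x, kernels):
    """Depth-wise step: filter each channel with its own kernel (zero padding)."""
    out = []
    for channel, kernel in zip(x, kernels):
        half = len(kernel) // 2
        row = []
        for i in range(len(channel)):
            acc = 0.0
            for j, w in enumerate(kernel):
                idx = i + j - half
                if 0 <= idx < len(channel):
                    acc += w * channel[idx]
            row.append(acc)
        out.append(row)
    return out


def pointwise_conv(x, weights):
    """Point-wise step: mix channels at every position (a 1x1 convolution)."""
    n = len(x[0])
    return [[sum(w * x[c][i] for c, w in enumerate(w_row)) for i in range(n)]
            for w_row in weights]


def channel_attention(x):
    """Assumed gating: scale each channel by a sigmoid of its mean activation."""
    gates = [sigmoid(sum(ch) / len(ch)) for ch in x]
    return [[g * v for v in ch] for g, ch in zip(gates, x)]


def group_attention(x, groups):
    """Assumed gating: scale each group of channels by a sigmoid of the group mean."""
    assert len(x) % groups == 0, "channels must divide evenly into groups"
    size = len(x) // groups
    out = []
    for g in range(groups):
        block = x[g * size:(g + 1) * size]
        mean = sum(sum(ch) for ch in block) / (size * len(block[0]))
        gate = sigmoid(mean)
        out.extend([[gate * v for v in ch] for ch in block])
    return out


def fine_grained_block(x, dw_kernels, pw_weights, groups):
    """Hypothetical ordering: depth-wise -> channel gate -> point-wise -> group gate."""
    y = depthwise_conv1d(x, dw_kernels)
    y = channel_attention(y)
    y = pointwise_conv(y, pw_weights)
    return group_attention(y, groups)


# Toy input: 2 channels, 3 positions.
features = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
dw = [[1.0, 1.0, 1.0], [0.0, 1.0, 0.0]]  # one kernel per input channel
pw = [[1.0, 0.0], [0.5, 0.5]]            # 2 output channels from 2 input channels
result = fine_grained_block(features, dw, pw, groups=2)
```

The point of the split is that the depth-wise step needs one small kernel per channel and the point-wise step needs one weight per input/output channel pair, which is cheaper than a full convolution over both dimensions at once. The gates then reweight channels and channel groups without changing the shape of the feature map.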
Authors
孔德明
李晓伟
杨庆鑫
KONG De-Ming; LI Xiao-Wei; YANG Qing-Xin (School of Electrical Engineering, Yanshan University, Qinhuangdao, Hebei 066004)
Source
《计算机学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2024, Issue 4, pp. 759-775 (17 pages)
Chinese Journal of Computers
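Funding
National Natural Science Foundation of China, General Program (No. 62173289)
Aeronautical Science Foundation of China (No. 20200016099002)
National Natural Science Foundation of China, Young Scientists Fund (No. 61501394)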
Keywords
autonomous driving
3D object detection
pseudo point cloud
attention mechanism
depth-wise separable convolution
group convolution