3D Object Detection Based on Multilayer Multimodal Fusion
Abstract: Cameras and lidar are the key sources of information in the perception systems of autonomous vehicles (AVs). In current 3D object detection tasks, however, most pure point cloud networks outperform networks that fuse images with lidar point clouds. Existing studies attribute this to the viewpoint misalignment between image and lidar data and to the difficulty of matching heterogeneous features, which make it hard for a single-stage fusion algorithm to fully fuse the features of both modalities. To address this, the paper proposes a new 3D object detection method based on multilayer multimodal fusion (3DMMF). First, in the early-fusion stage, the point cloud inside the frustum formed by each 2D detection box is encoded by locally and sequentially painting it with color (Red Green Blue, RGB) information, a step called Frustum-RGB-PointPainting (FRP). The encoded point cloud is then fed into a channel-expanded PointPillars detection network that incorporates self-attention-based context awareness. In the late-fusion stage, the 2D and 3D candidate boxes are encoded as two sets of sparse tensors before non-maximum suppression, and the camera-lidar object candidates fusion (CLOCs) network produces the final 3D object detection results. Experiments on the KITTI dataset show that the fusion method significantly outperforms the pure point cloud baseline, improving mean average precision (mAP) by 6.24% on average.
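The early-fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' FRP implementation, because the abstract does not specify the local, sequential encoding. It only shows the general idea of painting lidar points that fall inside the frustum of a 2D detection box with the RGB value of the pixel they project to. The names `project` and `paint_points`, the 3x4 pinhole projection matrix `P`, the assumption that points are already in the camera frame, and the choice to give unpainted points zero color are all assumptions made for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]         # (x, y, z), assumed to be in the camera frame
Box2D = Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max) in pixels
Pixel = Tuple[int, int, int]               # (R, G, B), each 0..255


def project(point: Point, P: Sequence[Sequence[float]]) -> Optional[Tuple[float, float]]:
    """Project a 3D point with a 3x4 pinhole matrix; return None if it is behind the camera."""
    x, y, z = point
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    if h[2] <= 0.0:
        return None
    return h[0] / h[2], h[1] / h[2]


def paint_points(
    points: Sequence[Point],
    P: Sequence[Sequence[float]],
    image: Sequence[Sequence[Pixel]],
    boxes: Sequence[Box2D],
) -> List[Tuple[float, ...]]:
    """Append normalized RGB to points inside any 2D box frustum; others get zeros (assumption)."""
    height, width = len(image), len(image[0])
    painted = []
    for p in points:
        rgb = (0.0, 0.0, 0.0)
        uv = project(p, P)
        if uv is not None:
            u, v = uv
            col, row = math.floor(u), math.floor(v)
            in_image = 0 <= col < width and 0 <= row < height
            # A point lies in a box's frustum exactly when its projection lies in the box.
            in_box = any(b[0] <= u <= b[2] and b[1] <= v <= b[3] for b in boxes)
            if in_image and in_box:
                r, g, b_ = image[row][col]
                rgb = (r / 255.0, g / 255.0, b_ / 255.0)
        painted.append((p[0], p[1], p[2]) + rgb)
    return painted
```

Each output tuple has six channels, `(x, y, z, r, g, b)`, which is one plausible reading of why the downstream PointPillars network needs expanded input channels. A real pipeline would also transform points from the lidar frame to the camera frame using the dataset's calibration before projecting.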
Authors: ZHOU Zhi-guo; MA Wen-hao (School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China)
Source: Acta Electronica Sinica (《电子学报》), 2024, No. 3, pp. 696-708 (13 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: Equipment Pre-Research Field Fund (No. 61403120109).
Keywords: autonomous driving; multi-sensor fusion; 3D object detection; point cloud encoding; self-attention mechanism