Abstract
Three-dimensional (3D) object detection remains challenging because of scene complexity, variation in object scale, and occlusion. Cross-modal fusion of image and LiDAR point cloud features can effectively improve 3D detection performance, but both the fusion quality and the resulting detection performance leave room for improvement. This paper therefore proposes a 3D object detection method based on image semantic feature guidance and cross-modal fusion with point clouds. First, an image semantic feature learning network is designed; it computes position and channel self-attention in two parallel branches to enhance global semantic features and reduce object misclassification. Second, a local fusion module guided by image semantic features is proposed; it uses element-wise concatenation to fuse the retrieved local image semantic features with the point cloud data, which better addresses the semantic alignment problem in cross-modal information fusion. Third, a multi-scale re-fusion network is proposed, with an interaction module between the fused features and the LiDAR point cloud that learns the re-fusion of the fused features with features at different resolutions, improving detection performance. Finally, four task losses are adopted to realize anchor-free 3D multi-object detection. In comparisons with other methods on the KITTI and nuScenes datasets, the method reaches a 3D object detection accuracy of 87.15%, and the experimental results show that it outperforms the compared methods.
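The local fusion step described in the abstract (retrieving an image semantic feature for each LiDAR point and concatenating it element-wise with the point's own feature) can be illustrated with a minimal sketch. This is not the authors' implementation. The 3x4 projection matrix `P`, the nearest-pixel lookup, the zero vector used for points that fall outside the image, and the function names `project_point` and `fuse_points_with_image` are all assumptions introduced here for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def project_point(P: Sequence[Sequence[float]],
                  xyz: Sequence[float]) -> Optional[Tuple[float, float]]:
    """Project a 3D point to pixel coordinates (u, v) with a 3x4 matrix P.

    Returns None when the point lies behind the camera (assumed convention).
    """
    x, y, z = xyz
    h = [P[r][0] * x + P[r][1] * y + P[r][2] * z + P[r][3] for r in range(3)]
    if h[2] <= 0:
        return None
    return h[0] / h[2], h[1] / h[2]


def fuse_points_with_image(points: Sequence[Sequence[float]],
                           point_feats: Sequence[Sequence[float]],
                           image_feats: Sequence[Sequence[Sequence[float]]],
                           P: Sequence[Sequence[float]]) -> List[List[float]]:
    """Concatenate each point feature with the image feature at its projection.

    image_feats[v][u] holds the semantic feature vector of pixel (u, v).
    Points that project outside the image get a zero semantic vector
    (an assumption of this sketch, not something the abstract specifies).
    """
    height, width = len(image_feats), len(image_feats[0])
    c_img = len(image_feats[0][0])
    fused = []
    for xyz, pf in zip(points, point_feats):
        sem = [0.0] * c_img
        uv = project_point(P, xyz)
        if uv is not None:
            u, v = int(round(uv[0])), int(round(uv[1]))  # nearest pixel
            if 0 <= u < width and 0 <= v < height:
                sem = list(image_feats[v][u])
        fused.append(list(pf) + sem)  # element-wise concatenation per point
    return fused
```

Each fused row has length `C_point + C_image`, so a downstream point cloud backbone sees the geometric and semantic channels side by side. A real pipeline would likely use bilinear sampling and calibrated LiDAR-to-camera transforms rather than the simplified projection assumed here.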
Authors
李辉
王俊印
程远志
刘健
赵国伟
陈双敏
Li Hui; Wang Junyin; Cheng Yuanzhi; Liu Jian; Zhao Guowei; Chen Shuangmin (School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061; Faculty of Computing, Harbin Institute of Technology, Harbin 150001; College of Computer Science, Nankai University, Tianjin 300071)
Source
《计算机辅助设计与图形学学报》
EI
CSCD
PKU Core (北大核心)
2024, Issue 5, pp. 734-749 (16 pages)
Journal of Computer-Aided Design & Computer Graphics
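Funding
National Natural Science Foundation of China (62002190, 61702295)
National Key Research and Development Program of China (2023YFF0612102)
Natural Science Foundation of Shandong Province (ZR2020MF036)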
Keywords
3D object detection
cross-modal
semantic feature
point cloud
anchor-free