Monocular 3D object detection method integrating depth and instance segmentation
Abstract Monocular 3D object detection performs poorly when the apparent size of an object changes with the viewing angle and when objects are occluded. To address these limitations, a new monocular 3D object detection method that fuses depth information with instance segmentation masks is proposed. First, a Depth-Mask Attention Fusion (DMAF) module combines depth information with instance segmentation masks to provide more accurate object boundaries. Second, dynamic convolution is introduced, and the fused features produced by the DMAF module guide the generation of the dynamic convolution kernels so that objects of different scales can be handled. Third, a 2D-3D bounding box consistency loss is added to the loss function; it adjusts the predicted 3D bounding box to be highly consistent with the corresponding 2D detection box, which improves both instance segmentation and 3D object detection. Finally, ablation experiments verify the effectiveness of the method, and the method is evaluated on the KITTI test set. The results show that, compared with methods using only depth estimation maps and instance segmentation masks, the proposed method improves the average precision for the vehicle category at moderate difficulty by 6.36 percentage points, and that it outperforms comparison methods such as D4LCN (Depth-guided Dynamic-Depthwise-Dilated Local Convolutional Network) and M3D-RPN (Monocular 3D Region Proposal Network) on both 3D object detection and bird's-eye-view object detection.
Authors SUN Xun (孙逊); FENG Ruifeng (冯睿锋); CHEN Yanru (陈彦如) (Line Station Design and Research Institute, China Railway Siyuan Survey and Design Group Company Limited, Wuhan, Hubei 430063, China; College of Economics and Management, Southwest Jiaotong University, Chengdu, Sichuan 610031, China)
Source Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2024, No. 7, pp. 2208-2215 (8 pages)
Funding National Natural Science Foundation of China (62173279).
Keywords monocular 3D object detection; deep learning; dynamic convolution; instance segmentation
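
The abstract mentions a 2D-3D bounding box consistency loss but does not state its form. As a rough illustration only, and not the authors' formulation, one common way to build such a term is to project the corners of the predicted 3D box into the image, take the enclosing rectangle, and penalize its distance from the 2D detection box. Everything in the sketch below is an assumption made for illustration: the function names, the (h, w, l) and yaw parameterisation of the box, the pinhole intrinsics fx, fy, u0, v0, and the choice of an L1 penalty.

```python
import math


def box3d_corners(center, dims, yaw):
    """Return the 8 corners of a 3D box in camera coordinates.

    center: (x, y, z) of the box centre, with z > 0 in front of the camera.
    dims:   (h, w, l), height along Y, width along Z, length along X at yaw 0.
    yaw:    rotation about the camera Y axis, in radians.
    This parameterisation is an assumption of the sketch.
    """
    cx, cy, cz = center
    h, w, l = dims
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-l / 2, l / 2):
        for dy in (-h / 2, h / 2):
            for dz in (-w / 2, w / 2):
                # Rotate the corner offset about the Y axis, then translate.
                x = cx + c * dx + s * dz
                z = cz - s * dx + c * dz
                corners.append((x, cy + dy, z))
    return corners


def project_to_2d_box(corners, fx, fy, u0, v0):
    """Pinhole-project the corners; return the enclosing box (x1, y1, x2, y2)."""
    us = [fx * x / z + u0 for x, _, z in corners]
    vs = [fy * y / z + v0 for _, y, z in corners]
    return (min(us), min(vs), max(us), max(vs))


def consistency_loss(box_from_3d, box_2d):
    """Mean absolute difference between box coordinates (assumed L1 form)."""
    return sum(abs(a - b) for a, b in zip(box_from_3d, box_2d)) / 4.0


# Hypothetical example: a box 10 m ahead, compared with a made-up 2D detection.
corners = box3d_corners(center=(0.0, 0.0, 10.0), dims=(2.0, 2.0, 4.0), yaw=0.0)
projected = project_to_2d_box(corners, fx=100.0, fy=100.0, u0=0.0, v0=0.0)
detected = (-20.0, -10.0, 20.0, 10.0)
loss = consistency_loss(projected, detected)
```

In a training setup this term would be computed on differentiable tensors and weighted against the other loss components; the plain-Python version here only shows the geometry.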