Depth Estimation of Monocular Indoor Scenes Based on Attention Mechanism and Multi-level Correction

Abstract: Scene depth estimation has a wide range of applications in 3D vision. To address the low accuracy and poor prediction of fine-grained detail in monocular indoor scene depth estimation, a monocular depth estimation network based on an attention mechanism and multi-level correction is proposed. The network first uses a dual-branch module that combines a self-attention Transformer with a convolutional neural network to extract multi-resolution features from the color image. A module based on spatial-domain attention then progressively fuses the extracted multi-resolution features. Finally, the fused features are processed by multi-level correction, and depth images at different resolutions are estimated progressively. Experimental results show that, compared with similar methods, the proposed network effectively improves the prediction of fine-grained detail in depth images, and several of its evaluation metrics improve to varying degrees.
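
As a rough illustration of the spatial-attention fusion step mentioned in the abstract, the sketch below fuses a coarse feature map into a finer one using a per-pixel gate. It is not the authors' module, whose design is not given in this record. The function names (`upsample2x`, `spatial_attention`, `fuse`), the gate built from channel-wise mean and max, the nearest-neighbour upsampling, and the requirement that both maps share a channel count are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a [C][H][W] nested-list feature map."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column twice
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row twice
        out.append(rows)
    return out


def spatial_attention(fmap):
    """Return an [H][W] map with one weight in (0, 1) per pixel.

    Assumption: the gate is a sigmoid of the average of the channel-wise
    mean and max. A learned module would normally use a convolution here.
    """
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    att = []
    for i in range(h):
        row = []
        for j in range(w):
            vals = [fmap[k][i][j] for k in range(c)]
            row.append(sigmoid(0.5 * (sum(vals) / c + max(vals))))
        att.append(row)
    return att


def fuse(fine, coarse):
    """Blend a fine map with an upsampled coarse map, pixel by pixel.

    fused = att * fine + (1 - att) * upsample2x(coarse)
    Assumption: `coarse` has half the spatial size and the same channel count.
    """
    up = upsample2x(coarse)
    att = spatial_attention(fine)
    c, h, w = len(fine), len(fine[0]), len(fine[0][0])
    return [
        [
            [att[i][j] * fine[k][i][j] + (1.0 - att[i][j]) * up[k][i][j]
             for j in range(w)]
            for i in range(h)
        ]
        for k in range(c)
    ]


# Usage: a 2-channel 2x2 fine map fused with a 2-channel 1x1 coarse map.
fine = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
coarse = [[[2.0]], [[4.0]]]
fused = fuse(fine, coarse)
```

Applying `fuse` repeatedly from the lowest resolution upward would give one possible form of progressive fusion; how the paper orders or parameterizes these steps is not stated here.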

Authors: LIU Peng (刘鹏); DING Aihua (丁爱华); DOU Xinyu (窦新宇) (Intelligence and Information Engineering College, Tangshan University, Tangshan 063000, China)

Affiliation: Intelligence and Information Engineering College, Tangshan University (唐山学院智能与信息工程学院)

Source: Modern Information Technology (《现代信息科技》), 2024, No. 5, pp. 106-110 (5 pages)

Funding: Tangshan Municipal Science and Technology Plan Project (22130205H).

Keywords: monocular depth estimation; Transformer; attention mechanism; multi-level correction

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

