Abstract
Multi-view stereo matching is an active research topic in computer vision. To address the poor completeness of current multi-view stereo reconstruction, its inability to process high-resolution images, its heavy GPU memory consumption, and its long running time, we propose a self-attention-based deep learning network (SA-PatchmatchNet). First, a feature extraction module extracts image features and passes them to a learnable Patchmatch module to obtain a depth map, which is then refined to produce the final depth map. To capture the information that matters for depth inference, a self-attention mechanism is integrated into the feature extraction module, which strengthens the network's feature extraction ability. Experiments on the Technical University of Denmark (DTU) dataset show that, compared with PatchmatchNet, SA-PatchmatchNet improves reconstruction completeness by 5.8% and the overall score by 2.3%. Compared with other state-of-the-art (SOTA) methods, it also achieves considerably better completeness and overall scores.
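The abstract does not specify how the self-attention module is built, so the following is only a minimal sketch of generic scaled dot-product self-attention applied to a flattened feature map, with a residual connection. It is not the authors' implementation. The function names, the projection matrices `w_q`, `w_k`, `w_v`, and the residual connection are all assumptions made for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features, w_q, w_k, w_v):
    """Generic scaled dot-product self-attention (illustrative only).

    features: N x C list of per-pixel feature vectors (a flattened feature map).
    w_q, w_k, w_v: C x C projection matrices; hypothetical stand-ins for
    weights that a real network would learn.
    Returns the attended features (N x C) and the attention weights (N x N).
    """
    q = matmul(features, w_q)
    k = matmul(features, w_k)
    v = matmul(features, w_v)

    # Similarity of every position with every other position, scaled by sqrt(C).
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]

    # Each output vector is a weighted mix of all value vectors.
    attended = matmul(weights, v)

    # Residual connection (an assumption): add the attended signal to the input.
    out = [[f + a for f, a in zip(f_row, a_row)]
           for f_row, a_row in zip(features, attended)]
    return out, weights
```

In a feature extractor, a block like this would let each pixel's feature aggregate context from the whole image before matching, which is one plausible reading of how attention could help depth inference in weakly textured regions.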
Authors
Zhu Guangzhao (朱光照)
Wei Bo (韦博)
Yang Afeng (杨阿峰)
Xu Xin (徐欣)
School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310037, Zhejiang, China
Source
Laser & Optoelectronics Progress (《激光与光电子学进展》)
CSCD
Peking University Core Journal (北大核心)
2023, Issue 16, pp. 315-322 (8 pages)
Keywords
deep learning
3D reconstruction
multi-view stereo
self-attention mechanism