Abstract
To address the poor reconstruction completeness and limited generalization of multi-view stereo (MVS) reconstruction in complex scenes with uneven illumination, weak texture, and non-Lambertian surfaces, this paper proposes an attention-based multi-view stereo reconstruction algorithm. In the feature extraction stage, the algorithm uses a multi-scale feature extraction module built on depthwise separable convolution and self-attention, which enlarges the receptive field while strengthening the spatial feature relationships across views, so that the network represents features better in complex scenes and matches them more accurately. In the cost volume regularization stage, a channel attention mechanism adaptively adjusts the weights of different channels, which reduces interference from irrelevant information and filters background noise, improving generalization. On the DTU dataset, the algorithm achieves completeness and overall scores of 0.286 and 0.334, improvements of 25.71% and 5.92%, respectively, over the baseline CasMVSNet; compared with other state-of-the-art (SOTA) algorithms, its reconstructed point clouds are also structurally more complete in complex scenes. On the Tanks and Temples intermediate set, the reconstructed point clouds reach an F-score of 61.49, indicating that the algorithm has better robustness and generalization ability.
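The DTU figures quoted in the abstract can be sanity-checked with a little arithmetic. DTU completeness and overall scores are distances (conventionally in mm), so lower is better and "improved by 25.71%" means a 25.71% reduction relative to the baseline. The sketch below is not part of the paper: it back-computes the baseline scores implied by the reported numbers and then re-derives the percentages. The function names are hypothetical, and the baseline values it produces are inferred from the abstract, not quoted from it.

```python
# Values reported in the abstract (DTU dataset; lower is better).
REPORTED = {
    "completeness": {"ours": 0.286, "reduction_pct": 25.71},
    "overall": {"ours": 0.334, "reduction_pct": 5.92},
}


def implied_baseline(ours: float, reduction_pct: float) -> float:
    """Invert ours = baseline * (1 - r) to recover the implied baseline score."""
    return ours / (1.0 - reduction_pct / 100.0)


def relative_reduction_pct(baseline: float, ours: float) -> float:
    """Relative error reduction with respect to the baseline, in percent."""
    return (baseline - ours) / baseline * 100.0


if __name__ == "__main__":
    for name, row in REPORTED.items():
        # Round to three decimals, the precision used in the abstract.
        base = round(implied_baseline(row["ours"], row["reduction_pct"]), 3)
        check = relative_reduction_pct(base, row["ours"])
        print(f"{name}: implied baseline ~ {base:.3f}, reduction ~ {check:.2f}%")
```

Rounding the implied baseline to three decimals and recomputing the reduction reproduces the percentages stated in the abstract, which suggests the two reported improvements are internally consistent.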
Authors
Zhu Daixian
Gong Ruolin
Kong Haoran
Liu Shulin
(School of Communication and Information Engineering, Xi’an University of Science and Technology, Xi’an 710054, China; School of Electrical and Control Engineering, Xi’an University of Science and Technology, Xi’an 710054, China)
Source
Electronic Measurement Technology (《电子测量技术》)
Peking University Core Journal
2024, No. 16, pp. 130-138 (9 pages)
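Funding
Supported by the Key Research and Development Program of Shaanxi Province (2021GY-338)
and the 2023 Applied Technology R&D Reserve Project of Beilin District, Xi’an (GX2333).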
Keywords
3D reconstruction
multi-view stereo
attention mechanism