Multi-view stereo reconstruction algorithm based on attention mechanism

Abstract: To address the poor reconstruction completeness and limited generalization of multi-view stereo reconstruction in complex scenes with uneven illumination, weak texture, and non-Lambertian surfaces, this paper proposes a multi-view stereo reconstruction algorithm based on the attention mechanism. In the feature extraction stage, the algorithm adopts a multi-scale feature extraction module built on depthwise separable convolution and self-attention, which enlarges the receptive field while strengthening the spatial feature relationships among views, improving the network's feature representation in complex scenes and enabling more accurate feature matching. In the cost volume regularization stage, a channel attention mechanism adaptively adjusts the weights of different channels, reducing the interference of irrelevant information, filtering background noise, and improving the model's generalization ability. On the DTU dataset, the algorithm achieves completeness and overall scores of 0.286 and 0.334, respectively, improvements of 25.71% and 5.92% over the baseline algorithm CasMVSNet; compared with other state-of-the-art (SOTA) algorithms, the reconstructed point clouds are also structurally more complete in complex scenes. On the Tanks and Temples intermediate set, the reconstructed point clouds reach an overall F-score of 61.49, indicating that the algorithm has better robustness and generalization ability.
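The abstract does not say which channel attention design is used in cost volume regularization. As a minimal sketch only, the following assumes a squeeze-and-excitation style gate: each channel of the cost volume is averaged into one descriptor, a small bottleneck network turns the descriptors into per-channel weights in (0, 1), and each channel is rescaled by its weight. The function names, the flattened volume layout, and the randomly initialised weights `w_reduce` and `w_expand` are hypothetical illustrations, not the authors' implementation; in a trained network these weights would be learned.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(weights, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def channel_attention(volume, w_reduce, w_expand):
    """SE-style channel attention over a cost volume (assumed design).

    volume:   list of C channels, each a flat list of matching costs
              (depth x height x width flattened; the layout is an assumption).
    w_reduce: (C // r) x C weights of the bottleneck layer (hypothetical).
    w_expand: C x (C // r) weights of the expansion layer (hypothetical).
    Returns the rescaled volume and the per-channel gates in (0, 1).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(ch) / len(ch) for ch in volume]
    # Excite: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, h) for h in matvec(w_reduce, squeezed)]
    gates = [sigmoid(g) for g in matvec(w_expand, hidden)]
    # Rescale: multiply every element of a channel by that channel's gate.
    scaled = [[gate * x for x in ch] for gate, ch in zip(gates, volume)]
    return scaled, gates


# Toy example: 8 channels, 4 depth hypotheses, 3 x 3 pixels, reduction r = 4.
random.seed(0)
channels, voxels, reduced = 8, 4 * 3 * 3, 2
toy_volume = [[random.random() for _ in range(voxels)] for _ in range(channels)]
toy_reduce = [[random.uniform(-1, 1) for _ in range(channels)] for _ in range(reduced)]
toy_expand = [[random.uniform(-1, 1) for _ in range(reduced)] for _ in range(channels)]

_, toy_gates = channel_attention(toy_volume, toy_reduce, toy_expand)
print([round(g, 3) for g in toy_gates])  # one weight per channel
```

Channels whose gate is close to 0 are suppressed and channels whose gate is close to 1 pass through almost unchanged, which is the sense in which such a mechanism can down-weight irrelevant information before regularization.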

Authors: Zhu Daixian; Gong Ruolin; Kong Haoran; Liu Shulin (School of Communication and Information Engineering, Xi'an University of Science and Technology, Xi'an 710054, China; School of Electrical and Control Engineering, Xi'an University of Science and Technology, Xi'an 710054, China)

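Affiliations: School of Communication and Information Engineering, Xi'an University of Science and Technology; School of Electrical and Control Engineering, Xi'an University of Science and Technology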

Source: Electronic Measurement Technology (《电子测量技术》, listed in the Peking University Core Journals index), 2024, No. 16, pp. 130-138 (9 pages)

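Funding: Supported by the Shaanxi Province Key Research and Development Program (2021GY-338) and the Xi'an Beilin District 2023 Applied Technology R&D Reserve Project (GX2333).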

Keywords: 3D reconstruction; multi-view stereo; attention mechanism

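Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]; TN911.73 [Electronics and Telecommunications: Communication and Information Systems]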

