179 articles found
Monocular Depth Estimation with Sharp Boundary
1
Authors: Xin Yang, Qingling Chang, Shiting Xu, Xinlin Liu, Yan Cui. Source: Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 7, pp. 573-592 (20 pages)
Monocular depth estimation is a basic task in computer vision. Its accuracy has improved tremendously over the past decade with the development of deep learning. However, blurry boundaries in the depth map remain a serious problem. Researchers have found that the blurry boundary is mainly caused by two factors. First, low-level features, which contain boundary and structure information, may be lost in deep networks during the convolution process. Second, because the boundary area makes up only a small portion of the whole image, the model ignores the errors introduced by the boundary area during backpropagation. To address these two factors, two countermeasures are proposed to mitigate the boundary blur problem. First, we design a scene understanding module and a scale transform module to build a lightweight fused feature pyramid, which deals effectively with low-level feature loss. Second, we propose a boundary-aware depth loss function that pays attention to the depth values at boundaries. Extensive experiments show that our method predicts depth maps with clearer boundaries, and that its depth accuracy on NYU-Depth V2, SUN RGB-D, and iBims-1 is competitive.
Keywords: monocular depth estimation; object boundary; blurry boundary; scene global information; feature fusion; scale transform; boundary aware
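The abstract above does not give the form of its boundary-aware depth loss. As a rough illustration of the general idea only, the sketch below up-weights the L1 depth error at pixels where the ground-truth depth changes sharply. The function name `boundary_weighted_l1`, the gradient threshold, and the weight value are hypothetical choices made for this example, not the authors' formulation.

```python
def boundary_weighted_l1(pred, gt, grad_thresh=0.5, boundary_weight=5.0):
    """Weighted mean L1 depth error with extra weight on boundary pixels.

    pred, gt: depth maps given as equal-sized lists of rows of floats.
    A pixel is treated as "boundary" when the ground-truth depth jumps by
    more than grad_thresh towards its right or lower neighbour. This rule
    and both default values are illustrative assumptions.
    """
    rows, cols = len(gt), len(gt[0])
    total, weight_sum = 0.0, 0.0
    for r in range(rows):
        for c in range(cols):
            # Forward differences; zero at the last column / last row.
            dx = abs(gt[r][c + 1] - gt[r][c]) if c + 1 < cols else 0.0
            dy = abs(gt[r + 1][c] - gt[r][c]) if r + 1 < rows else 0.0
            w = boundary_weight if max(dx, dy) > grad_thresh else 1.0
            total += w * abs(pred[r][c] - gt[r][c])
            weight_sum += w
    return total / weight_sum
```

With `boundary_weight=1.0` the function reduces to a plain mean absolute error, which makes it easy to compare how much a given error is amplified when it falls on a depth discontinuity.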
Boosting Unsupervised Monocular Depth Estimation with Auxiliary Semantic Information
2
Authors: Hui Ren, Nan Gao, Jia Li. Source: China Communications (SCIE, CSCD), 2021, No. 6, pp. 228-243 (16 pages)
Learning-based multi-task models have been widely used in various scene understanding tasks, and the tasks complement each other, i.e., they allow us to consider prior semantic information to better infer depth. We boost unsupervised monocular depth estimation using semantic segmentation as an auxiliary task. To address the lack of cross-domain datasets and the catastrophic forgetting encountered in multi-task training, we use an existing methodology to obtain redundant segmentation maps and build our cross-domain dataset, which not only provides a new way to conduct multi-task training but also helps us compare our results with those of other algorithms. In addition, to make comprehensive use of the features extracted for the two tasks in the early perception stage, we share weights in the network to fuse cross-domain features and introduce a novel multi-task loss function to further smooth the depth values. Extensive experiments on the KITTI and Cityscapes datasets show that our method achieves state-of-the-art performance in the depth estimation task, as well as improved semantic segmentation.
Keywords: unsupervised monocular depth estimation; semantic segmentation; multi-task model
RADepthNet: Reflectance-aware monocular depth estimation
3
Authors: Chuxuan LI, Ran YI, Saba Ghazanfar ALI, Lizhuang MA, Enhua WU, Jihong WANG, Lijuan MAO, Bin SHENG. Source: Virtual Reality & Intelligent Hardware, 2022, No. 5, pp. 418-431 (14 pages)
Background: Monocular depth estimation aims to predict a dense depth map from a single RGB image and has important applications in 3D reconstruction, autonomous driving, and augmented reality. However, existing methods feed the original RGB image directly into the model to extract depth features without preventing depth-irrelevant information from interfering with depth-estimation accuracy, which leads to inferior performance. Methods: To remove the influence of depth-irrelevant information and improve depth-prediction accuracy, we propose RADepthNet, a novel reflectance-guided network that fuses boundary features. Specifically, our method predicts depth maps in three steps: (1) Intrinsic Image Decomposition. We propose a reflectance extraction module consisting of an encoder-decoder structure to extract the depth-related reflectance. Through an ablation study, we demonstrate that the module can reduce the influence of illumination on depth estimation. (2) Boundary Detection. A boundary extraction module, consisting of an encoder, a refinement block, and an upsample block, is proposed to better predict depth at object boundaries using gradient constraints. (3) Depth Prediction Module. We use an encoder different from that in (2) to obtain depth features from the reflectance map and fuse boundary features to predict depth. In addition, we propose FIFADataset, a depth-estimation dataset for soccer scenarios. Results: Extensive experiments on a public dataset and our proposed FIFADataset show that our method achieves state-of-the-art performance.
Keywords: monocular depth estimation; deep learning; intrinsic image decomposition
Monocular depth estimation based on deep learning: An overview (cited by: 24)
4
Authors: ZHAO ChaoQiang, SUN QiYu, ZHANG ChongZhen, TANG Yang, QIAN Feng. Source: Science China (Technological Sciences) (SCIE, EI, CAS, CSCD), 2020, No. 9, pp. 1612-1627 (16 pages)
Depth information is important for autonomous systems to perceive environments and estimate their own state. Traditional depth estimation methods, such as structure from motion and stereo vision matching, are built on feature correspondences across multiple viewpoints, and the depth maps they predict are sparse. Inferring depth information from a single image (monocular depth estimation) is an ill-posed problem. With the rapid development of deep neural networks, monocular depth estimation based on deep learning has been widely studied recently and has achieved promising accuracy. Deep neural networks estimate dense depth maps from single images in an end-to-end manner. To improve the accuracy of depth estimation, different kinds of network frameworks, loss functions, and training strategies have subsequently been proposed. Therefore, we survey the current monocular depth estimation methods based on deep learning in this review. First, we summarize several widely used datasets and evaluation indicators in deep learning-based depth estimation. Next, we review some representative existing methods according to their training manner: supervised, unsupervised, and semi-supervised. Finally, we discuss the challenges and provide some ideas for future research in monocular depth estimation.
Keywords: autonomous systems; monocular depth estimation; deep learning; unsupervised learning
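The overview abstract above does not list the evaluation indicators it covers. As a sketch, the code below computes three indicators that other abstracts in this listing report: absolute relative error, root mean square error, and threshold accuracy (δ < 1.25). The function name `depth_metrics` and the flat-list input format are assumptions made for this example; pixels without valid ground truth (depth ≤ 0) are skipped, and predictions are assumed to be strictly positive.

```python
import math


def depth_metrics(pred, gt, threshold=1.25):
    """Common depth-estimation metrics over flat sequences of depths.

    pred, gt: equal-length sequences of predicted and ground-truth depths.
    Pixels with gt <= 0 are treated as invalid and ignored.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    n = len(pairs)
    # Absolute relative error: mean of |pred - gt| / gt.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    # Root mean square error, in the same unit as the depths.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # Threshold accuracy: share of pixels with max(pred/gt, gt/pred) < threshold.
    delta = sum(1 for p, g in pairs if max(p / g, g / p) < threshold) / n
    return {"abs_rel": abs_rel, "rmse": rmse, "delta": delta}
```

Lower is better for `abs_rel` and `rmse`, higher is better for `delta`. Papers usually also report the stricter thresholds 1.25² and 1.25³, which this function supports through the `threshold` argument.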
DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation (cited by: 1)
5
Authors: Zhenyu Li, Zehui Chen, Xianming Liu, Junjun Jiang. Source: Machine Intelligence Research (EI, CSCD), 2023, No. 6, pp. 837-854 (18 pages)
This paper addresses the problem of supervised monocular depth estimation. We start with a meticulous pilot study to demonstrate that long-range correlation is essential for accurate depth estimation. Moreover, the Transformer and convolution are good at long-range and close-range depth estimation, respectively. Therefore, we propose a parallel encoder architecture consisting of a Transformer branch and a convolution branch. The former models global context with an effective attention mechanism, and the latter preserves local information, since the Transformer lacks the spatial inductive bias needed to model such content. However, independent branches lead to a shortage of connections between features. To bridge this gap, we design a hierarchical aggregation and heterogeneous interaction module to enhance the Transformer features and model the affinity between the heterogeneous features in a set-to-set translation manner. Because global attention on high-resolution feature maps incurs an unbearable memory cost, we adopt a deformable scheme to reduce the complexity. Extensive experiments on the KITTI, NYU, and SUN RGB-D datasets demonstrate that our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods by prominent margins. The effectiveness of each proposed module is evaluated through intensive ablation studies.
Keywords: autonomous driving; 3D reconstruction; monocular depth estimation; Transformer; convolution
ArthroNet: a monocular depth estimation technique with 3D segmented maps for knee arthroscopy (cited by: 1)
6
Authors: Shahnewaz Ali, Ajay K. Pandey. Source: Intelligent Medicine (CSCD), 2023, No. 2, pp. 129-138 (10 pages)
Background: Lack of depth perception from medical imaging systems is one of the long-standing technological limitations of minimally invasive surgeries. The ability to visualize anatomical structures in 3D can improve conventional arthroscopic surgeries, as a full 3D semantic representation of the surgical site can directly improve surgeons' ability. It also brings the possibility of intraoperative image registration with preoperative clinical records for the development of semi-autonomous and fully autonomous platforms. This study aimed to present a novel monocular depth prediction model to infer depth maps from a single color arthroscopic video frame. Methods: We applied a novel technique that combines supervised and self-supervised loss terms and thus eliminates the drawbacks of each technique. It enabled the estimation of edge-preserving depth maps from a single untextured arthroscopic frame. The proposed image acquisition technique projected artificial textures onto the surface to improve the quality of disparity maps from stereo images. Moreover, by integrating an attention-aware multi-scale feature extraction technique with scene global contextual constraints and multiscale depth fusion, the model could predict reliable and accurate tissue depth of the surgical sites that complies with scene geometry. Results: A total of 4,128 stereo frames from a knee phantom were used to train a network, and during the pre-training stage, the network learned disparity maps from the stereo images. The fine-tuning phase used 12,695 knee arthroscopic stereo frames from cadaver experiments along with their corresponding coarse disparity maps obtained from the stereo matching technique. In a supervised fashion, the network learns the transformation from the left image to the disparity map, whereas the self-supervised loss term refines the coarse depth map by minimizing reprojection, gradient, and structural dissimilarity losses. Together, our method produces high-quality 3D maps with minimal re-projection loss: 0.0004132 (structural similarity index), 0.00036120156 (L1 error distance), and 6.591908×10^(−5) (L1 gradient error distance). Conclusion: Machine learning techniques for monocular depth prediction are studied to infer accurate depth maps from a single color arthroscopic video frame. Moreover, the study integrates a segmentation model; hence, 3D segmented maps are inferred that provide extended perception ability and tissue awareness.
Keywords: monocular depth estimation technique; 3D segmented maps; Knee arthroscopic
Self-Supervised Monocular Depth Estimation by Digging into Uncertainty Quantification
7
Authors: 李远珍, 郑圣杰, 谭梓欣, 曹拓, 罗飞, 肖春霞. Source: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2023, No. 3, pp. 510-525 (16 pages)
Based on well-designed network architectures and objective functions, self-supervised monocular depth estimation has made great progress. However, lacking a specific mechanism to make the network learn more about regions containing moving objects or occlusions, existing depth estimation methods are likely to produce poor results for them. Therefore, we propose an uncertainty quantification method to improve the performance of existing depth estimation networks without changing their architectures. Our uncertainty quantification method consists of uncertainty measurement, learning guidance by uncertainty, and ultimate adaptive determination. First, with the Snapshot and Siam learning strategies, we measure the degree of uncertainty by calculating the variance of pre-converged epochs or twins during training. Second, we use the uncertainty to guide the network to strengthen learning in regions with more uncertainty. Finally, we use the uncertainty to adaptively produce the final depth estimation results with a balance of accuracy and robustness. To demonstrate the effectiveness of our uncertainty quantification method, we apply it to two state-of-the-art models, Monodepth2 and Hints. Experimental results show that our method improves depth estimation performance on seven evaluation metrics compared with the two baseline models and exceeds the existing uncertainty method.
Keywords: self-supervised monocular depth estimation; uncertainty quantification; variance
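The abstract above measures uncertainty as the variance across predictions from several training snapshots (or twin networks). The sketch below shows only that variance step, assuming the K depth maps have already been predicted and share one resolution. The function name `pixelwise_variance` and the list-of-rows input format are hypothetical, and the paper's learning-guidance and adaptive-determination steps are not reproduced here.

```python
from statistics import pvariance


def pixelwise_variance(depth_maps):
    """Per-pixel population variance across K depth maps.

    depth_maps: list of K depth maps, each a list of rows of floats,
    all with the same shape. Returns one map of the same shape whose
    values are larger where the K predictions disagree more.
    """
    rows, cols = len(depth_maps[0]), len(depth_maps[0][0])
    return [
        [pvariance([m[r][c] for m in depth_maps]) for c in range(cols)]
        for r in range(rows)
    ]
```

A map produced this way could then be used to weight a training loss or to choose between candidate predictions, which is the role the abstract assigns to the uncertainty in its second and third steps.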
Self-supervised coarse-to-fine monocular depth estimation using a lightweight attention module
8
Authors: Yuanzhen Li, Fei Luo, Chunxia Xiao. Source: Computational Visual Media (SCIE, EI, CSCD), 2022, No. 4, pp. 631-647 (17 pages)
Self-supervised monocular depth estimation has been widely investigated and applied in previous works. However, existing methods suffer from texture-copy, depth drift, and incomplete structure. It is difficult for ordinary CNNs to fully understand the relationship between an object and its surrounding environment. Moreover, it is hard to design a depth smoothness loss that balances depth smoothness and sharpness. To address these issues, we propose a coarse-to-fine method with a normalized convolutional block attention module (NCBAM). In the coarse estimation stage, we incorporate the NCBAM into the depth and pose networks to overcome the texture-copy and depth drift problems. Then, in the refinement stage, we use a new network to refine the coarse depth, guided by the color image, and produce a structure-preserving depth result. Our method produces results competitive with state-of-the-art methods. Comprehensive experiments demonstrate the effectiveness of our two-stage method using the NCBAM.
Keywords: monocular depth estimation; texture copy; depth drift; attention module
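Unsupervised monocular image depth prediction for soccer scenes based on improved FeatDepth
9
Authors: 傅荟璇, 徐权文, 王宇超. Source: 实验技术与管理 (CAS, PKU Core), 2024, No. 10, pp. 74-84 (11 pages)
To improve the accuracy of image depth prediction while reducing cost, and to apply depth estimation to soccer scenes, an unsupervised monocular image depth prediction method for soccer scenes based on an improved FeatDepth is proposed. First, an attention mechanism is introduced into the original FeatDepth so that the model focuses more on useful feature information. Second, the GAM global attention mechanism module is embedded in both the PoseNet and DepthNet networks of FeatDepth, adding extra contextual information to the networks and improving the depth prediction performance of the FeatDepth model with almost no increase in computational cost. Third, to obtain better depth prediction in low-texture regions and on fine details, the final loss function combines a single-view reconstruction loss with a cross-view reconstruction loss. A dataset was built from the portion of the KITTI dataset with many Person scenes, and simulation experiments were carried out; the results show that the improved FeatDepth model not only gains in accuracy but also predicts depth better in low-texture regions and on details. Finally, a comparison of the models' inference results on soccer scenes shows that the improved model predicts depth better in low-texture regions (the ball, the goal, etc.) and on details (limbs, etc.), achieving the goal of applying an unsupervised monocular depth estimation model to soccer scenes.
Keywords: soccer scenes; unsupervised monocular depth estimation; Featdepth; attention mechanism; GAM; image reconstruction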
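Plant height measurement method for maize seedlings based on Shuffle-ZoeDepth monocular depth estimation
10
Authors: 赵永杰, 蒲六如, 宋磊, 刘佳辉, 宋怀波. Source: 农业机械学报 (EI, CAS, CSCD, PKU Core), 2024, No. 5, pp. 235-243, 253 (10 pages)
Plant height is an important phenotypic indicator for identifying maize germplasm traits and crop vigor. The genetic characteristics of maize are clearly expressed at the seedling stage, so accurately measuring the height of maize seedlings is of great significance for identifying genetic characteristics and for field management. To address the problem that traditional plant height acquisition relies on manual measurement, which is time-consuming, laborious, and subject to subjective error, an improved ZoeDepth monocular depth estimation model that incorporates hybrid attention information is proposed. The improved model adds the Shuffle Attention module to the four stages of the Decoder module so that, when extracting information from low-resolution feature maps, the Decoder focuses more on the useful information in the feature maps; this improves the model's ability to extract key information and yields more accurate depth maps. To verify its effectiveness, the method was validated on the NYU-V2 depth dataset. The results show that the improved Shuffle-ZoeDepth model achieves an absolute relative error, root mean square error, and log root mean square error of 0.083, 0.301 mm, and 0.036 on the NYU-V2 depth dataset, and accuracies of 93.9%, 99.1%, and 99.8% under the different thresholds, all better than the ZoeDepth model. In addition, the Shuffle-ZoeDepth monocular depth estimation model was combined with a maize plant height measurement model to measure the height of maize seedlings, and images of maize seedlings were collected at different distances for plant height measurement tests. For maize heights in the three ranges 15-25 cm, 25-35 cm, and 35-45 cm, the mean absolute measurement errors were 1.41, 2.21, and 2.08 cm, and the mean percentage measurement errors were 8.41%, 7.54%, and 4.98%, respectively. The test results show that the method can accurately measure the height of maize seedlings in complex outdoor environments using only a single RGB camera.
Keywords: maize seedlings; plant height; monocular depth estimation; measurement method; hybrid attention mechanism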
Fusion of color and hallucinated depth features for enhanced multimodal deep learning-based damage segmentation
11
Authors: Tarutal Ghosh Mondal, Mohammad Reza Jahanshahi. Source: Earthquake Engineering and Engineering Vibration (SCIE, EI, CSCD), 2023, No. 1, pp. 55-68 (14 pages)
Recent advances in computer vision and deep learning have shown that fusing depth information can significantly enhance the performance of RGB-based damage detection and segmentation models. However, alongside its advantages, depth sensing also presents many practical challenges. For instance, depth sensors impose an additional payload burden on robotic inspection platforms, limiting operation time and increasing inspection cost. Additionally, some lidar-based depth sensors perform poorly outdoors because of sunlight contamination during the daytime. In this context, this study investigates the feasibility of abolishing depth sensing at test time without compromising segmentation performance. An autonomous damage segmentation framework is developed, based on recent advancements in vision-based multi-modal sensing such as modality hallucination (MH) and monocular depth estimation (MDE), which require depth data only during model training. At deployment time, depth data becomes expendable, as it can be simulated from the corresponding RGB frames. This makes it possible to reap the benefits of depth fusion without any depth perception per se. This study explored two depth encoding techniques and three fusion strategies in addition to a baseline RGB-based model. The proposed approach is validated on computer-generated RGB-D data of reinforced concrete buildings subjected to seismic damage. It was observed that the surrogate techniques can increase the segmentation IoU by up to 20.1% with a negligible increase in computation cost. Overall, this study is believed to make a positive contribution to enhancing the resilience of critical civil infrastructure.
Keywords: multimodal data fusion; depth sensing; vision-based inspection; UAV-assisted inspection; damage segmentation; post-disaster reconnaissance; modality hallucination; monocular depth estimation
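Monocular depth recovery method for assisting visual odometry initialization in corridor scenes
12
Authors: 徐晓苏, 刘烨豪, 姚逸卿, 夏若炎, 王子健, 范明泽. Source: 中国惯性技术学报 (EI, CSCD, PKU Core), 2024, No. 8, pp. 753-761 (9 pages)
Monocular cameras lack scale information, which limits their performance in applications such as visual odometry. Most existing work addresses this problem with deep learning-based methods, but their inference is slow and hard to run in real time. To address this, an explicit method for fast monocular depth recovery based on nonlinear optimization in corridor scenes is proposed. A virtual camera assumption is adopted to simplify solving for the camera attitude angles; the depth estimation problem is converted into an optimization problem by minimizing geometric residuals; and a depth plane construction method is designed to classify the depths of spatial points, enabling fast depth estimation in enclosed structured scenes such as corridors. Finally, the proposed method is applied to the initialization of monocular visual odometry, allowing the monocular visual odometry to obtain true scale information and improving its positioning accuracy. Experimental results show that the relative error of depth estimation within 3 m in corridor scenes is less than 8.4%, and that the method runs in real time at 20 FPS on an Intel Core i5-7300HQ CPU.
Keywords: visual odometry; monocular depth estimation; depth recovery; nonlinear optimization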
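Monocular image depth estimation with adaptive multi-scale feature fusion
13
Authors: 陈国军, 付云鹏, 于丽香, 崔涛. Source: 计算机系统应用, 2024, No. 7, pp. 121-128 (8 pages)
In deep learning-based monocular image depth estimation, convolutional neural networks lose image depth information during downsampling, which leads to poor depth estimation at object edges. A multi-scale feature fusion method is proposed that uses an adaptive fusion strategy, dynamically adjusting the fusion ratio of feature maps at different scales according to the feature data so that multi-scale feature information is fully exploited. In monocular depth estimation, atrous spatial pyramid pooling (ASPP) loses pixel information from the image, which affects predictions for small objects. By fusing the rich feature information of shallow feature maps when applying ASPP to deep feature maps, the depth estimation results are improved. Experimental results on the NYU-DepthV2 indoor scene dataset show that the proposed method predicts more accurately at object edges and clearly improves predictions for small objects, with a root mean square error (RMSE) of 0.389 and an accuracy (δ<1.25) of 0.897, verifying the effectiveness of the method.
Keywords: monocular image; depth estimation; convolutional neural network; multi-scale features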
Monocular Vision Based Boundary Avoidance for Non-Invasive Stray Control System for Cattle: A Conceptual Approach
14
Authors: Adeniran Ishola Oluwaranti, Seun Ayeni. Source: Journal of Sensor Technology, 2015, No. 3, pp. 63-71 (9 pages)
Building fences to manage cattle grazing can be very expensive and cost-inefficient. Fences also do not provide dynamic control over the area in which the cattle are grazing. Existing virtual fencing techniques for controlling herds of cattle, based on polygon coordinate definitions of boundaries, are limited in land mass coverage and dynamism. This work seeks to develop a more robust and improved monocular vision-based boundary avoidance scheme for a non-invasive stray control system for cattle, with a view to increasing land mass coverage and dynamism in virtual fencing techniques. The monocular vision-based depth estimation will be modeled using the global Fourier Transform (FT) and local Wavelet Transform (WT) of the image structure of scenes (boundaries). The magnitude of the global Fourier Transform gives the dominant orientations and textural patterns of the image, while the local Wavelet Transform gives the dominant spectral features of the image and their spatial distribution. Each scene image is defined by features v, which contain the set of global (FT) and local (WT) statistics of the image. Distances to scenes or boundaries are obtained by estimating the depth D from the image features v. Sound cues of intensity equivalent to the magnitude of the depth D are applied to the animal's ears as stimuli. This brings about the desired control, as animals tend to move away from uncomfortable sounds.
Keywords: monocular vision; control systems; Global Positioning System; wireless sensor networks; depth estimation
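Monocular depth estimation based on adaptive fusion of multi-scale depth maps (cited by: 1)
15
Authors: 郑游, 王磊, 杨紫文. Source: 武汉工程大学学报 (CAS), 2024, No. 1, pp. 85-90 (6 pages)
Depth estimation networks usually have many layers, and image features lose a large amount of information during encoding and decoding, so the predicted depth maps lack structural detail of objects and have unclear edge contours. This paper proposes a monocular depth estimation method based on adaptive fusion of multi-scale depth maps, which effectively preserves object details and geometric contours. First, a squeeze-and-excitation residual network (SE-ResNet) is introduced, which uses an attention mechanism to encode the features of different channels and thereby retains more detail in the depth maps of distant planes. Then, a multi-scale feature fusion network fuses feature maps of different scales to obtain feature maps rich in geometric features and semantic information. Finally, a multi-scale adaptive depth fusion network adds learnable weight parameters to the depth maps generated from feature maps of different scales and adaptively fuses the depth maps of different scales, increasing the object information in the predicted depth map. On the NYU Depth V2 dataset, the depth maps predicted by the proposed method have higher accuracy and richer object information, with an absolute relative error of 0.115, a root mean square error of 0.525, and an accuracy of up to 99.3%.
Keywords: monocular depth estimation; attention mechanism; multi-scale feature fusion network; multi-scale adaptive depth fusion network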
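The abstract above fuses depth maps from several scales using learnable weights but does not say how the weights are parameterized. One common choice, assumed here purely for illustration, is one scalar logit per scale, normalized with a softmax so that the weights are positive and sum to one. The function name `fuse_depth_maps` is hypothetical, and the depth maps are assumed to have been resized to a common resolution beforehand.

```python
import math


def fuse_depth_maps(depth_maps, logits):
    """Fuse per-scale depth maps with softmax-normalized scalar weights.

    depth_maps: list of S depth maps (lists of rows) at one resolution.
    logits: S scalars; in training these would be learnable parameters.
    Returns (fused_map, weights).
    """
    # Numerically stable softmax over the per-scale logits.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    norm = sum(exps)
    weights = [e / norm for e in exps]

    rows, cols = len(depth_maps[0]), len(depth_maps[0][0])
    fused = [
        [sum(w * d[r][c] for w, d in zip(weights, depth_maps)) for c in range(cols)]
        for r in range(rows)
    ]
    return fused, weights
```

With equal logits the result is a plain average of the scales; as training raises one logit relative to the others, the fused map moves towards that scale's prediction.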
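Self-supervised monocular depth estimation based on semantic assistance and depth temporal consistency constraints
16
Authors: 凌传武, 陈华, 徐大勇, 张小刚. Source: 湖南大学学报(自然科学版) (EI, CAS, CSCD, PKU Core), 2024, No. 8, pp. 1-12 (12 pages)
Self-supervised monocular depth estimation methods trained on image sequences, which use the photometric consistency loss between adjacent frames instead of depth labels as the supervisory signal for network training, have received wide attention in recent years. The photometric consistency constraint relies on the static-world assumption, which moving objects in monocular image sequences violate; this degrades both the accuracy of camera pose estimation and the accuracy of the photometric loss during self-supervised training. Detecting and removing moving-object regions yields a camera pose that is decoupled from object motion and eliminates the influence of those regions on the photometric loss. To this end, this paper proposes a self-supervised monocular depth estimation network based on semantic assistance and depth temporal consistency constraints. First, an offline instance segmentation network detects objects of dynamic categories that may violate the static-world assumption, and the corresponding regions are removed from the input to the pose network to obtain a camera pose decoupled from object motion. Second, based on semantic consistency and photometric consistency constraints, the motion state of dynamic-category objects is detected so that the photometric loss in moving regions does not affect the iterative update of the network parameters. Finally, a depth temporal consistency constraint is imposed on non-moving regions, explicitly aligning the estimated depth of the current frame with the projected depth of adjacent frames to further refine the depth predictions. Experiments on the KITTI, DDAD, and KITTI Odometry datasets verify that the proposed method outperforms previous self-supervised monocular depth estimation methods.
Keywords: monocular depth estimation; self-supervised learning; moving objects; temporal consistency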
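The abstract above excludes moving-object regions from the photometric loss. The sketch below illustrates only that masking step on grayscale images, using a plain L1 photometric error; practical systems often combine L1 with a structural similarity term, and the abstract does not specify the exact loss. The function name `masked_photometric_l1` and the boolean `static_mask` input are assumptions for this example, and the warping of the adjacent frame into the target view is taken as already done.

```python
def masked_photometric_l1(target, warped, static_mask):
    """Mean L1 photometric error over pixels marked as static.

    target: target-frame intensities, as a list of rows of floats.
    warped: adjacent frame warped into the target view, same shape.
    static_mask: same shape; True where the scene is assumed static,
    False where a moving object was detected (those pixels are skipped).
    """
    total, count = 0.0, 0
    for t_row, w_row, m_row in zip(target, warped, static_mask):
        for t, w, keep in zip(t_row, w_row, m_row):
            if keep:
                total += abs(t - w)
                count += 1
    # With no static pixels there is nothing to supervise.
    return total / count if count else 0.0
```

Because masked pixels contribute nothing to the sum, they would also contribute no gradient in a differentiable implementation, which is the effect the abstract describes for moving regions.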
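Power line sag measurement method fusing monocular depth and RTK positioning
17
Authors: 郭嘉琪, 景超, 李雪薇, 王慧民, 张兴忠, 程永强. Source: 电子测量技术 (PKU Core), 2024, No. 2, pp. 89-97 (9 pages)
Existing methods for measuring power line sag are cumbersome to operate and offer little automation, so a power line sag measurement method that fuses monocular depth and RTK positioning is proposed. First, a UAV captures images of key points on the power line, which are fed into the purpose-built monocular depth estimation model EleDep-Net to generate the corresponding depth maps; the model embeds a strip pyramid module and a boundary fusion attention module so that it can accurately capture the contextual semantic information of the conductor. Second, a depth correction algorithm is introduced to further correct the depth values in the depth map, and the depth of each key point is obtained from the corrected values. Finally, the UAV's RTK position is combined with the key-point depth information to generate the spatial coordinates of the key points in a reference coordinate system, from which a parabola formula for the conductor is fitted and the conductor sag is calculated. Tests in a real distribution network environment show that, while keeping the relative measurement error below 5%, the method clearly improves operating efficiency and has high engineering application value.
Keywords: monocular depth estimation; RTK positioning; depth map; parabola model; sag measurement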
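The last step in the abstract above fits a parabola to the key-point coordinates and derives the sag from it. The sketch below shows one way to do that, assuming the key points have already been projected into the vertical plane of the span as (x, y) pairs, with x the horizontal distance along the span and y the height. Sag is taken here as the largest vertical distance between the conductor and the straight chord joining the two support points; for a parabola y = a·x² + b·x + c that distance occurs at mid-span and equals |a|·L²/4, where L is the horizontal span length. The function names are hypothetical, and the paper's own formula may differ.

```python
def fit_parabola(points):
    """Least-squares fit of y = a*x^2 + b*x + c to (x, y) points.

    Needs at least three points with distinct x values. Returns (a, b, c).
    """
    s = [sum(x ** k for x, _ in points) for k in range(5)]      # sums of x^0..x^4
    t = [sum(y * x ** k for x, y in points) for k in range(3)]  # sums of y*x^0..y*x^2
    # Augmented normal equations for the unknowns [a, b, c].
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(3):
            if r != i:
                f = m[r][i] / m[i][i]
                m[r] = [vr - f * vi for vr, vi in zip(m[r], m[i])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def parabola_sag(points, x_left, x_right):
    """Sag of the fitted parabola between supports at x_left and x_right."""
    a, _, _ = fit_parabola(points)
    return abs(a) * (x_right - x_left) ** 2 / 4.0
```

For long spans it may help to subtract the mean x from all points before fitting, since the normal equations become poorly conditioned when x values are large; the quadratic coefficient, and therefore the sag, is unaffected by that shift.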
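Unsupervised monocular depth estimation based on edge enhancement (cited by: 1)
18
Authors: 曲熠, 陈莹. Source: 系统工程与电子技术 (EI, CSCD, PKU Core), 2024, No. 1, pp. 71-79 (9 pages)
To solve the problem of inaccurate depth estimation at edges in unsupervised monocular depth estimation, an unsupervised monocular depth estimation network model based on edge enhancement is proposed. The model consists of a single-view depth network and a pose network, both with an encoder-decoder structure. The encoder of the single-view depth network uses the high-resolution network (HRNet) as its backbone and maintains a high-resolution representation throughout, which helps extract accurate spatial features. The decoder of the single-view depth network introduces strip convolution to refine the depth changes near depth edges and uses the classical Laplacian of Gaussian operator to enhance edge details, making full use of depth edge information to improve the quality of depth estimation. Experimental results on the KITTI dataset show that the proposed model estimates depth well and makes object edges in the depth map clearer and details richer.
Keywords: monocular depth estimation; unsupervised learning; strip convolution; edge enhancement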
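A survey of unsupervised monocular depth estimation
19
Authors: 蔡嘉诚, 董方敏, 孙水发, 汤永恒. Source: 计算机科学 (CSCD, PKU Core), 2024, No. 2, pp. 117-134 (18 pages)
Depth estimation, a key step in fields such as 3D reconstruction, autonomous driving, and visual SLAM, has long been a hot research direction in computer vision. Monocular depth estimation based on unsupervised learning has attracted wide attention from academia and industry because it is easy to deploy and computationally cheap. This survey first reviews the basics and current state of depth estimation research and briefly introduces the strengths and weaknesses of depth estimation based on parametric learning, non-parametric learning, supervised learning, semi-supervised learning, and unsupervised learning. Second, it comprehensively summarizes research progress in unsupervised monocular depth estimation, grouping the methods into five categories (combined with explainability masks, combined with visual odometry, combined with prior auxiliary information, combined with generative adversarial networks, and real-time lightweight networks) and introducing and analyzing typical framework models. It then describes applications of unsupervised monocular depth estimation in medicine, autonomous driving, agriculture, the military, and other fields. Finally, it briefly introduces the datasets commonly used for unsupervised depth estimation, proposes future research directions for unsupervised monocular depth estimation, and gives an outlook on research in each direction of this rapidly developing field.
Keywords: computer vision; deep learning; unsupervised learning; monocular depth estimation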
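Monocular depth estimation with a global-feature-oriented Transformer architecture
20
Authors: 吴冰源, 王永雄. Source: 控制工程 (CSCD, PKU Core), 2024, No. 9, pp. 1619-1625 (7 pages)
To address the problem that insufficient global feature extraction by convolutional neural networks (CNNs) causes depth estimation errors, a global-feature-oriented deep learning network for monocular depth estimation is proposed. The network uses an end-to-end encoder-decoder architecture in which the encoder is a Transformer network with multi-stage outputs that extracts multi-scale global features, and the decoder is built from CNNs. In addition, to suppress the influence of depth-irrelevant detail, a large kernel attention (LKA) module at the end of the decoder improves the extraction of global features. Experimental results on the outdoor scene dataset KITTI and the indoor scene dataset NYU Depth v2 show that the global-feature-oriented network helps generate high-accuracy depth maps with complete detail features. Compared with AdaBins, a recently proposed method that is also based on CNN-Transformer, the proposed network has 42.31% fewer parameters and a root mean square error about 2% lower.
Keywords: monocular depth estimation; Transformer; large kernel attention; global features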