Journal Articles (26 articles found)
1. Monocular Depth Estimation with Sharp Boundary
Authors: Xin Yang, Qingling Chang, Shiting Xu, Xinlin Liu, Yan Cui. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 7, pp. 573-592.
Monocular depth estimation is a basic task in computer vision. Its accuracy has improved tremendously over the past decade with the development of deep learning. However, blurry boundaries in the depth map remain a serious problem. Researchers have found that the blurry boundary is mainly caused by two factors. First, the low-level features containing boundary and structure information may be lost in deep networks during the convolution process. Second, the model ignores the errors introduced by the boundary area during backpropagation, because the boundary occupies only a small portion of the whole image. Focusing on these factors, two countermeasures are proposed to mitigate the boundary blur problem. First, we design a scene understanding module and a scale transform module to build a lightweight fused feature pyramid, which deals with low-level feature loss effectively. Second, we propose a boundary-aware depth loss function that attends to the effects of the boundary's depth values. Extensive experiments show that our method predicts depth maps with clearer boundaries, and its depth accuracy on NYU-Depth V2, SUN RGB-D, and iBims-1 is competitive.
Keywords: monocular depth estimation, object boundary, blurry boundary, scene global information, feature fusion, scale transform, boundary aware.
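The boundary-aware loss idea sketched in this abstract can be illustrated by weighting the per-pixel error with the depth-gradient magnitude of the ground truth, so that edge pixels are not drowned out by the smooth interior. The names and the weighting scheme below are illustrative assumptions, not the paper's exact formulation; pure Python is used for clarity.

```python
def sobel_magnitude(img):
    """Gradient magnitude of a 2D list-of-lists image (zero at the border)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y-1][x+1] + 2*img[y][x+1] + img[y+1][x+1]
                  - img[y-1][x-1] - 2*img[y][x-1] - img[y+1][x-1])
            gy = (img[y+1][x-1] + 2*img[y+1][x] + img[y+1][x+1]
                  - img[y-1][x-1] - 2*img[y-1][x] - img[y-1][x+1])
            mag[y][x] = (gx*gx + gy*gy) ** 0.5
    return mag

def boundary_aware_l1(pred, gt, alpha=4.0):
    """Weighted-mean L1 loss; per-pixel weight 1 + alpha * normalized edge magnitude."""
    mag = sobel_magnitude(gt)
    peak = max(max(row) for row in mag) or 1.0
    num = den = 0.0
    for y in range(len(gt)):
        for x in range(len(gt[0])):
            w = 1.0 + alpha * mag[y][x] / peak
            num += w * abs(pred[y][x] - gt[y][x])
            den += w
    return num / den

# Demo: the same 0.5 m error costs more at a depth discontinuity than
# in the smooth interior, so boundary errors stay visible in the loss.
gt = [[1.0] * 4 + [3.0] * 4 for _ in range(8)]
at_edge = [row[:] for row in gt]; at_edge[4][4] -= 0.5
interior = [row[:] for row in gt]; interior[4][1] -= 0.5
print(boundary_aware_l1(at_edge, gt) > boundary_aware_l1(interior, gt))  # → True
```

Because the weights depend only on the ground truth, the denominator is identical for both predictions, so the comparison isolates the effect of the boundary weight.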
2. A Simple Method for Source Depth Estimation with Multi-path Time Delay in Deep Ocean (cited: 1)
Authors: 杨坤德, 杨秋龙, 郭晓乐, 曹然. Chinese Physics Letters (SCIE, CAS, CSCD), 2016, No. 12, pp. 86-90.
A method of source depth estimation based on the multi-path time delay difference is proposed. When the minimum-time arrivals at all receiver depths are snapped to a common time on the time delay-depth plane, the time-delay arrivals of the surface-bottom and bottom-surface reflections intersect at the source depth. At least two hydrophones deployed vertically with a certain interval are required. If the receiver depths are known, the pair of time delays can be used to estimate the source depth. In simulations and experiments, the proposed method estimates the source depth successfully at moderate ranges in the deep ocean without complicated matched-field calculations.
Keywords: source depth estimation, multi-path time delay, deep ocean.
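The multipath-delay idea can be illustrated with a toy isovelocity image-source model: predicted delay differences (relative to the direct arrival) are matched against observed ones by a grid search over candidate depths. This is a simplified sketch under stated assumptions (flat bottom, constant sound speed, hypothetical function names), not the authors' snapping procedure.

```python
import math

def arrival_times(zs, zr, r, D, c=1500.0):
    """Travel times (s) of the four earliest ray paths in an isovelocity
    channel of depth D m, via image sources (zs, zr: source and receiver
    depths in m, r: horizontal range in m, c: sound speed in m/s)."""
    vertical = {
        "direct": abs(zs - zr),
        "surface": zs + zr,            # reflected once at the surface
        "bottom": 2 * D - zs - zr,     # reflected once at the bottom
        "surf-bot": 2 * D + zs - zr,   # surface then bottom reflection
    }
    return {k: math.hypot(r, dz) / c for k, dz in vertical.items()}

def estimate_depth(delays, zr, r, D, c=1500.0, step=0.5):
    """Grid-search the source depth whose predicted delays relative to
    the direct arrival best match the observed multipath delays."""
    best_err, best_zs = float("inf"), None
    zs = step
    while zs < D:
        t = arrival_times(zs, zr, r, D, c)
        err = sum((t[k] - t["direct"] - delays[k]) ** 2 for k in delays)
        if err < best_err:
            best_err, best_zs = err, zs
        zs += step
    return best_zs

# Simulate a source at 300 m received at 2000 m depth, 5 km range, 4 km bottom.
true_t = arrival_times(300.0, 2000.0, 5000.0, 4000.0)
obs = {k: true_t[k] - true_t["direct"] for k in ("surface", "bottom", "surf-bot")}
print(estimate_depth(obs, 2000.0, 5000.0, 4000.0))  # → 300.0
```

With two receivers at known depths, the same matching can be done per receiver and the estimates intersected, which is the spirit of the delay-depth-plane construction in the abstract.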
3. Depth estimation system suitable for hardware design
Authors: 李贺建, 左一帆, 杨高波, 安平, 王建伟, 滕国伟. Journal of Shanghai University (English Edition) (CAS), 2011, No. 4, pp. 325-330.
Depth estimation has been an active research area with the development of stereo vision in recent years. It is one of the key technologies for handling the large data volume of stereo vision communication. Depth estimation still faces problems such as occlusion, fuzzy edges, and real-time processing. Many algorithms have been proposed in software, but computer configurations limit software processing speed. The alternative is hardware design, and the rapid development of digital signal processors (DSPs), application-specific integrated circuits (ASICs), and field-programmable gate arrays (FPGAs) provides the opportunity for flexible applications. In this work, by analyzing the procedures of depth estimation, we propose algorithms suitable for hardware implementation of real-time depth estimation. Different methods of calibration, matching, and post-processing are analyzed against hardware design requirements. Finally, tests of the algorithms are analyzed. The results show that the proposed algorithms provide credible depth maps for further view synthesis and are suitable for hardware design.
Keywords: 3-D TV (3DTV), depth estimation, hardware design, rank transform, census transform.
4. Boosting Unsupervised Monocular Depth Estimation with Auxiliary Semantic Information
Authors: Hui Ren, Nan Gao, Jia Li. China Communications (SCIE, CSCD), 2021, No. 6, pp. 228-243.
Learning-based multi-task models have been widely used in various scene understanding tasks, where tasks complement each other, i.e., prior semantic information can be considered to better infer depth. We boost unsupervised monocular depth estimation using semantic segmentation as an auxiliary task. To address the lack of cross-domain datasets and the catastrophic forgetting problems encountered in multi-task training, we utilize an existing methodology to obtain redundant segmentation maps to build our cross-domain dataset, which not only provides a new way to conduct multi-task training but also helps us evaluate results against other algorithms. In addition, to comprehensively use the extracted features of the two tasks in the early perception stage, we use a weight-sharing strategy in the network to fuse cross-domain features, and introduce a novel multi-task loss function to further smooth the depth values. Extensive experiments on the KITTI and Cityscapes datasets show that our method achieves state-of-the-art performance in depth estimation, as well as improved semantic segmentation.
Keywords: unsupervised monocular depth estimation, semantic segmentation, multi-task model.
5. Depth Estimation from a Single Image Based on Cauchy Distribution Model
Authors: Ying Ming. Journal of Computer and Communications, 2021, No. 3, pp. 133-142.
Most approaches to estimating a scene's 3D depth from a single image model the point spread function (PSF) as a 2D Gaussian function. However, those methods suffer from noise and find it difficult to achieve high-quality depth recovery. We present a simple yet effective approach to estimate exactly the amount of spatially varying defocus blur at edges, based on a Cauchy distribution model for the PSF. The raw image is re-blurred twice using two known Cauchy distribution kernels, and the defocus blur amount at edges is derived from the gradient ratio between the two re-blurred images. By propagating the blur amount at edge locations to the entire image via matting interpolation, a full depth map is then recovered. Experimental results on several real images demonstrate the feasibility and effectiveness of our method, a non-Gaussian model for the PSF, in providing a better estimation of the defocus map from a single uncalibrated defocused image. The results also show that our method is robust to image noise, inaccurate edge location, and interference from neighboring edges, and that it generates more accurate scene depth maps than most existing methods that use a Gaussian-based PSF model.
Keywords: depth estimation, depth from defocus, defocus blur, Gaussian gradient, Cauchy distribution, point spread function (PSF).
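The gradient-ratio step has a clean closed form for a Cauchy PSF, because the Cauchy distribution is stable: re-blurring an edge of blur scale s with a kernel of scale s1 yields scale s + s1, and the gradient of a Cauchy-blurred step at the edge is 1/(pi*(s + s1)). The sketch below exploits this analytically instead of actually convolving images; names are illustrative, not the paper's code.

```python
import math

def cauchy_edge(x, s, x0=0.0):
    """An ideal step edge blurred by a Cauchy PSF of scale s (its CDF)."""
    return 0.5 + math.atan((x - x0) / s) / math.pi

def edge_gradient(s_total, x0=0.0, h=1e-4):
    """Central-difference gradient of the blurred edge at the edge location."""
    return (cauchy_edge(x0 + h, s_total) - cauchy_edge(x0 - h, s_total)) / (2 * h)

def estimate_blur(g1, g2, s1, s2):
    """Invert the gradient ratio R = g1/g2 = (s + s2) / (s + s1) for s."""
    R = g1 / g2
    return (s2 - R * s1) / (R - 1.0)

s_true, s1, s2 = 2.0, 1.0, 3.0
g1 = edge_gradient(s_true + s1)  # edge re-blurred with the smaller kernel
g2 = edge_gradient(s_true + s2)  # edge re-blurred with the larger kernel
print(round(estimate_blur(g1, g2, s1, s2), 3))  # → 2.0
```

In the full method the recovered per-edge blur amount s is then propagated to all pixels by matting interpolation to form the depth map.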
6. RADepthNet: Reflectance-aware monocular depth estimation
Authors: Chuxuan LI, Ran YI, Saba Ghazanfar ALI, Lizhuang MA, Enhua WU, Jihong WANG, Lijuan MAO, Bin SHENG. Virtual Reality & Intelligent Hardware, 2022, No. 5, pp. 418-431.
Background: Monocular depth estimation aims to predict a dense depth map from a single RGB image, and has important applications in 3D reconstruction, autonomous driving, and augmented reality. However, existing methods feed the original RGB image directly into the model to extract depth features, without avoiding the interference of depth-irrelevant information on depth-estimation accuracy, which leads to inferior performance. Methods: To remove the influence of depth-irrelevant information and improve depth-prediction accuracy, we propose RADepthNet, a novel reflectance-guided network that fuses boundary features. Specifically, our method predicts depth maps in the following three steps. (1) Intrinsic image decomposition: we propose a reflectance extraction module consisting of an encoder-decoder structure to extract the depth-related reflectance; through an ablation study, we demonstrate that the module can reduce the influence of illumination on depth estimation. (2) Boundary detection: a boundary extraction module, consisting of an encoder, a refinement block, and an upsample block, is proposed to better predict depth at object boundaries using gradient constraints. (3) Depth prediction: we use an encoder different from that in (2) to obtain depth features from the reflectance map and fuse boundary features to predict depth. In addition, we propose FIFADataset, a depth-estimation dataset for soccer scenarios. Results: Extensive experiments on a public dataset and our proposed FIFADataset show that our method achieves state-of-the-art performance.
Keywords: monocular depth estimation, deep learning, intrinsic image decomposition.
7. On Robust Cross-view Consistency in Self-supervised Monocular Depth Estimation
Authors: Haimei Zhao, Jing Zhang, Zhuo Chen, Bo Yuan, Dacheng Tao. Machine Intelligence Research (EI, CSCD), 2024, No. 3, pp. 495-513.
Remarkable progress has been made in self-supervised monocular depth estimation (SS-MDE) by exploring cross-view consistency, e.g., photometric consistency and 3D point cloud consistency. However, these are very vulnerable to illumination variance, occlusions, texture-less regions, and moving objects, making them not robust enough to deal with various scenes. To address this challenge, we study two kinds of robust cross-view consistency in this paper. First, the spatial offset field between adjacent frames is obtained by reconstructing the reference frame from its neighbors via deformable alignment, which is used to align the temporal depth features via a depth feature alignment (DFA) loss. Second, the 3D point clouds of each reference frame and its nearby frames are calculated and transformed into voxel space, where the point density in each voxel is calculated and aligned via a voxel density alignment (VDA) loss. In this way, we exploit the temporal coherence in both depth feature space and 3D voxel space for SS-MDE, shifting the "point-to-point" alignment paradigm to a "region-to-region" one. Compared with the photometric consistency loss and the rigid point cloud alignment loss, the proposed DFA and VDA losses are more robust owing to the strong representation power of deep features and the high tolerance of voxel density to the aforementioned challenges. Experimental results on several outdoor benchmarks show that our method outperforms current state-of-the-art techniques. Extensive ablation study and analysis validate the effectiveness of the proposed losses, especially in challenging scenes. The code and models are available at https://github.com/sunnyHelen/RCVC-depth.
Keywords: 3D vision, depth estimation, cross-view consistency, self-supervised learning, monocular perception.
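The "region-to-region" voxel density idea can be sketched as comparing normalized per-voxel point counts between two clouds. This toy version is non-differentiable and uses hypothetical names; the paper's VDA loss operates on soft, differentiable densities inside a network.

```python
from collections import Counter

def voxel_density(points, voxel=1.0):
    """Normalized point count per voxel cell, keyed by integer cell index."""
    c = Counter((int(x // voxel), int(y // voxel), int(z // voxel))
                for x, y, z in points)
    n = sum(c.values())
    return {k: v / n for k, v in c.items()}

def vda_loss(points_a, points_b, voxel=1.0):
    """L1 distance between the two voxel density histograms."""
    da, db = voxel_density(points_a, voxel), voxel_density(points_b, voxel)
    return sum(abs(da.get(k, 0.0) - db.get(k, 0.0)) for k in set(da) | set(db))

# Identical clouds have zero loss; a cloud shifted by two voxels does not,
# yet small jitter inside a voxel would leave the densities unchanged.
cloud = [(0.2, 0.2, 0.2), (0.8, 0.3, 0.1), (2.5, 0.5, 0.5), (2.6, 0.4, 0.7)]
shifted = [(x + 2.0, y, z) for x, y, z in cloud]
print(vda_loss(cloud, cloud))    # → 0.0
print(vda_loss(cloud, shifted))  # → 1.0
```

The tolerance the abstract mentions comes from this binning: points may move within a voxel (noise, small misalignments) without changing the density histogram at all.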
8. DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation (cited: 1)
Authors: Zhenyu Li, Zehui Chen, Xianming Liu, Junjun Jiang. Machine Intelligence Research (EI, CSCD), 2023, No. 6, pp. 837-854.
This paper addresses the problem of supervised monocular depth estimation. We start with a meticulous pilot study to demonstrate that long-range correlation is essential for accurate depth estimation. Moreover, the Transformer and convolution are good at long-range and close-range depth estimation, respectively. Therefore, we propose a parallel encoder architecture consisting of a Transformer branch and a convolution branch. The former models global context with an effective attention mechanism, while the latter preserves local information, as the Transformer lacks the spatial inductive bias needed to model such content. However, independent branches lead to a shortage of connections between features. To bridge this gap, we design a hierarchical aggregation and heterogeneous interaction module to enhance the Transformer features and model the affinity between the heterogeneous features in a set-to-set translation manner. Due to the unbearable memory cost introduced by global attention on high-resolution feature maps, we adopt a deformable scheme to reduce the complexity. Extensive experiments on the KITTI, NYU, and SUN RGB-D datasets demonstrate that our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods by prominent margins. The effectiveness of each proposed module is evaluated through meticulous and intensive ablation studies.
Keywords: autonomous driving, 3D reconstruction, monocular depth estimation, Transformer, convolution.
9. ArthroNet: a monocular depth estimation technique with 3D segmented maps for knee arthroscopy (cited: 1)
Authors: Shahnewaz Ali, Ajay K. Pandey. Intelligent Medicine (CSCD), 2023, No. 2, pp. 129-138.
Background: The lack of depth perception in medical imaging systems is one of the long-standing technological limitations of minimally invasive surgery. The ability to visualize anatomical structures in 3D can improve conventional arthroscopic surgery, as a full 3D semantic representation of the surgical site can directly improve surgeons' abilities. It also opens the possibility of intraoperative image registration with preoperative clinical records for the development of semi-autonomous and fully autonomous platforms. This study aimed to present a novel monocular depth prediction model that infers depth maps from a single-color arthroscopic video frame. Methods: We applied a novel technique that combines supervised and self-supervised loss terms, thus eliminating the drawbacks of each. It enables the estimation of edge-preserving depth maps from a single untextured arthroscopic frame. The proposed image acquisition technique projects artificial textures onto the surface to improve the quality of disparity maps obtained from stereo images. Moreover, by integrating an attention-aware multi-scale feature extraction technique with scene global contextual constraints and multi-scale depth fusion, the model predicts reliable and accurate tissue depth of the surgical site that complies with scene geometry. Results: A total of 4,128 stereo frames from a knee phantom were used to train the network; during the pre-training stage, the network learned disparity maps from the stereo images. The fine-tuning phase used 12,695 knee arthroscopic stereo frames from cadaver experiments along with their corresponding coarse disparity maps obtained from a stereo matching technique. In a supervised fashion, the network learns the transformation from the left image to the disparity map, while the self-supervised loss term refines the coarse depth map by minimizing reprojection, gradient, and structural dissimilarity losses. Together, our method produces high-quality 3D maps with minimal reprojection loss: 0.0004132 (structural similarity index), 0.00036120156 (L1 error distance), and 6.591908×10^(−5) (L1 gradient error distance). Conclusion: Machine learning techniques for monocular depth prediction were studied to infer accurate depth maps from a single-color arthroscopic video frame. Moreover, the study integrates a segmentation model, so 3D segmented maps are inferred, providing extended perception and tissue awareness.
Keywords: monocular depth estimation technique, 3D segmented maps, knee arthroscopy.
10. Self-Supervised Monocular Depth Estimation by Digging into Uncertainty Quantification
Authors: 李远珍, 郑圣杰, 谭梓欣, 曹拓, 罗飞, 肖春霞. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2023, No. 3, pp. 510-525.
Based on well-designed network architectures and objective functions, self-supervised monocular depth estimation has made great progress. However, lacking a specific mechanism to make the network learn more about regions containing moving objects or occlusions, existing depth estimation methods tend to produce poor results there. Therefore, we propose an uncertainty quantification method to improve the performance of existing depth estimation networks without changing their architectures. Our uncertainty quantification method consists of uncertainty measurement, learning guidance by uncertainty, and adaptive final determination. First, with Snapshot and Siam learning strategies, we measure the uncertainty degree by calculating the variance of pre-converged epochs or twins during training. Second, we use the uncertainty to guide the network to strengthen learning in regions with high uncertainty. Finally, we use the uncertainty to adaptively produce the final depth estimation results with a balance of accuracy and robustness. To demonstrate the effectiveness of our uncertainty quantification method, we apply it to two state-of-the-art models, Monodepth2 and Hints. Experimental results show that our method improves depth estimation performance on seven evaluation metrics compared with the two baseline models and exceeds an existing uncertainty method.
Keywords: self-supervised monocular depth estimation, uncertainty quantification, variance.
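The variance-based uncertainty measurement can be sketched in a few lines: treat the depth maps predicted by several pre-converged training snapshots as samples, take their per-pixel variance as the uncertainty, and up-weight the loss where uncertainty is high. Names, the 1D "images", and the weighting formula are illustrative assumptions, not the paper's exact scheme.

```python
def snapshot_uncertainty(snapshots):
    """Per-pixel variance across depth predictions from several training
    snapshots (flattened 1D 'images' here for brevity)."""
    n = len(snapshots)
    out = []
    for px in zip(*snapshots):
        m = sum(px) / n
        out.append(sum((v - m) ** 2 for v in px) / n)
    return out

def uncertainty_weighted_l1(pred, target, unc, beta=1.0):
    """Weight the per-pixel L1 error up where uncertainty is high, so the
    network 'strengthens learning' on unreliable regions."""
    peak = max(unc) or 1.0
    w = [1.0 + beta * u / peak for u in unc]
    return sum(wi * abs(p - t) for wi, p, t in zip(w, pred, target)) / sum(w)

# Three snapshots agree on pixel 0 but disagree on pixel 1.
snaps = [[1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
unc = snapshot_uncertainty(snaps)
print(unc)  # → [0.0, 0.6666666666666666]

# The same 0.5 error costs more at the uncertain pixel than the certain one.
target = [1.0, 3.0]
print(uncertainty_weighted_l1([1.5, 3.0], target, unc)
      < uncertainty_weighted_l1([1.0, 3.5], target, unc))  # → True
```

The final adaptive step in the paper additionally uses the uncertainty to blend predictions; that is omitted here.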
11. Depth Estimation Based on Monocular Camera Sensors in Autonomous Vehicles: A Self-supervised Learning Approach
Authors: Guofa Li, Xingyu Chi, Xingda Qu. Automotive Innovation (EI, CSCD), 2023, No. 2, pp. 268-280.
Estimating depth from images captured by camera sensors is crucial for the advancement of autonomous driving technologies and has gained significant attention in recent years. However, most previous methods rely on stacked pooling or strided convolution to extract high-level features, which can limit network performance and lead to information redundancy. This paper proposes an improved bidirectional feature pyramid module (BiFPN) and a channel attention module (SEblock: squeeze and excitation) to address these issues in existing methods based on a monocular camera sensor. The SEblock redistributes channel feature weights to enhance useful information, while the improved BiFPN facilitates efficient fusion of multi-scale features. The proposed method is an end-to-end solution without any additional post-processing, resulting in efficient depth estimation. Experimental results show that the proposed method is competitive with state-of-the-art algorithms and preserves the fine-grained texture of scene depth.
Keywords: autonomous vehicle, camera sensor, deep learning, depth estimation, self-supervised.
12. Monocular depth estimation based on deep learning: An overview (cited: 15)
Authors: ZHAO ChaoQiang, SUN QiYu, ZHANG ChongZhen, TANG Yang, QIAN Feng. Science China (Technological Sciences) (SCIE, EI, CAS, CSCD), 2020, No. 9, pp. 1612-1627.
Depth information is important for autonomous systems to perceive environments and estimate their own state. Traditional depth estimation methods, like structure from motion and stereo vision matching, are built on feature correspondences across multiple viewpoints, and the predicted depth maps are sparse. Inferring depth information from a single image (monocular depth estimation) is an ill-posed problem. With the rapid development of deep neural networks, monocular depth estimation based on deep learning has been widely studied recently and has achieved promising accuracy; dense depth maps are estimated from single images by deep neural networks in an end-to-end manner. To improve the accuracy of depth estimation, different network frameworks, loss functions, and training strategies have subsequently been proposed. This review therefore surveys current deep learning-based monocular depth estimation methods. We first summarize several widely used datasets and evaluation indicators in deep learning-based depth estimation. We then review representative existing methods according to their training manner: supervised, unsupervised, and semi-supervised. Finally, we discuss the challenges and provide some ideas for future research in monocular depth estimation.
Keywords: autonomous systems, monocular depth estimation, deep learning, unsupervised learning.
13. Semisupervised learning-based depth estimation with semantic inference guidance (cited: 1)
Authors: ZHANG Yan, FAN XiaoPeng, ZHAO DeBin. Science China (Technological Sciences) (SCIE, EI, CAS, CSCD), 2022, No. 5, pp. 1098-1106.
Depth estimation is a fundamental computer vision problem that infers three-dimensional (3D) structure from a given scene. As it is an ill-posed problem, fitting the projection function from the given scene to the 3D structure traditionally requires massive amounts of annotated data. Such pixel-level annotation is quite labor-intensive, especially for reflective surfaces such as mirrors or water, and the widespread application of deep learning further intensifies the demand for large amounts of annotated data. It is therefore urgent and necessary to propose a framework that reduces the required amount of data. In this paper, we propose a novel semisupervised learning framework to infer the 3D structure of a given scene. First, semantic information is employed to make the depth inference more accurate. Second, we cast both depth estimation and semantic segmentation as coarse-to-fine frameworks, so that depth estimation can be gradually guided by semantic segmentation. We compare our model with state-of-the-art methods. The experimental results demonstrate that our method outperforms many supervised learning-based methods, which proves the effectiveness of the proposed method.
Keywords: depth estimation, semisupervised learning, semantic information, neural networks.
14. Geophysical Study: Estimation of Deposit Depth Using Gravimetric Data and Euler Method (Jalalabad Iron Mine, Kerman Province of Iran) (cited: 5)
Authors: Adel Shirazy, Aref Shirazi, Hamed Nazerian, Keyvan Khayer, Ardeshir Hezarkhani. Open Journal of Geology, 2021, No. 8, pp. 340-355.
Mineral exploration is done by different methods, and geophysical and geochemical studies are two powerful tools in this field. In integrated studies, the results of each study are used to determine the locations of drilling boreholes. The purpose of this study is to use field geophysics to calculate the depth of a mineral reserve. The study area, the Jalalabad iron mine, is located 38 km from the city of Zarand. In this study, gravimetric data were measured and the mineral depth was calculated using the Euler method; 1314 readings were taken in the area. The rocks of the region are volcanic and sedimentary, and the source of the mineralization is hydrothermal processes. After the gravity survey in the region, the data were corrected; then various products such as first- and second-level residual anomaly maps, upward continuation, first- and second-degree vertical derivatives, and the analytic signal were computed, and finally the depth of the deposit was estimated by the Euler method. As a result, the depth of the mineral deposit was calculated to be between 20 and 30 meters on average.
Keywords: geophysical study, depth estimation, gravimetric data, Euler method, Jalalabad iron mine.
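Euler deconvolution rests on the homogeneity equation (x − x0)∂g/∂x + (z − z0)∂g/∂z = −N·g, solved in least squares over a window of stations. A minimal 2D-profile sketch over a synthetic point mass (structural index N = 2, zero background) is given below; the synthetic location and derivatives are illustrative assumptions, and field data would use numerical or FFT derivatives rather than analytic ones.

```python
def point_mass_field(x, x0, z0, Gm=1.0):
    """Vertical gravity g and its derivatives (gx, gz) at surface station x
    for a point mass at horizontal position x0 and depth z0 (z positive down)."""
    u, w = x - x0, z0
    R = u * u + w * w
    g = Gm * w * R ** -1.5
    gx = -3.0 * Gm * w * u * R ** -2.5
    gz = -Gm * (R ** -1.5 - 3.0 * w * w * R ** -2.5)
    return g, gx, gz

def euler_depth(xs, fields, N=2.0):
    """Least-squares Euler deconvolution on a 2D profile (background B = 0):
    x0*gx + z0*gz = x*gx + N*g at every station, via 2x2 normal equations."""
    S11 = S12 = S22 = b1 = b2 = 0.0
    for x, (g, gx, gz) in zip(xs, fields):
        rhs = x * gx + N * g
        S11 += gx * gx; S12 += gx * gz; S22 += gz * gz
        b1 += gx * rhs; b2 += gz * rhs
    det = S11 * S22 - S12 * S12
    return (b1 * S22 - b2 * S12) / det, (S11 * b2 - S12 * b1) / det

# Synthetic profile over a compact body at x0 = 40 m, depth 25 m (the
# abstract reports an average depth between 20 and 30 m).
xs = [float(x) for x in range(0, 81, 2)]
fields = [point_mass_field(x, 40.0, 25.0) for x in xs]
x0, z0 = euler_depth(xs, fields)
print(round(x0, 6), round(z0, 6))  # → 40.0 25.0
```

In practice the solution is computed in sliding windows with an estimated background level, and only well-conditioned windows are kept.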
15. Self-supervised coarse-to-fine monocular depth estimation using a lightweight attention module
Authors: Yuanzhen Li, Fei Luo, Chunxia Xiao. Computational Visual Media (SCIE, EI, CSCD), 2022, No. 4, pp. 631-647.
Self-supervised monocular depth estimation has been widely investigated and applied in previous works. However, existing methods suffer from texture copy, depth drift, and incomplete structure. It is difficult for standard CNNs to completely understand the relationship between an object and its surrounding environment, and it is hard to design a depth smoothness loss that balances depth smoothness and sharpness. To address these issues, we propose a coarse-to-fine method with a normalized convolutional block attention module (NCBAM). In the coarse estimation stage, we incorporate the NCBAM into the depth and pose networks to overcome the texture-copy and depth-drift problems. In the refinement stage, a second network refines the coarse depth guided by the color image and produces a structure-preserving depth result. Our method produces results competitive with state-of-the-art methods. Comprehensive experiments prove the effectiveness of our two-stage method using the NCBAM.
Keywords: monocular depth estimation, texture copy, depth drift, attention module.
16. Semi-Global Depth Estimation Algorithm for Mobile 3-D Video Applications
Authors: Qiong Liu, Hongjiang Xiao. Tsinghua Science and Technology (EI, CAS), 2012, No. 2, pp. 128-135.
Three-dimensional (3-D) video applications, such as 3-D cinema, 3DTV, and Free Viewpoint Video (FVV), are attracting more attention both from industry and in the literature. High-accuracy depth video is a fundamental prerequisite for most 3-D applications, but accurate depth requires computationally intensive global optimization. This high computational complexity is one of the bottlenecks to applying depth generation in 3-D applications, especially over mobile networks, since mobile terminals usually have limited computing ability. This paper presents a semi-global depth estimation algorithm based on temporal consistency, in which depth propagation generates initial depth values for the computationally intensive global optimization. The accuracy of the initial depth is improved by detecting and eliminating depth-propagation outliers before the global optimization. Integrating the outlier-free initial values into the global optimization reduces the computational complexity while maintaining depth accuracy. Tests demonstrate that the algorithm reduces total computation time by 54%-65% while the quality of the virtual views remains essentially equivalent to the benchmark.
Keywords: 3-D video, depth estimation, temporal propagation, graph cuts.
17. Depth estimation using an improved stereo network
Authors: Wanpeng XU, Ling ZOU, Lingda WU, Yue QI, Zhaoyong QIAN. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2022, No. 5, pp. 777-789.
Self-supervised depth estimation approaches present excellent results, comparable to those of fully supervised approaches, by employing view synthesis between the target and reference images in the training data. ResNet, which often serves as the backbone network, has structural deficiencies when applied to downstream fields, because its original purpose was classification; low-texture areas also deteriorate performance. To address these problems, we propose a set of improvements that lead to superior predictions. First, we boost the information flow in the network and improve the ability to learn spatial structures by improving the network structure. Second, we use a binary mask to remove pixels in low-texture areas between the target and reference images so as to reconstruct the image more accurately. Finally, we input the target and reference images randomly to expand the dataset and pre-train on ImageNet, so that the model obtains a favorable general feature representation. We demonstrate state-of-the-art performance on the Eigen split of the KITTI driving dataset using stereo pairs.
Keywords: monocular depth estimation, self-supervised, image reconstruction.
18. Self-supervised monocular depth estimation via asymmetric convolution block
Authors: Lingling Hu, Hao Zhang, Zhuping Wang, Chao Huang, Changzhu Zhang. IET Cyber-Systems and Robotics (EI), 2022, No. 2, pp. 131-138.
Without depending on depth ground truth, self-supervised learning is a promising alternative for training monocular depth estimation. It builds its own supervision signal with the help of other tools, such as view synthesis and pose networks, though more training parameters and time consumption may be involved. This paper proposes a monocular depth prediction framework that jointly learns the depth value and the pose transformation between images in an end-to-end manner. The depth network employs an asymmetric convolution block in place of every square kernel layer to strengthen the ability to extract image features during training. At inference time, the asymmetric kernels are fused and converted back to the original network to predict more accurate image depth, thus bringing no extra computation. The network is trained and tested on the KITTI monocular dataset. The evaluation results demonstrate that the depth model outperforms some state-of-the-art (SOTA) approaches and reduces the inference time of depth prediction. Additionally, the proposed model shows strong adaptability on the Make3D dataset.
Keywords: asymmetric convolution block (ACB); KITTI dataset; self-supervised depth estimation
Fusion of color and hallucinated depth features for enhanced multimodal deep learning-based damage segmentation
19
Authors: Tarutal Ghosh Mondal, Mohammad Reza Jahanshahi. Earthquake Engineering and Engineering Vibration, SCIE, EI, CSCD, 2023(1): 55-68, 14 pages
Recent advances in computer vision and deep learning have shown that the fusion of depth information can significantly enhance the performance of RGB-based damage detection and segmentation models. However, alongside the advantages, depth sensing also presents many practical challenges. For instance, depth sensors impose an additional payload burden on robotic inspection platforms, limiting the operation time and increasing the inspection cost. Additionally, some lidar-based depth sensors have poor outdoor performance due to sunlight contamination during the daytime. In this context, this study investigates the feasibility of abolishing depth sensing at test time without compromising the segmentation performance. An autonomous damage segmentation framework is developed based on recent advancements in vision-based multimodal sensing, such as modality hallucination (MH) and monocular depth estimation (MDE), which require depth data only during model training. At the time of deployment, depth data becomes expendable, as it can be simulated from the corresponding RGB frames. This makes it possible to reap the benefits of depth fusion without any depth perception per se. This study explored two different depth encoding techniques and three different fusion strategies in addition to a baseline RGB-based model. The proposed approach is validated on computer-generated RGB-D data of reinforced concrete buildings subjected to seismic damage. It was observed that the surrogate techniques can increase the segmentation IoU by up to 20.1% with a negligible increase in the computation cost. Overall, this study is believed to make a positive contribution to enhancing the resilience of critical civil infrastructure.
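The test-time pipeline described above can be sketched in a few lines. Everything here is an assumption for illustration (function names, input-level fusion, min-max normalization); the paper's actual encoders and fusion strategies differ: any monocular depth estimator stands in for the depth sensor, and the hallucinated depth is stacked as a fourth input channel so a downstream segmentation model sees RGB-D without depth hardware.

```python
import numpy as np

def hallucinate_and_fuse(rgb, mde):
    """Test-time sketch: hallucinate depth from RGB, then fuse at input level.

    `mde` is any monocular depth estimator (a plain callable here).
    The hallucinated depth map is min-max normalized and stacked as a
    fourth channel, giving the segmentation model an RGB-D input with
    no physical depth sensor involved. Illustrative only.
    """
    depth = mde(rgb)                                     # H x W hallucinated depth
    d = (depth - depth.min()) / (np.ptp(depth) + 1e-8)   # normalize to [0, 1]
    return np.concatenate([rgb, d[..., None]], axis=-1)
```

Input-level stacking is only the simplest of the fusion strategies the study compares; feature-level fusion would instead merge encoder activations inside the network.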
Keywords: multimodal data fusion; depth sensing; vision-based inspection; UAV-assisted inspection; damage segmentation; post-disaster reconnaissance; modality hallucination; monocular depth estimation
Enhanced 3D Point Cloud Reconstruction for Light Field Microscopy Using U-Net-Based Convolutional Neural Networks
20
Authors: Shariar Md Imtiaz, Ki-Chul Kwon, F. M. Fahmid Hossain, Md Biddut Hossain, Rupali Kiran Shinde, Sang-Keun Gil, Nam Kim. Computer Systems Science & Engineering, SCIE, EI, 2023(12): 2921-2937, 17 pages
This article describes a novel approach for enhancing three-dimensional (3D) point cloud reconstruction for light field microscopy (LFM) using a U-Net-architecture-based fully convolutional neural network (CNN). Since the directional view of the LFM is limited, noise and artifacts make it difficult to reconstruct the exact shape of 3D point clouds. Existing methods suffer from these problems due to the self-occlusion of the model. This manuscript proposes a deep fusion learning (DL) method that combines a 3D CNN with a U-Net-based model as a feature extractor. The sub-aperture images obtained from the light field microscope are aligned to form a light field data cube for preprocessing. Multi-stream 3D CNNs and a U-Net architecture are applied to obtain the depth features from the directional sub-aperture LF data cube. To enhance the depth map, dual-iteration weighted median filtering (WMF) is used to reduce surface noise and improve the accuracy of the reconstruction. Generating a 3D point cloud involves combining two key elements: the enhanced depth map and the central view of the light field image. The proposed method is validated using the synthesized Heidelberg Collaboratory for Image Processing (HCI) dataset and real-world LFM datasets. The results are compared with different state-of-the-art methods. The structural similarity index (SSIM) gains for boxes, cotton, pillow, and pens are 0.9760, 0.9806, 0.9940, and 0.9907, respectively. Moreover, the discrete entropy (DE) values for LFM depth maps exhibited better performance than other existing methods.
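The weighted-median-filtering step above can be illustrated with a naive single-pass sketch. The weighting scheme here (a Gaussian falloff on guidance-image intensity difference, with a hypothetical `sigma`) is an assumption for illustration, not the paper's exact formulation; the idea is that neighbors resembling the central view's pixel get more say in the median, so depth edges survive while surface noise is suppressed.

```python
import numpy as np

def weighted_median_filter(depth, guide, radius=1, sigma=0.1):
    """Naive weighted median filter for depth-map refinement (illustrative).

    Each output pixel is the weighted median of its neighborhood, where
    a neighbor's weight falls off with its guidance-image (central-view)
    intensity difference from the center pixel.
    """
    h, w = depth.shape
    out = np.empty((h, w), dtype=np.float64)
    for i in range(h):
        for j in range(w):
            i0, i1 = max(i - radius, 0), min(i + radius + 1, h)
            j0, j1 = max(j - radius, 0), min(j + radius + 1, w)
            vals = depth[i0:i1, j0:j1].ravel().astype(np.float64)
            wts = np.exp(-((guide[i0:i1, j0:j1] - guide[i, j]) ** 2)
                         / (2 * sigma ** 2)).ravel()
            # Weighted median: first value where cumulative weight
            # reaches half the total weight.
            order = np.argsort(vals)
            cum = np.cumsum(wts[order])
            out[i, j] = vals[order][np.searchsorted(cum, cum[-1] / 2)]
    return out
```

Unlike a plain mean filter, an isolated depth spike is discarded entirely rather than smeared into its neighbors, which is why median-family filters suit depth-map cleanup. The "dual iteration" in the paper would simply apply such a pass twice.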
Keywords: 3D reconstruction; 3D modeling; point cloud; depth estimation; integral imaging; light field microscopy; 3D-CNN; U-Net; deep learning; machine intelligence