Depth estimation is an important task in computer vision. Collecting data at scale for monocular depth estimation is challenging, as this task requires simultaneously capturing RGB images and depth information. Therefore, data augmentation is crucial for this task. Existing data augmentation methods often employ pixel-wise transformations, which may inadvertently disrupt edge features. In this paper, we propose a data augmentation method for monocular depth estimation, which we refer to as the Perpendicular-Cutdepth method. This method involves cutting real-world depth maps along perpendicular directions and pasting them onto input images, thereby diversifying the data without compromising edge features. To validate the effectiveness of the algorithm, we compared it against current mainstream data augmentation algorithms using an existing convolutional neural network (CNN). Additionally, to verify the algorithm's applicability to Transformer networks, we designed a Transformer-based encoder-decoder network structure to assess the generalization of our proposed algorithm. Experimental results demonstrate that, in the field of monocular depth estimation, our proposed method, Perpendicular-Cutdepth, outperforms traditional data augmentation methods. On the indoor dataset NYU, our method increases accuracy from 0.900 to 0.907 and reduces the error rate from 0.357 to 0.351. On the outdoor dataset KITTI, our method improves accuracy from 0.9638 to 0.9642 and decreases the error rate from 0.060 to 0.0598.
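The cut-and-paste step described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the strip geometry or any hyperparameters, so everything specific here is an assumption made for illustration: the function name `perpendicular_cutdepth`, the choice of one full-height vertical strip plus one full-width horizontal strip, the min-max normalization of the depth map, and the parameters `max_frac` and `p`. It should not be read as the authors' implementation.

```python
import numpy as np

def perpendicular_cutdepth(image, depth, rng, max_frac=0.5, p=0.5):
    """Paste one vertical and one horizontal strip of `depth` onto `image`.

    image: float array of shape (H, W, 3), nominally in [0, 1].
    depth: float array of shape (H, W), the ground-truth depth map.
    Strip placement, `max_frac`, and `p` are illustrative assumptions.
    """
    out = image.copy()
    if rng.random() >= p:
        return out  # sample left unaugmented
    h, w = depth.shape

    # Scale depth to [0, 1] so it is comparable with image intensities,
    # then replicate it across the three colour channels.
    d_min, d_max = depth.min(), depth.max()
    norm = (depth - d_min) / max(d_max - d_min, 1e-8)
    depth_rgb = np.repeat(norm[:, :, None], 3, axis=2)

    # Vertical strip: full height, random width and horizontal offset.
    sw = int(rng.integers(1, max(1, int(w * max_frac)) + 1))
    x0 = int(rng.integers(0, w - sw + 1))
    out[:, x0:x0 + sw] = depth_rgb[:, x0:x0 + sw]

    # Horizontal strip: full width, random height and vertical offset.
    sh = int(rng.integers(1, max(1, int(h * max_frac)) + 1))
    y0 = int(rng.integers(0, h - sh + 1))
    out[y0:y0 + sh, :] = depth_rgb[y0:y0 + sh, :]
    return out
```

In a training pipeline, a function like this would be applied to each (image, depth) pair before batching, with the depth target itself left unchanged; whether the published method normalizes or positions the strips this way is not stated in the abstract.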
Funding: the Grant of Program for Scientific Research Innovation Team in Colleges and Universities of Anhui Province (2022AH010095); the Grant of Scientific Research and Talent Development Foundation of the Hefei University (No. 21-22RC15); the Key Research Plan of Anhui Province (No. 2022k07020011); the Grant of Anhui Provincial Natural Science Foundation (No. 2308085MF213); the Open Fund of Information Materials and Intelligent Sensing Laboratory of Anhui Province (IMIS202205); as well as the AI General Computing Platform of Hefei University.
Source: CMC, 2024, vol. 79, no. 1 (p. 940).