Funding: Supported by the Special Funds for Creative Research (Grant No. 2022C61540) and the National Natural Science Foundation of China (NSFC, Grant Nos. 61972012 and 61732016).
Abstract: Intrinsic image decomposition is an important and long-standing computer vision problem. Given an input image, recovering the physical scene properties is ill-posed. Several physically motivated priors have been used to restrict the solution space of the optimization problem for intrinsic image decomposition. This work takes advantage of deep learning and shows that it can solve this challenging computer vision problem with high efficiency. The focus lies in the feature encoding phase, which extracts discriminative features for different intrinsic layers from an input image. To achieve this goal, we explore the distinctive characteristics of different intrinsic components in the high-dimensional feature embedding space. We define a feature distribution divergence to efficiently separate the feature vectors of different intrinsic components. The feature distributions are also constrained to fit the real ones through a feature distribution consistency. In addition, a data refinement approach is provided to remove data inconsistency from the Sintel dataset, making it more suitable for intrinsic image decomposition. Our method is also extended to intrinsic video decomposition based on pixel-wise correspondences between adjacent frames. Experimental results indicate that our proposed network structure can outperform the existing state of the art.
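The ill-posedness mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common multiplicative image formation model (image = reflectance × shading), which the abstract itself does not state; the function name `compose` and the array sizes are hypothetical. The sketch shows that two different reflectance/shading pairs reproduce the same image, which is why priors or learned constraints are needed.

```python
import numpy as np

def compose(reflectance, shading):
    # Assumed Lambertian-style model: per-pixel product of the two layers.
    return reflectance * shading

rng = np.random.default_rng(0)
reflectance = rng.uniform(0.2, 0.9, size=(4, 4, 3))  # RGB reflectance layer
shading = rng.uniform(0.1, 1.0, size=(4, 4, 1))      # grayscale shading layer
image = compose(reflectance, shading)

# Rescaling one layer and inversely rescaling the other leaves the image
# unchanged, so the image alone cannot determine the decomposition.
scale = 0.5
alt_reflectance = reflectance * scale
alt_shading = shading / scale
alt_image = compose(alt_reflectance, alt_shading)

same_image = np.allclose(image, alt_image)
same_layers = np.allclose(reflectance, alt_reflectance)
print(same_image, same_layers)
```

The scale ambiguity shown here is only the simplest case; per-pixel ambiguities exist as well under the same assumed model.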
Funding: Supported by the National Natural Science Foundation of China under Grants 61872241, 62077037 and 62077037, and by the Shanghai Municipal Science and Technology Major Project under Grant 2021SHZDZX0102.
Abstract: Background: Monocular depth estimation aims to predict a dense depth map from a single RGB image and has important applications in 3D reconstruction, autonomous driving, and augmented reality. However, existing methods feed the original RGB image directly into the model to extract depth features and do not suppress depth-irrelevant information, which interferes with depth estimation and leads to inferior performance. Methods: To remove the influence of depth-irrelevant information and improve depth-prediction accuracy, we propose RADepthNet, a novel reflectance-guided network that fuses boundary features. Specifically, our method predicts depth maps in the following three steps: (1) Intrinsic Image Decomposition. We propose a reflectance extraction module with an encoder-decoder structure to extract the depth-related reflectance. Through an ablation study, we demonstrate that the module can reduce the influence of illumination on depth estimation. (2) Boundary Detection. A boundary extraction module, consisting of an encoder, a refinement block, and an upsample block, is proposed to better predict the depth at object boundaries using gradient constraints. (3) Depth Prediction Module. We use an encoder different from that in (2) to obtain depth features from the reflectance map and fuse them with boundary features to predict depth. In addition, we propose FIFADataset, a depth-estimation dataset for soccer scenarios. Results: Extensive experiments on a public dataset and our proposed FIFADataset show that our method achieves state-of-the-art performance.
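The abstract does not specify the form of its gradient constraints. As a rough illustration only, the sketch below uses a generic L1 loss on finite-difference depth gradients, a common way to penalize blurred object boundaries; it is not RADepthNet's actual loss, and the name `gradient_l1_loss` is hypothetical.

```python
import numpy as np

def gradient_l1_loss(pred_depth, gt_depth):
    # np.gradient on a 2D array returns derivatives along rows, then columns.
    pred_dy, pred_dx = np.gradient(pred_depth)
    gt_dy, gt_dx = np.gradient(gt_depth)
    # Mean absolute difference between predicted and reference gradients.
    return float(np.mean(np.abs(pred_dx - gt_dx) + np.abs(pred_dy - gt_dy)))

# Reference depth with a sharp vertical boundary in the middle.
gt = np.zeros((6, 6))
gt[:, 3:] = 1.0

sharp_pred = gt.copy()                                # preserves the edge
blurred_pred = np.tile(np.linspace(0.0, 1.0, 6), (6, 1))  # smears the edge

loss_sharp = gradient_l1_loss(sharp_pred, gt)
loss_blurred = gradient_l1_loss(blurred_pred, gt)
print(loss_sharp, loss_blurred)
```

A prediction that smooths the boundary into a ramp receives a larger penalty than one that keeps the edge, which is the behavior a boundary-aware constraint is meant to encourage.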
Funding: Supported by NSFC (No. 61972298) and the Bingtuan Science and Technology Program (No. 2019BC008).
Abstract: We propose a novel interactive lighting editing system for relighting a single indoor RGB image based on spherical harmonic lighting. It allows users to intuitively edit illumination and relight a complicated low-light indoor scene. Our method not only achieves plausible global relighting but also enhances the local details of the complicated scene according to spatially varying spherical harmonic lighting, and it requires only a single RGB image along with a corresponding depth map. To this end, we first present a joint optimization algorithm, based on geometric optimization of the depth map and intrinsic image decomposition that avoids texture copy, to refine the depth map and obtain the shading map. We then propose a lighting estimation method based on spherical harmonic lighting, which not only estimates the global illumination of the scene but also further enhances its local details. Finally, we use a simple and intuitive interactive method to edit the environment lighting map in order to adjust the lighting and relight the scene. Through extensive experiments, we demonstrate that our approach is simple and intuitive for relighting low-light indoor scenes and achieves state-of-the-art results.
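Spherical harmonic lighting, as used in this abstract, can be sketched as a linear combination of basis functions evaluated at each surface normal. The sketch below assumes a second-order expansion with nine coefficients, which the abstract does not confirm; the function names `sh_basis` and `shade` are hypothetical, and the normals would in practice be derived from the depth map.

```python
import numpy as np

def sh_basis(normals):
    # Real spherical harmonic basis up to order 2 for unit normals (..., 3).
    x, y, z = normals[..., 0], normals[..., 1], normals[..., 2]
    return np.stack([
        0.282095 * np.ones_like(x),
        0.488603 * y,
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ], axis=-1)

def shade(normals, sh_coeffs):
    # Shading is the dot product of the basis with the lighting coefficients.
    return sh_basis(normals) @ sh_coeffs

# Two normals: one facing +z, one facing -z.
normals = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]])

ambient = np.zeros(9)
ambient[0] = 1.0              # constant term only
directional = ambient.copy()
directional[2] = 0.5          # add light arriving from the +z direction

ambient_shading = shade(normals, ambient)
directional_shading = shade(normals, directional)
print(ambient_shading, directional_shading)
```

Editing the coefficient vector is the kind of operation an interactive relighting tool could expose; a spatially varying variant would use a different coefficient vector per pixel or region.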