Funding: Supported by the National Natural Science Foundation of China (Grant No. 51935003).
Abstract: In recent years, addressing ill-posed problems by leveraging prior knowledge contained in databases through learning techniques has gained much attention. In this paper, we focus on complete three-dimensional (3D) point cloud reconstruction based on a single red-green-blue (RGB) image, a task that cannot be approached using classical reconstruction techniques. For this purpose, we used an encoder-decoder framework to encode the RGB information in latent space and to predict the 3D structure of the considered object from different viewpoints. The individual predictions are combined to yield a common representation that is used in a module combining camera pose estimation and rendering, thereby achieving differentiability with respect to the imaging process and the camera pose, and enabling optimization of the two-dimensional prediction error of novel viewpoints. Thus, our method allows end-to-end training and does not require supervision based on additional ground-truth (GT) mask annotations or ground-truth camera pose annotations. Our evaluation on synthetic and real-world data demonstrates the robustness of our approach to appearance changes and self-occlusions, and shows that it outperforms current state-of-the-art methods in terms of accuracy, density, and model completeness.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 62172220.
Abstract: Highly scattering media, such as milk, skin, and clouds, are common in the real world. Rendering participating media is challenging, especially for media dominated by high-order scattering, because the light may undergo a large number of scattering events before leaving the surface. Monte Carlo-based methods typically require a long time to produce noise-free results. Based on the observation that renderings of low-albedo media contain less noise than those of high-albedo media, we propose reducing the variance of the rendered results using differentiable regularization. We first render an image with low-albedo participating media together with the gradient with respect to the albedo, and then predict the final rendered image from the low-albedo image and the gradient image via a novel prediction function. To achieve high quality, we also consider the gradients of neighboring frames to provide a noise-free gradient image. Ultimately, our method can produce results with much less overall error than equal-time path tracing methods.
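The prediction step described in this abstract can be illustrated with a small sketch. The abstract does not give the prediction function, so the code below substitutes a plain first-order Taylor extrapolation in albedo as a stand-in; the names `extrapolate_pixel` and `extrapolate_image` and all numeric values are hypothetical.

```python
def extrapolate_pixel(low_value, gradient, low_albedo, target_albedo):
    """First-order Taylor estimate of a pixel value at target_albedo.

    Stand-in only: the abstract's actual prediction function is not specified.
    """
    return low_value + gradient * (target_albedo - low_albedo)


def extrapolate_image(low_image, gradient_image, low_albedo, target_albedo):
    """Apply the per-pixel estimate to two equally sized 2D lists."""
    return [
        [extrapolate_pixel(v, g, low_albedo, target_albedo)
         for v, g in zip(row_v, row_g)]
        for row_v, row_g in zip(low_image, gradient_image)
    ]


# Hypothetical 2x2 image rendered at albedo 0.5, with its albedo gradient.
low_image = [[0.25, 0.5], [0.75, 1.0]]
gradient_image = [[1.0, 2.0], [0.0, 0.5]]
predicted = extrapolate_image(low_image, gradient_image, 0.5, 0.75)
```

A linear extrapolation like this is likely to be inaccurate when radiance depends strongly nonlinearly on albedo, which is presumably one reason the authors design a dedicated prediction function.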
Funding: Sponsored by grants from the National Natural Science Foundation of China (62002010, 61872347), the CAMS Innovation Fund for Medical Sciences (2019-I2M5-016), and the Special Plan for the Development of Distinguished Young Scientists of ISCAS (Y8RC535018).
Abstract: Learning and inferring the underlying motion patterns of captured 2D scenes and then re-creating a dynamic evolution consistent with real-world natural phenomena have high appeal for graphics and animation. To bridge the technical gap between virtual and real environments, we focus on the inverse modeling and reconstruction of visually consistent and property-verifiable oceans, taking advantage of deep learning and differentiable physics to learn geometry and constitute waves in a self-supervised manner. First, we infer hierarchical geometry using two networks, which are optimized via the differentiable renderer. We extract wave components from the sequence of inferred geometry through a network equipped with a differentiable ocean model. Then, ocean dynamics can be evolved using the reconstructed wave components. Through extensive experiments, we verify that our new method yields satisfactory results for both geometry reconstruction and wave estimation. Moreover, the new framework has the inverse modeling potential to facilitate a host of graphics applications, such as the rapid production of physically accurate scene animation and editing guided by real ocean scenes.
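As a rough illustration of evolving ocean dynamics from wave components, the sketch below sums sinusoidal components into a 1D height field. It is not the authors' ocean model, which the abstract does not specify: the deep-water dispersion relation, the `(amplitude, wavenumber, phase)` parameterization, and the component values are all assumptions made for the example.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def height(x, t, components):
    """Surface height at position x (m) and time t (s).

    components: iterable of (amplitude, wavenumber, phase) tuples.
    Assumes the deep-water dispersion relation omega = sqrt(G * k).
    """
    total = 0.0
    for amplitude, wavenumber, phase in components:
        omega = math.sqrt(G * wavenumber)
        total += amplitude * math.cos(wavenumber * x - omega * t + phase)
    return total


# Hypothetical wave components; in the paper these would be reconstructed.
components = [(1.0, 0.5, 0.0), (0.25, 2.0, 0.0)]
profile_t0 = [height(x, 0.0, components) for x in (0.0, 1.0, 2.0)]
profile_t1 = [height(x, 1.0, components) for x in (0.0, 1.0, 2.0)]
```

Once the components are known, the surface at any later time follows by re-evaluating `height` with a new `t`, which is the sense in which the dynamics are evolved from the components.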
Funding: The authors would like to acknowledge the support from NSFC (No. 62172364).
Abstract: Reconstructing 3D digital models of humans from sensory data is a long-standing problem in computer vision and graphics, with a variety of applications such as VR/AR, film production, and human–computer interaction. While a huge amount of effort has been devoted to developing various capture hardware and reconstruction algorithms, traditional reconstruction pipelines may still suffer from high-cost capture systems and tedious capture processes, which prevent them from being easily accessible. Moreover, the dedicated hand-crafted pipelines are prone to reconstruction artifacts, resulting in limited visual quality. To solve these challenges, the recent trend in this area is to use deep neural networks to improve reconstruction efficiency and robustness by learning human priors from existing data. Neural network-based implicit functions have also been shown to be a favorable 3D representation compared to traditional forms like meshes and voxels. Furthermore, neural rendering has emerged as a powerful tool to achieve highly photorealistic modeling and re-rendering of humans by optimizing the visual quality of output images end to end. In this article, we will briefly review these advances in this fast-developing field, discuss the advantages and limitations of different approaches, and finally, share some thoughts on future research directions.
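To make the idea of an implicit 3D representation concrete, the sketch below defines a shape by a function that can be queried at any point, rather than by explicit vertices or voxels. An analytic sphere stands in for the neural network that the surveyed methods would learn; the names `sphere_sdf` and `occupancy` are hypothetical.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, zero on the surface."""
    return math.dist(point, center) - radius


def occupancy(point, sdf=sphere_sdf):
    """Binary inside/outside query derived from the signed distance."""
    return sdf(point) <= 0.0


inside = occupancy((0.0, 0.0, 0.5))
outside = occupancy((0.0, 0.0, 2.0))
surface_distance = sphere_sdf((3.0, 4.0, 0.0), radius=5.0)
```

Because the surface is the zero level set of a continuous function, it can be sampled at arbitrary resolution, which is one commonly cited advantage over fixed-resolution voxel grids.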