Funding: National Key R&D Program of China, Grant/Award Number: 2022YFF0901902.
Abstract: The aim is to reconstruct a complete and detailed clothed human from a single-view input. An implicit function is suitable for this task because it represents fine shape details and varied topology. Current methods, however, often suffer from artefacts such as broken or disembodied body parts, missing details, or depth ambiguity due to the ambiguity and complexity of human articulation. The main issue observed by the authors is that these methods are structure-agnostic. To address these problems, the authors fully utilise the skinned multi-person linear (SMPL) model and propose a method using the Skeleton-aware Implicit Function (SIF). To alleviate broken or disembodied body parts, the proposed skeleton-aware structure prior builds skeleton awareness into the implicit function; it consists of a bone-guided sampling strategy and a skeleton-relative encoding strategy. To deal with missing details and depth ambiguity, the authors' body-guided pixel-aligned feature exploits SMPL to enhance 2D normal and depth semantic features, and the proposed feature aggregation uses an extra geometry-aware prior to enable more plausible merging with less noisy geometry. Additionally, SIF is adapted to RGB-D input, and experimental results show that SIF outperforms state-of-the-art methods on challenging datasets from Twindom and Thuman3.0.
Funding: The authors would like to acknowledge the support from NSFC (No. 62172364).
Abstract: Reconstructing 3D digital models of humans from sensory data is a long-standing problem in computer vision and graphics, with applications in VR/AR, film production, and human–computer interaction, among others. While a huge amount of effort has been devoted to developing capture hardware and reconstruction algorithms, traditional reconstruction pipelines may still suffer from high-cost capture systems and tedious capture processes, which prevent them from being easily accessible. Moreover, the hand-crafted pipelines are prone to reconstruction artifacts, resulting in limited visual quality. To address these challenges, the recent trend in this area is to use deep neural networks to improve reconstruction efficiency and robustness by learning human priors from existing data. Neural network-based implicit functions have also been shown to be a favorable 3D representation compared to traditional forms such as meshes and voxels. Furthermore, neural rendering has emerged as a powerful tool for highly photorealistic modeling and re-rendering of humans by optimizing the visual quality of output images end to end. In this article, we briefly review these advances in this fast-developing field, discuss the advantages and limitations of different approaches, and finally share some thoughts on future research directions.