Abstract
Recovering human pose from RGB images and videos has drawn increasing attention in recent years owing to its minimal sensor requirements and its applicability in diverse fields such as human-computer interaction, robotics, video analytics, and augmented reality. Although a large body of work has been devoted to this field, 3D human pose estimation from monocular images or videos remains very challenging because of difficulties such as depth ambiguity, occlusion, background clutter, and a lack of training data. In this survey, we summarize recent advances in monocular 3D human pose estimation. We provide a general taxonomy that covers existing approaches and analyze their capabilities and limitations. We also summarize widely used datasets and metrics, and provide a quantitative comparison of some representative methods. Finally, we conclude with a discussion of practical challenges and open problems for future research.
Funding
National Natural Science Foundation of China (61806176)
Fundamental Research Funds for the Central Universities (2019QNA5022)