Abstract: Recovering a three-dimensional (3D) human pose sequence from an arbitrary view is very difficult, owing to the loss of depth information and to self-occlusion. In this paper, a view-independent 3D key-pose set is selected from 3D action samples in order to represent and recognize the same actions from a single camera or a few cameras, without any restriction on the relative orientation between cameras and subjects. First, the 3D key-pose set is selected from the 3D human joint sequences of 3D training action samples built from multiple viewpoints. Second, the 3D key-pose sequence that best matches the observation sequence is selected from the 3D key-pose set to represent the observation sequence of an arbitrary view. The 3D key-pose sequence contains many discriminative, view-independent key poses but cannot accurately describe the pose of every frame in the observation sequence. For this reason, the pose and the dynamics of an action are modeled separately in this paper. Exemplar-based embedding and the probability of unique key poses are applied to model the pose property. A complementary dynamic feature is extracted to model actions that share the same poses but differ in their dynamics. Finally, these action models are fused to recognize the observation sequence from a single camera or a few cameras. The effectiveness of the proposed approach is demonstrated by experiments on the IXMAS dataset.
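The two core steps described above — selecting a compact key-pose set from 3D joint sequences, and representing an observed sequence by its best-matching key poses — can be sketched as follows. The abstract does not specify the selection criterion, so this sketch assumes a simple k-means clustering over flattened 3D joint coordinates and a nearest-key-pose assignment per frame; the function names (`select_key_poses`, `match_sequence`) are illustrative, not from the paper.

```python
import numpy as np

def select_key_poses(pose_frames, n_key_poses, n_iters=50, seed=0):
    """Select a compact key-pose set from 3D joint frames.

    Assumption: key poses are taken as k-means centers over flattened
    (n_joints * 3)-dimensional joint vectors; the paper's actual
    selection criterion may differ.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(pose_frames, dtype=float)              # (n_frames, n_joints*3)
    # Initialize centers from randomly chosen training frames.
    centers = X[rng.choice(len(X), n_key_poses, replace=False)].copy()
    for _ in range(n_iters):
        # Assign every frame to its nearest key-pose candidate.
        dists = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned frames.
        for k in range(n_key_poses):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return centers

def match_sequence(observation, key_poses):
    """Represent an observed joint sequence by the index of the
    nearest key pose for each frame."""
    obs = np.asarray(observation, dtype=float)
    dists = np.linalg.norm(obs[:, None] - key_poses[None], axis=2)
    return dists.argmin(axis=1)
```

For example, given training frames drawn from two well-separated poses, `select_key_poses(X, 2)` recovers one key pose per cluster, and `match_sequence` then maps each observed frame to the index of its closest key pose, yielding the key-pose sequence that the paper's recognition stage operates on.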