Abstract: Previous multi-view 3D human pose estimation methods neither correlate different human joints within each view nor explicitly model learnable correlations between the same joints in different views, so skeleton structure information goes unused and multi-view pose information is not completely fused. Moreover, existing graph convolution operations do not account for the specificity of different joints and different views when processing skeleton graphs, so the correlation weights between a node and its neighborhood nodes are shared. Existing Graph Convolutional Networks (GCNs) also cannot efficiently extract global, deep-level skeleton structure information and view correlations. To solve these problems, pre-estimated multi-view 2D poses are organized into a multi-view skeleton graph that explicitly fuses skeleton priors and view correlations to handle occlusion: skeleton-edges and symmetry-edges represent the structural correlations between adjacent joints within each view of the skeleton graph, and view-edges represent the view correlations between the same joints in different views. So that the graph convolution operation can mine detailed and sufficient skeleton structure information and view correlations, different correlation weights are assigned to different categories of neighborhood nodes and are further assigned to each node in the graph. Based on this graph convolution operation, a Residual Graph Convolution (RGC) module is designed as the basic module and combined with a simplified Hourglass architecture to construct Hourglass-GCN, our 3D pose estimation network. With its symmetrical and concise architecture, Hourglass-GCN processes three scales of multi-view skeleton graphs to efficiently extract skeleton features from local to global scale and from shallow to deep level. Experimental results on the large-scale 3D pose datasets Human3.6M and MPI-INF-3DHP show that Hourglass-GCN outperforms several competitive methods in 3D pose estimation accuracy.
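The graph construction and the category-specific, node-specific weighting described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The five-joint toy skeleton, the choice to connect every pair of views with view-edges, the row normalisation, the per-node scalar `node_scale`, and the single-convolution form of `residual_block` are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np

# Hypothetical 5-joint toy skeleton; NOT the Human3.6M joint layout.
BONES = [(0, 1), (0, 2), (1, 3), (2, 4)]   # skeleton-edges (adjacent joints)
MIRRORS = [(1, 2), (3, 4)]                 # symmetry-edges (left/right pairs)
NUM_JOINTS = 5


def build_adjacency(num_views):
    """Return one symmetric 0/1 adjacency matrix per edge category."""
    n = num_views * NUM_JOINTS
    adj = {k: np.zeros((n, n)) for k in ("self", "skeleton", "symmetry", "view")}
    adj["self"] = np.eye(n)
    for v in range(num_views):
        off = v * NUM_JOINTS  # index offset of view v's joints
        for name, pairs in (("skeleton", BONES), ("symmetry", MIRRORS)):
            for a, b in pairs:
                adj[name][off + a, off + b] = adj[name][off + b, off + a] = 1.0
    # View-edges: same joint in two different views (assumed: all view pairs).
    for j in range(NUM_JOINTS):
        for v1 in range(num_views):
            for v2 in range(v1 + 1, num_views):
                a, b = v1 * NUM_JOINTS + j, v2 * NUM_JOINTS + j
                adj["view"][a, b] = adj["view"][b, a] = 1.0
    return adj


def graph_conv(x, adj, weights, node_scale):
    """Sum over categories c of diag(node_scale[c]) @ A_c_norm @ x @ weights[c]."""
    out = 0.0
    for name, a in adj.items():
        deg = a.sum(axis=1, keepdims=True)
        a_norm = a / np.maximum(deg, 1.0)  # row-normalise; empty rows stay zero
        # weights[name] is shared per category; node_scale[name] is per node.
        out = out + node_scale[name][:, None] * (a_norm @ x @ weights[name])
    return out


def residual_block(x, adj, weights, node_scale):
    """Assumed residual form: ReLU(graph_conv(x)) + x (needs C_in == C_out)."""
    return np.maximum(graph_conv(x, adj, weights, node_scale), 0.0) + x


rng = np.random.default_rng(0)
adj = build_adjacency(num_views=4)
n, c = 4 * NUM_JOINTS, 8
x = rng.normal(size=(n, c))                      # stand-in node features
weights = {k: rng.normal(scale=0.1, size=(c, c)) for k in adj}
node_scale = {k: np.ones(n) for k in adj}        # learnable in a real model
y = residual_block(x, adj, weights, node_scale)
print(y.shape)  # (20, 8)
```

Keeping one adjacency matrix per edge category is what lets each category carry its own weight matrix, and the per-node scale is one simple way to stop all nodes from sharing identical correlation weights.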
Funding: Supported in part by the National Natural Science Foundation of China under Grants 61973065, U20A20197, and 61973063.