Funding: This work was supported in part by the 2023 Key Supported Project of the 14th Five-Year Plan for Education and Science in Hunan Province (No. ND230795).
Abstract: In recent years, skeleton-based action recognition has made great progress in computer vision. Graph convolutional networks (GCNs), which model the human skeleton as a spatio-temporal graph, are effective for this task. Most GCNs define the graph topology by the physical connections between human joints. However, such a predefined graph ignores both the spatial relationships between non-adjacent joint pairs in particular actions and the behavior dependence between joint pairs, resulting in a low recognition rate for actions that involve implicit correlations between joints. In addition, existing methods ignore the trend correlation between adjacent frames within an action as well as context clues, so actions with similar poses are misrecognized. This study therefore proposes a learnable GCN based on behavior dependence, which captures implicit joint correlations by constructing a dynamic learnable graph from the extracted behavior dependence of specific joint pairs. An adaptive model is built from the weight relationships between joint pairs. The method also includes a self-attention module that obtains the inter-frame topological relationships of the joints in order to explore the context of actions. By combining the shared topology with the multi-head self-attention map, the module obtains a context-based clue topology that updates the dynamic graph convolution, allowing different actions with similar poses to be recognized accurately. Detailed experiments on public datasets show that the proposed method achieves better results than state-of-the-art methods and yields higher-quality action representations under various evaluation protocols.
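The abstract gives no equations, so the following is only a rough sketch of the general idea it describes: a fixed physical skeleton graph is combined with a learnable shared topology and a multi-head self-attention map before the graph convolution. Every name here (`attention_topology`, `dynamic_graph_conv`, `a_learned`, `w_q`, `w_k`, `w_out`), the temporal average pooling, the averaging over heads, and the plain additive fusion are assumptions made for illustration; none of them is taken from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def attention_topology(x, w_q, w_k):
    """Multi-head self-attention map between joints.

    x:   (T, V, C) skeleton features (frames, joints, channels)
    w_q: (H, C, D) query projections, one per head (hypothetical)
    w_k: (H, C, D) key projections, one per head (hypothetical)
    Returns an (H, V, V) array whose rows each sum to 1.
    """
    pooled = x.mean(axis=0)  # (V, C); pooling over time is an assumption
    q = np.einsum("vc,hcd->hvd", pooled, w_q)
    k = np.einsum("vc,hcd->hvd", pooled, w_k)
    scores = np.einsum("hvd,hud->hvu", q, k) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1)


def dynamic_graph_conv(x, a_physical, a_learned, w_q, w_k, w_out):
    """Graph convolution over a topology refined by attention.

    The additive fusion of the three graphs is an illustrative choice,
    not the fusion rule used in the paper.
    """
    attn = attention_topology(x, w_q, w_k)                  # (H, V, V)
    topology = a_physical + a_learned + attn.mean(axis=0)   # (V, V)
    aggregated = np.einsum("vu,tuc->tvc", topology, x)      # mix joint features
    return aggregated @ w_out                               # (T, V, C_out)


# Toy example: 4 frames, 5 joints in a chain, 3 input channels.
rng = np.random.default_rng(0)
T, V, C, H, D, C_OUT = 4, 5, 3, 2, 4, 6
x = rng.normal(size=(T, V, C))

a_physical = np.eye(V)
for i in range(V - 1):  # connect each joint to its neighbour in the chain
    a_physical[i, i + 1] = a_physical[i + 1, i] = 1.0

a_learned = np.zeros((V, V))  # would be a trainable parameter in practice
w_q = rng.normal(size=(H, C, D))
w_k = rng.normal(size=(H, C, D))
w_out = rng.normal(size=(C, C_OUT))

out = dynamic_graph_conv(x, a_physical, a_learned, w_q, w_k, w_out)
print(out.shape)  # (4, 5, 6)
```

Because the attention term is dense, joints that are not physically connected (for example, two hands in a clapping action) can exchange features, which is the kind of implicit correlation a purely predefined graph cannot express.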