Action Recognition with Segment Temporal Attention Spatiotemporal Graph Convolutional Networks

Abstract: Graph convolutional networks (GCNs) are well suited to non-Euclidean data, and human skeleton data are more robust to environmental variation and more expressive of actions than RGB video data. Skeleton-based human action recognition has therefore attracted growing attention. Modeling the human skeleton as a spatiotemporal graph and recognizing actions with GCN-based models has brought significant performance gains, but existing GCN-based models often fail to capture fine-grained details in the action video stream. To address this problem, this paper proposes a skeleton-based action recognition method built on a segment temporal attention spatiotemporal graph convolutional network. The temporal frames of the data are divided into segments and attention is extracted, which improves the model's ability to extract detailed features. In addition, a coordinated attention module is introduced to embed location information into the attention map, which enhances the model's generalization ability. Extensive experiments on the NTU-RGBD and Kinetics-Skeleton datasets show that the proposed model achieves higher action recognition accuracy than most methods reported in the literature.
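The segmenting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' module: the abstract does not specify how attention is computed, so the scoring rule below (each frame's mean activation, normalized by a softmax inside its own segment) and the names `segment_temporal_attention`, `frames`, and `num_segments` are assumptions chosen only for illustration. In the paper the attention would be learned inside the GCN rather than computed by a fixed rule.

```python
import math


def segment_temporal_attention(frames, num_segments):
    """Reweight frames with a softmax computed separately inside each segment.

    frames: list of T non-empty per-frame feature vectors (lists of floats).
    num_segments: number of contiguous temporal segments, 1 <= num_segments <= T.
    Returns (reweighted_frames, weights).

    Assumption: a frame's score is its mean activation; a real model
    would learn this scoring function.
    """
    T = len(frames)
    if T == 0 or not 1 <= num_segments <= T:
        raise ValueError("need 1 <= num_segments <= number of frames")

    # Integer boundaries that split T frames into num_segments non-empty parts.
    bounds = [i * T // num_segments for i in range(num_segments + 1)]

    reweighted, weights = [], []
    for start, end in zip(bounds[:-1], bounds[1:]):
        segment = frames[start:end]
        scores = [sum(f) / len(f) for f in segment]

        # Numerically stable softmax restricted to this segment.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        seg_weights = [e / total for e in exps]

        weights.extend(seg_weights)
        reweighted.extend(
            [[w * x for x in f] for w, f in zip(seg_weights, segment)]
        )
    return reweighted, weights


if __name__ == "__main__":
    demo = [[1.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 2.0]]
    out, w = segment_temporal_attention(demo, num_segments=2)
    print(w)
```

Because the softmax is normalized per segment rather than over the whole sequence, a frame only competes with its temporal neighbors, so a locally salient frame keeps a large weight even when other parts of the sequence have stronger activations.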

Authors: LU Mengke; GUO Jiale; DING Yingqiang; CHEN Enqing (College of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; Henan Xintong Intelligent IOT Co., Ltd., Zhengzhou 450007, China)

Affiliations: College of Electrical and Information Engineering, Zhengzhou University; Henan Xintong Intelligent IOT Co., Ltd.

Source: Journal of Chinese Computer Systems (小型微型计算机系统), CSCD, Peking University Core Journal, 2024, No. 1, pp. 62-68 (7 pages)

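Funding: Supported by the National Natural Science Foundation of China (U1804152, 62101503) and the Henan Province Science and Technology Research Project (222102210102).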

Keywords: action recognition; graph convolutional networks; segment temporal attention; coordinated attention

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]