Abstract
To recognize students' behavior in online classes effectively, a skeleton-based human action recognition model is proposed that combines a global attention mechanism with a spatio-temporal graph convolutional network. First, a global attention module is inserted between the spatial graph convolutional network and the temporal convolutional network of the spatio-temporal graph convolutional network, and the spatial feature map output by the spatial graph convolutional network serves as the input of the attention module. Second, average pooling and max pooling along the temporal dimension are introduced to strengthen the model's ability to learn global feature information. Finally, three attention-augmented spatio-temporal graph convolutional networks and class activation maps (CAM) are used to build a richly activated graph convolutional network (RA-GCNv2-A) with a stronger ability to recognize occluded data, and transfer learning is applied to recognize students' online classroom actions. Experiments on the NTU-RGB+D and NTU-RGB+D120 datasets show that, compared with the RA-GCNv2 model, recognition accuracy improves by 1.3% (cross-subject, CS) and 1.2% (cross-view, CV) on NTU-RGB+D, and by 1.6% (cross-subject, CSub) and 1.4% (cross-setup, CSet) on NTU-RGB+D120. The results indicate that the proposed method is effective for recognizing students' online classroom actions.
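The temporal pooling step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the attention formula, so everything below is an assumption made for illustration: the function names `temporal_gate` and `apply_temporal_attention` are hypothetical, the feature map is taken to have shape channels x frames x joints, and the two pooled descriptors are fused by a plain sum followed by a sigmoid. The published module may differ, for example by adding learned layers.

```python
import math


def temporal_gate(channel):
    """Return one gate value per joint for a single channel.

    `channel` is a T x V nested list (frames x joints). The gate is
    sigmoid(avg_over_time + max_over_time); this fusion rule is an
    assumption for illustration, not the paper's formula.
    """
    num_frames = len(channel)
    num_joints = len(channel[0])
    gate = []
    for v in range(num_joints):
        column = [channel[t][v] for t in range(num_frames)]
        avg_pool = sum(column) / num_frames  # average pooling over time
        max_pool = max(column)               # max pooling over time
        gate.append(1.0 / (1.0 + math.exp(-(avg_pool + max_pool))))
    return gate


def apply_temporal_attention(features):
    """Reweight a C x T x V feature map by its per-channel joint gates."""
    out = []
    for channel in features:
        gate = temporal_gate(channel)
        out.append([[value * g for value, g in zip(frame, gate)]
                    for frame in channel])
    return out


# Toy input: 1 channel, 2 frames, 2 joints.
features = [[[0.0, 1.0], [0.0, 3.0]]]
reweighted = apply_temporal_attention(features)
```

In this toy input, the joint whose activations are zero in every frame receives a gate of 0.5, while the joint with larger activations receives a gate close to 1, so the reweighted map emphasizes joints that are active across the whole sequence.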
Authors
胡锦林
齐永锋
王佳颖
HU Jinlin; QI Yongfeng; WANG Jiaying (College of Computer Science and Engineering, Northwest Normal University, Lanzhou, Gansu 730070, China)
Source
《光电子.激光》
CAS
CSCD
PKU Core Journal
2022, Issue 2, pp. 149-156 (8 pages)
Journal of Optoelectronics·Laser
Funding
Supported by the Gansu Province Science and Technology Program (18JR3RA097).
Keywords
human skeleton
action recognition
attention mechanism
spatio-temporal graph convolutional neural network
transfer learning