Abstract
To address the poor real-time performance, long training time, and high computational complexity of existing human action recognition algorithms based on 3D convolutional neural networks, this paper proposes a new human action recognition algorithm that uses a high-accuracy Transformer-style backbone network and integrates a temporal shift module with a lightweight attention mechanism. The backbone network, CoTNeXt, mines contextual information and performs self-attention learning, which effectively enhances action features. The temporal shift module fully extracts the temporal information of actions, while the fused attention mechanism adds a regularization term to further suppress insignificant features and thereby highlight salient action features. Experimental results show that the algorithm achieves recognition accuracies of 97.42% and 75.94% on the Jester and Kinetics-400 datasets, respectively. Compared with most existing human action recognition algorithms, it performs better in both accuracy and real-time performance.
Authors
Jiang Li (江励)
Zhou Pengfei (周鹏飞)
Tang Jianhua (汤健华)
Department of Intelligent Manufacturing, Wuyi University, Jiangmen, Guangdong 529020, China
Source
Mechanical & Electrical Engineering Technology (《机电工程技术》)
2023, No. 11, pp. 23-27, 80 (6 pages)
Funding
Guangdong Provincial Regional Joint Fund Project (2019A1515110258)
Wuyi University Youth Innovation Team Project (2018td02)
Keywords
human action recognition
deep learning
temporal shift module
attention mechanism