Abstract
When recognizing human actions from video data, traditional methods cannot accurately analyze motion information over long time ranges, and they make poor use of the local features in that motion information and of the spatial relations among them. This paper proposes combining an attention-based convolutional long short-term memory network (Attention-ConvLSTM) with the traditional two-stream convolutional network, so that the nonlinear features of motion information in video are learned better and the local salient features and their spatial relations are better exploited. The paper also designs a new regularized cross-entropy loss function that lets the extended network converge faster. On two widely used human action video datasets, UCF101 and HMDB51, the method shows a clear improvement over traditional methods.
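The abstract does not give the form of the regularized cross-entropy loss. As a rough illustration only, the sketch below assumes the most common variant: softmax cross-entropy plus an L2 penalty on the weights. The function names, the `lam` coefficient, and the choice of an L2 term are all assumptions made for this example, not details taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def regularized_cross_entropy(logits, label, weights, lam=1e-4):
    """Cross-entropy for one sample plus an L2 weight penalty.

    Assumption: the regularizer is lam * sum(w^2); the paper's actual
    regularization term may differ.
    """
    probs = softmax(logits)
    ce = -math.log(probs[label])          # standard cross-entropy term
    l2 = sum(w * w for w in weights)      # hypothetical L2 penalty
    return ce + lam * l2
```

With `lam=0` this reduces to plain cross-entropy; a larger `lam` trades fit for smaller weights.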
Authors
JIE Zhi-hao (揭志浩); ZENG Ming-ru (曾明如); ZHOU Xin-heng (周鑫恒); HE Qiang (何强)
Information Engineering College, Nanchang University, Nanchang 330031, China
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2021, No. 2, pp. 405-408 (4 pages)
Funding
Supported by the National Natural Science Foundation of China (Grant No. 61963027).