Abstract
When a dynamic gesture video stream is preprocessed, random or dense sampling can drop key frames or introduce redundant data. Feature fusion, which relies on the temporal modeling of each individual feature, may then lose important temporal information. To address this, we propose a gesture recognition method based on an attention mechanism and feature fusion. A long short-term memory (LSTM) network with an attention mechanism extracts the important data during temporal modeling, which avoids the arbitrariness or blindness of the sampling method. A feature fusion network with a three-layer structure then fuses the extracted RGB features and depth image features, which improves the accuracy of dynamic gesture recognition. The experimental results show that introducing the attention mechanism is necessary, and they verify the effectiveness of feature fusion and the robustness of the method.
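The abstract does not give the network details, so the following is only a minimal sketch of the two ideas it names: attention-weighted pooling over per-frame hidden states (in place of picking frames by random or dense sampling) and a three-layer network that fuses RGB and depth features. The dot-product scoring, the `query` vector, the concatenation step, the layer widths, and all function names are assumptions made for illustration, not the authors' implementation. The per-frame vectors stand in for LSTM outputs.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by its dot-product score against `query`.

    hidden_states: list of T vectors (stand-ins for per-frame LSTM outputs).
    query: hypothetical learned vector with the same dimension.
    Returns (context_vector, weights).
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return context, weights


def dense(x, weight, bias, relu=True):
    """One fully connected layer: W x + b, optionally followed by ReLU."""
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]
    return [max(0.0, o) for o in out] if relu else out


def init_layer(rng, n_in, n_out):
    """Random weights for illustration only; a real model would learn these."""
    weight = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weight, [0.0] * n_out


def fuse(rgb_feat, depth_feat, layers):
    """Concatenate both modality features and apply three dense layers (assumed design)."""
    x = list(rgb_feat) + list(depth_feat)
    for i, (weight, bias) in enumerate(layers):
        x = dense(x, weight, bias, relu=(i < len(layers) - 1))
    return softmax(x)  # class probabilities


# Toy run: 8 frames, 4-dimensional features per modality, 3 gesture classes.
rng = random.Random(0)
frames, dim, n_classes = 8, 4, 3
rgb_states = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(frames)]
depth_states = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(frames)]
query = [rng.gauss(0, 1) for _ in range(dim)]

rgb_ctx, rgb_weights = attention_pool(rgb_states, query)
depth_ctx, depth_weights = attention_pool(depth_states, query)
layers = [init_layer(rng, 2 * dim, 8), init_layer(rng, 8, 8), init_layer(rng, 8, n_classes)]
probs = fuse(rgb_ctx, depth_ctx, layers)
print(rgb_weights)
print(probs)
```

The attention weights sum to one over the frames, so every frame contributes to the pooled feature in proportion to its score instead of being kept or discarded outright by a sampling rule.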
Authors
高明柯
赵卓
逄涛
王天保
邹一波
黄晨
李德旭
Gao Mingke; Zhao Zhuo; Pang Tao; Wang Tianbao; Zou Yibo; Huang Chen; Li Dexu (The 32nd Research Institute, China Electronics Technology Group Corporation, Shanghai 201808, China; College of Information Technology, Shanghai Ocean University, Shanghai 201306, China; School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China)
Source
《计算机应用与软件》
Peking University Core Journal (北大核心)
2020, Issue 6, pp. 199-203 (5 pages)
Computer Applications and Software
Funding
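Equipment Pre-Research Joint Fund of China Electronics Technology Group Corporation (6141B08080101)
Doctoral Start-up Fund of Shanghai Ocean University (A2-0203-00-100378)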