Gesture recognition method with separable long short-term attention networks

Cited by: 2
Abstract In human-computer interaction, most gesture recognition algorithms cannot effectively eliminate the influence of the capture background on the gesture region to be extracted, and accurately modeling gesture motion information is also difficult. To address these problems, a method based on a deep separable residual convolutional Long Short-Term Memory (LSTM) network is proposed to model and recognize the feature information of dynamic gestures. First, a conventional 3D convolution performs preliminary feature extraction on the input video frames, using a relatively large kernel size to enlarge the receptive field. Then, separable convolutional residual operations re-extract features from these shallow features to model high-dimensional features. Finally, the features produced by the first two stages pass through 3D pooling and are fed into an LSTM network, which models the temporal information of the video data; an attention mechanism is introduced at the input. Experiments on a large-scale isolated gesture dataset show that the accuracy of the proposed method is 21.02 percentage points higher than that of the original MFSK (Mixed Features around Sparse Keypoints) + BoVW (Bag of Visual Words) + SVM (Support Vector Machine) approach.
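The abstract does not give layer sizes or the exact form of the attention mechanism, so the following is only a minimal sketch under stated assumptions. It illustrates two ideas the abstract names: why a depthwise separable 3D convolution is cheaper than a standard one (weight counts, bias omitted), and how soft attention can weight per-frame features before a recurrent stage. The function names, the channel sizes (64 and 128), and the dot-product scoring with a query vector are all hypothetical choices made for illustration, not the authors' design.

```python
import math


def conv3d_params(c_in, c_out, k):
    """Weights in a standard 3D convolution with a k x k x k kernel (bias omitted)."""
    return c_in * c_out * k ** 3


def separable_conv3d_params(c_in, c_out, k):
    """Weights in a depthwise separable 3D convolution (bias omitted):
    one k x k x k filter per input channel, then a 1 x 1 x 1 pointwise convolution."""
    depthwise = c_in * k ** 3
    pointwise = c_in * c_out
    return depthwise + pointwise


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frame_features, query):
    """Soft attention over time steps (assumed dot-product scoring).

    frame_features: list of per-frame feature vectors, all of len(query).
    Returns the attention weights and the weighted sum of the features.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in frame_features]
    weights = softmax(scores)
    pooled = [
        sum(w * feat[i] for w, feat in zip(weights, frame_features))
        for i in range(len(query))
    ]
    return weights, pooled


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 x 3 kernel.
print(conv3d_params(64, 128, 3))            # 221184
print(separable_conv3d_params(64, 128, 3))  # 9920
```

With these assumed sizes the separable layer uses roughly 22 times fewer weights than the standard one. A zero query vector in `attention_pool` gives every frame the same weight, which makes it an easy sanity check.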
Authors GU Ming (顾明); LI Yiqun (李轶群); ZHANG Erchao (张二超); ZHANG Xunlei (张训雷); QI Lin (齐林); TIE Yun (帖云) (Henan Communications Investment Group Company Limited, Zhengzhou, Henan 450016, China; Zhengzhou Branch, Zhongxun Post & Telecommunication Consulting & Design Institute Company Limited, Zhengzhou, Henan 450000, China; School of Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001, China)
Source Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2022, Issue S01, pp. 59-63 (5 pages)
Keywords deep residual network; separable convolution; Long Short-Term Memory (LSTM) network; dynamic gesture recognition; attention mechanism