Video Expression Recognition Based on End-to-End Enhanced Feature Neural Network (端到端增强特征神经网络的视频表情识别)

Cited by: 4
Abstract: This paper builds an end-to-end deep learning network that combines a convolutional neural network (CNN) with a recurrent neural network (RNN), and proposes a video expression recognition method with enhanced features. The CNN uses the VGG-16-FACE model as its initial model for feature extraction. The RNN is a long short-term memory (LSTM) network, whose memory lets it use the inter-frame information of continuous video to produce the best prediction. The VGG-16 and LSTM models were first trained independently. Because the prediction in that setup depends largely on the LSTM, the number of LSTM layers and the number of output neurons were tuned; a two-layer LSTM with an output dimension of 2048 gave the best recognition results. To increase the influence of the VGG model, which is responsible for feature extraction, on the prediction, the two independently trained models were then connected into a single end-to-end model. Because the output of a single LSTM layer can lose features, a skip connection was added to the end-to-end model to enhance the feature input. On the AFEW dataset, the accuracy of video expression recognition rose from 32.88% to 37.34% and the F1 score rose from 0.2895 to 0.3399, confirming the effectiveness of the end-to-end enhanced-feature hybrid neural network.
Authors: CHEN Le (陈乐); TONG Ying (童莹); CHEN Rui (陈瑞); CAO Xuehong (曹雪虹) (College of Telecommunications & Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; Department of Communication Engineering, Nanjing Institute of Technology, Nanjing 211167, China)
Source: Journal of Chongqing University of Technology: Natural Science (《重庆理工大学学报(自然科学)》), indexed in CAS and the PKU Core Journals list (北大核心), 2019, No. 9, pp. 125-131 (7 pages)
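Funding: Young Scientists Fund of the National Natural Science Foundation of China (61703201); Youth Program of the Natural Science Foundation of Jiangsu Province (BK20170765)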
Keywords: video expression recognition; convolutional neural network; recurrent neural network; deep learning
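
The data flow described in the abstract (per-frame features, a first LSTM layer, a skip connection that re-injects the frame features, a second LSTM layer, then a classifier) can be sketched roughly as follows. This is only an illustrative toy in plain Python, not the authors' implementation: the names `TinyLSTM` and `predict_with_skip` are hypothetical, random vectors stand in for VGG-16-FACE features, the weights are untrained, the dimensions are tiny rather than 2048, and the exact point where the skip connection attaches is an assumption, since the abstract does not specify it. The seven output classes are also an assumption, chosen to match the seven emotion categories AFEW is usually evaluated on.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Hypothetical single-layer LSTM over a sequence of feature vectors."""

    def __init__(self, input_size, hidden_size, rng):
        self.input_size = input_size
        self.hidden_size = hidden_size
        # One row per gate unit (input, forget, candidate, output gates);
        # each row covers the input, the previous hidden state, and a bias.
        cols = input_size + hidden_size + 1
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                        for _ in range(4 * hidden_size)]

    def forward(self, sequence):
        n = self.hidden_size
        h = [0.0] * n  # hidden state
        c = [0.0] * n  # cell state (the "memory")
        outputs = []
        for x in sequence:
            z = list(x) + h + [1.0]
            pre = [sum(w * v for w, v in zip(row, z)) for row in self.weights]
            i_gate = [sigmoid(v) for v in pre[0:n]]
            f_gate = [sigmoid(v) for v in pre[n:2 * n]]
            g_cand = [math.tanh(v) for v in pre[2 * n:3 * n]]
            o_gate = [sigmoid(v) for v in pre[3 * n:4 * n]]
            c = [f * cp + i * g
                 for f, cp, i, g in zip(f_gate, c, i_gate, g_cand)]
            h = [o * math.tanh(cv) for o, cv in zip(o_gate, c)]
            outputs.append(h)
        return outputs


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_skip(frame_features, lstm1, lstm2, classifier):
    """Assumed wiring: the skip connection concatenates each frame's
    features with the first LSTM layer's output before the second layer."""
    h1 = lstm1.forward(frame_features)
    enhanced = [list(x) + h for x, h in zip(frame_features, h1)]
    h2 = lstm2.forward(enhanced)
    # Classify from the hidden state at the last frame.
    logits = [sum(w * v for w, v in zip(row, h2[-1])) for row in classifier]
    return softmax(logits)


rng = random.Random(0)
FEAT, HIDDEN, CLASSES, FRAMES = 8, 6, 7, 5  # toy sizes, not the paper's
lstm1 = TinyLSTM(FEAT, HIDDEN, rng)
lstm2 = TinyLSTM(FEAT + HIDDEN, HIDDEN, rng)  # wider input due to the skip
classifier = [[rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)]
              for _ in range(CLASSES)]
frames = [[rng.random() for _ in range(FEAT)] for _ in range(FRAMES)]
probs = predict_with_skip(frames, lstm1, lstm2, classifier)
print(len(probs), round(sum(probs), 6))
```

The point of the sketch is the input width of the second layer: without the skip connection it would see only the first layer's `HIDDEN` outputs, whereas with it the original frame features remain directly available, which is one plausible reading of "enhancing the feature input".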