Abstract
Long short-term memory (LSTM) networks are widely used for facial expression recognition in video sequences. Because a single-layer LSTM has limited representational capacity, which constrains its generalization on complex problems, a hierarchical attention model is proposed. A stacked LSTM learns a hierarchical representation of the time-series data, a self-attention mechanism builds differentiated relationships among the layers, and a penalty term is constructed and combined with the loss function to optimize the network structure and improve performance. Experiments on the CK+ and MMI datasets show that, because well-formed hierarchical features are built, each time step can select information from the feature levels of greater interest. Compared with an ordinary single-layer LSTM, the hierarchical attention model expresses the emotional information in video sequences more effectively.
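The idea of letting each time step choose among the layers of a stacked LSTM can be illustrated with a minimal sketch. It is not the authors' formulation, which the abstract does not specify: the scaled dot-product scoring, the `query` vector, and the function names are assumptions made for illustration, and the penalty term is left out because its form is not given here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def layer_attention(layer_states, query):
    """Fuse the hidden states of all stacked layers at ONE time step.

    layer_states: list of L vectors (one per LSTM layer), each of length d.
    query: hypothetical learned vector of length d used to score each layer
           (an assumption; the abstract does not describe the scoring function).
    Returns the fused vector and the per-layer attention weights.
    """
    d = len(query)
    # Scaled dot-product score of each layer's hidden state against the query.
    scores = [
        sum(h_j * q_j for h_j, q_j in zip(h, query)) / math.sqrt(d)
        for h in layer_states
    ]
    weights = softmax(scores)
    # Weighted sum over layers, computed dimension by dimension.
    fused = [
        sum(w * h[j] for w, h in zip(weights, layer_states))
        for j in range(d)
    ]
    return fused, weights


# Toy example: three layers, two-dimensional hidden states.
layer_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
fused, weights = layer_attention(layer_states, query)
```

In this toy case the third layer aligns best with the query, so it receives the largest weight, while the first two layers receive equal weights. In a full model the same fusion would be repeated at every time step of the video sequence.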
Authors
Wang Xiaohua (王晓华)
Pan Lijuan (潘丽娟)
Peng Muzi (彭穆子)
Hu Min (胡敏)
Jin Chunhua (金春花)
Ren Fuji (任福继)
(School of Computer Science and Information Engineering, School of Artificial Intelligence, Hefei University of Technology, Hefei 230601; The Laboratory for Internet of Things and Mobile Internet Technology of Jiangsu Province, Huaiyin Institute of Technology, Huai’an 223001; Graduate School of Advanced Technology & Science, University of Tokushima, Tokushima 7708502)
Source
Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》)
EI
CSCD
Peking University Core Journal
2020, No. 1, pp. 27-35 (9 pages)
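Funding
National Natural Science Foundation of China (61672202)
Key Program of the National Natural Science Foundation of China (61432004)
Open Project of the Laboratory for Internet of Things and Mobile Internet Technology of Jiangsu Province (JSWLW-2017-017)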
Keywords
video sequences
facial expression recognition
stacked long short-term memory network
self-attention mechanism