Abstract
To reduce the impact of irrelevant information and occlusion on facial expression recognition in the wild, this study proposes a facial expression recognition model based on key region masking and reconstruction. A multi-scale feature extraction network first extracts global features from the facial image. Sixty-eight key regions are then defined from 68 facial landmarks, the features of these regions are extracted by interpolation, and an attention mechanism learns the prior relationships among the key region features. A self-supervised masking and reconstruction module randomly masks key region features and uses the information from the known regions to predict and reconstruct the masked features, which improves recognition performance in natural scenes. Several experiments validate the generalization ability of the model, and ablation experiments confirm the effectiveness of each module. The model achieves recognition accuracies of 88.44% and 86.09% on the Real-world Affective Faces Database (RAF-DB) and the Occlusion-RAF-DB dataset, respectively, improving on models such as Vision Transformer (ViT) for facial expression recognition in natural scenes.
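The abstract's two central steps, interpolating key region features at landmark positions and randomly masking them for reconstruction, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the feature map size, the 25% mask ratio, the use of zeros as the mask value, and the mean squared error loss are all assumptions made for illustration. The attention-based network that would predict the masked features is omitted.

```python
import numpy as np


def sample_region_features(feature_map, landmarks):
    """Bilinearly interpolate a (C, H, W) feature map at landmark positions.

    landmarks is an (N, 2) array of (x, y) feature-map coordinates.
    Returns an (N, C) array with one feature vector per landmark.
    """
    _, h, w = feature_map.shape
    x = np.clip(landmarks[:, 0], 0, w - 1)
    y = np.clip(landmarks[:, 1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = x - x0  # horizontal interpolation weights
    wy = y - y0  # vertical interpolation weights
    top = feature_map[:, y0, x0] * (1 - wx) + feature_map[:, y0, x1] * wx
    bottom = feature_map[:, y1, x0] * (1 - wx) + feature_map[:, y1, x1] * wx
    return (top * (1 - wy) + bottom * wy).T


def mask_regions(features, mask_ratio, rng):
    """Zero out a random subset of region features (assumed masking scheme).

    Returns the masked copy and a boolean mask marking the hidden regions.
    """
    n = features.shape[0]
    n_masked = int(round(n * mask_ratio))
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, size=n_masked, replace=False)] = True
    masked = features.copy()
    masked[mask] = 0.0
    return masked, mask


def reconstruction_loss(predicted, target, mask):
    """Mean squared error over masked regions only (assumed loss)."""
    if not mask.any():
        return 0.0
    diff = predicted[mask] - target[mask]
    return float(np.mean(diff ** 2))


# Hypothetical sizes: an 8-channel 14x14 feature map and 68 landmarks.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 14, 14))
landmarks = rng.uniform(0, 13, size=(68, 2))

regions = sample_region_features(feature_map, landmarks)
masked, mask = mask_regions(regions, mask_ratio=0.25, rng=rng)
```

In a full model, a network would take `masked` as input and output predictions for all 68 regions, and `reconstruction_loss` would compare those predictions with `regions` at the masked positions only.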
Authors
李晶
李健
陈海丰
张倩
王丽燕
裴二成
LI Jing; LI Jian; CHEN Haifeng; ZHANG Qian; WANG Liyan; PEI Ercheng (School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an 710021, Shaanxi, China; School of Arts and Sciences, Shaanxi University of Science and Technology, Xi'an 710021, Shaanxi, China; School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710100, Shaanxi, China)
Source
《计算机工程》
CAS
CSCD
Peking University Core Journals (北大核心)
2024, Issue 5, pp. 241-249 (9 pages)
Computer Engineering
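Funding
National Natural Science Foundation of China (62306172)
Shen Zuyan Special Fund of the National Engineering Research Center for Prefabricated Construction in Civil Engineering (2019CPCCE-K02)
Natural Science Basic Research Program of Shaanxi Province (2022JQ-662)
2021 Shaanxi University of Science and Technology Education Informatization Teaching Reform Research Project (JXJG2021-09)
Shaanxi University of Science and Technology Doctoral Research Startup Fund (126022325)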
Keywords
facial expression recognition
multiscale key region feature
attention mechanism
self-supervised learning
masking and reconstruction