Progress and challenges in facial action unit detection
Abstract: The anatomically based facial action coding system defines a unique set of atomic, nonoverlapping facial muscle actions called action units (AUs), which can accurately characterize facial expressions. Each AU corresponds to muscular activity that produces momentary changes in facial appearance, and combinations of AUs can represent any facial expression. As a multi-label classification problem, AU detection suffers from insufficient AU annotations, varied head poses, individual differences, and imbalance among different AUs. To trace the development of AU detection techniques, this article systematically surveys representative methods proposed since 2016. According to the modality of the input data, the methods are categorized into image-based, video-based, and other-modality approaches, and we discuss how weakly supervised AU detection methods reduce the dependence on labeled data given the large scale of unlabeled data.

Image-based methods include approaches that learn local facial representations, exploit AU relations, and adopt multitask or weakly supervised learning. Handcrafted or automatically learned local facial representations capture the local deformations caused by active AUs; however, handcrafted features cannot adapt their local regions to different AUs, while learned representations suffer from insufficient training data. Approaches that exploit AU relations draw on the prior knowledge that some AUs tend to appear together while others are mutually exclusive. Such methods adopt Bayesian networks or graph neural networks to model AU relations inferred manually from the annotations of specific datasets; these fixed priors make cross-dataset evaluation unreliable. Multitask AU detection methods are motivated by the observation that facial shape, as represented by facial landmarks, aids AU detection, while the deformations caused by active AUs in turn affect the distribution of landmark locations. In addition to detecting AUs, such methods typically estimate facial landmarks or recognize facial expressions in a multitask manner, and other facial emotion analysis tasks, such as emotional dimension estimation, can also be incorporated into the multitask setting.

Video-based methods are categorized into temporal representation learning and self-supervised learning. Temporal representation learning methods commonly adopt long short-term memory (LSTM) networks or 3D convolutional neural networks (3D-CNNs) to model temporal information; other approaches use the optical flow between frames to detect AUs. Several recent self-supervised approaches exploit the prior knowledge that facial actions, i.e., the movements of facial muscles between frames, can serve as a self-supervisory signal. Such video-based weakly supervised AU detection methods are reasonable and explainable and can effectively alleviate the shortage of AU annotations; however, they require massive amounts of unlabeled video data during training and cannot perform AU detection in an end-to-end manner.

We also review methods that exploit point clouds or thermal images for AU detection, which can alleviate the influence of head pose or illumination. Finally, we compare the representative methods, analyze their advantages and drawbacks, and on that basis summarize and discuss the challenges and potential directions of AU detection. We conclude that methods capable of utilizing weakly annotated or unlabeled data are an important direction for future research; such methods should be carefully designed according to prior knowledge of AUs to reduce the demand for large amounts of labeled data.
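The abstract frames AU detection as a multi-label classification problem that suffers from imbalance among different AUs. The sketch below is a minimal, purely illustrative PyTorch example of that framing, not the implementation of any surveyed method; the class name MultiLabelAUDetector, the layer sizes, and the positive weights are placeholders. A shared backbone produces one logit per AU, and per-AU positive weights in the binary cross-entropy loss are one common way to counter class imbalance.

```python
# Illustrative only: AU detection as multi-label classification with a
# per-AU weighted BCE loss to counter class imbalance. All sizes/weights
# are placeholders, not values from the surveyed papers.
import torch
import torch.nn as nn

class MultiLabelAUDetector(nn.Module):
    def __init__(self, num_aus: int = 12):
        super().__init__()
        # Tiny CNN backbone standing in for a real feature extractor.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, num_aus)  # one logit per AU

    def forward(self, x):
        return self.head(self.backbone(x))  # (batch, num_aus) logits

# Hypothetical per-AU positive weights, e.g. #negatives / #positives
# estimated from the training set, so rare AUs are up-weighted.
pos_weight = torch.tensor([1.0, 3.5, 0.8, 5.2, 2.0, 1.3,
                           4.1, 0.9, 2.7, 6.0, 1.1, 3.0])
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

model = MultiLabelAUDetector(num_aus=12)
images = torch.randn(4, 3, 112, 112)           # dummy batch of face crops
labels = torch.randint(0, 2, (4, 12)).float()  # 0/1 activation per AU
loss = criterion(model(images), labels)
loss.backward()
```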
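Several of the image-based methods summarized above model AU relations with graph neural networks. As a hedged illustration of that idea, and not a reproduction of any specific paper, the sketch below treats each AU as a graph node and uses a fixed, row-normalized adjacency matrix derived from co-occurrence statistics to propagate information between related AUs before classification; AURelationGCN and the toy adjacency are hypothetical.

```python
# Illustrative graph-based AU relation modeling: a single message-passing
# layer over per-AU features with a fixed co-occurrence prior as adjacency.
import torch
import torch.nn as nn

class AURelationGCN(nn.Module):
    def __init__(self, num_aus: int, feat_dim: int, adjacency: torch.Tensor):
        super().__init__()
        # Row-normalized adjacency with self-loops; a fixed prior, not learned.
        a = adjacency + torch.eye(num_aus)
        self.register_buffer("a_norm", a / a.sum(dim=1, keepdim=True))
        self.proj = nn.Linear(feat_dim, feat_dim)
        self.classifier = nn.Linear(feat_dim, 1)

    def forward(self, au_feats):
        # au_feats: (batch, num_aus, feat_dim) per-AU features from a backbone.
        h = torch.relu(self.a_norm @ self.proj(au_feats))  # message passing
        return self.classifier(h).squeeze(-1)              # (batch, num_aus) logits

# Toy prior: suppose the first two AUs often co-occur.
adj = torch.zeros(4, 4)
adj[0, 1] = adj[1, 0] = 1.0
gcn = AURelationGCN(num_aus=4, feat_dim=64, adjacency=adj)
logits = gcn(torch.randn(2, 4, 64))
```

This also illustrates why the abstract calls such priors inflexible: the adjacency is tied to the annotation statistics of one dataset, so it does not transfer to datasets with different AU distributions.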
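The multitask strategy described in the abstract can be sketched as a shared backbone with two heads, one for AU logits and one for landmark coordinates, trained with a weighted sum of the two losses. This is a hypothetical example, not the architecture of any particular surveyed method; MultiTaskAUNet, the 0.5 loss weight, and all layer sizes are assumptions.

```python
# Illustrative multitask AU detection: shared features drive both an AU
# head and a facial-landmark regression head.
import torch
import torch.nn as nn

class MultiTaskAUNet(nn.Module):
    def __init__(self, num_aus: int = 12, num_landmarks: int = 68):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.au_head = nn.Linear(64, num_aus)             # AU logits
        self.lmk_head = nn.Linear(64, num_landmarks * 2)  # (x, y) per landmark

    def forward(self, x):
        feat = self.backbone(x)
        return self.au_head(feat), self.lmk_head(feat)

model = MultiTaskAUNet()
images = torch.randn(4, 3, 112, 112)
au_logits, landmarks = model(images)
au_loss = nn.BCEWithLogitsLoss()(au_logits, torch.randint(0, 2, (4, 12)).float())
lmk_loss = nn.MSELoss()(landmarks, torch.randn(4, 136))  # dummy landmark targets
total = au_loss + 0.5 * lmk_loss                         # arbitrary task weighting
total.backward()
```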
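For the video-based branch, temporal representation learning with an LSTM can be pictured as follows. This is again only a sketch under assumed shapes (TemporalAUDetector is a hypothetical name), showing how per-frame CNN features are aggregated over time before per-frame AU prediction.

```python
# Illustrative video-based AU detection: per-frame CNN features are fed to
# an LSTM so that temporal context (e.g. onset/offset of a muscle motion)
# informs the per-frame predictions.
import torch
import torch.nn as nn

class TemporalAUDetector(nn.Module):
    def __init__(self, num_aus: int = 12, feat_dim: int = 64):
        super().__init__()
        self.frame_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.lstm = nn.LSTM(feat_dim, feat_dim, batch_first=True)
        self.head = nn.Linear(feat_dim, num_aus)

    def forward(self, clips):
        # clips: (batch, time, 3, H, W) short face-video clips.
        b, t = clips.shape[:2]
        feats = self.frame_encoder(clips.flatten(0, 1)).view(b, t, -1)
        hidden, _ = self.lstm(feats)  # temporal context per frame
        return self.head(hidden)      # (batch, time, num_aus) logits

model = TemporalAUDetector()
logits = model(torch.randn(2, 8, 3, 112, 112))  # 2 clips of 8 frames each
```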
Authors: Li Yong; Zeng Jiabei; Liu Xin; Shan Shiguang (Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; Beijing Seetatech Technology Co., Ltd., Beijing 100080, China)
Source: Journal of Image and Graphics (《中国图象图形学报》), 2020, Issue 11, pp. 2293-2305 (13 pages); indexed in CSCD and the Peking University core journal list.
Funding: National Key Research and Development Program of China (2017YFA0700804); National Natural Science Foundation of China (61702481).
Keywords: facial action unit (AU); image-based AU detection; video-based AU detection; weakly-supervised learning; insufficient annotations