Classification of small spontaneous expression database based on deep transfer learning network

Cited by: 9
Abstract (Chinese): Objective: Compared with posed expressions, spontaneous expressions better reveal a person's true emotions and have great application potential in fields such as national security and medicine. Because spontaneous expressions are difficult to induce and samples are hard to collect, available data are scarce. To discriminate the categories of spontaneous expressions, we propose a method based on deep transfer networks, building on neural-network learning methods now widely applied in many scenarios. Method: To preserve the characteristics of the original spontaneous-expression images, no data augmentation is used even on the small samples, and three-dimensional optical-flow images serve as comparison samples. The samples are fed into different transfer-network models for training; the trained networks of the same structure are then combined into an isomorphic network whose output discriminates the category of spontaneous expression. Result: Experiments show that the method classifies spontaneous expressions well on different databases. On the public spontaneous-expression databases CASME, CASME II, and CAS(ME)^2, the average test accuracies reach 94.3%, 97.3%, and 97.2%, respectively, up to 7% higher than the previous best results. Conclusion: We apply transfer learning to spontaneous-expression classification, compare different network models and sample types, and obtain the best average accuracy reported so far.

Objective: Expression is important in human-computer interaction. As a special kind of expression, a spontaneous expression has shorter duration and weaker intensity than traditional expressions. Spontaneous expressions can reveal a person's true emotions and present immense potential in detection, anti-detection, and medical diagnosis. Identifying the categories of spontaneous expression can therefore make human-computer interaction smooth and fundamentally change the relationship between people and computers. Because spontaneous expressions are difficult to induce and collect, the scale of a spontaneous-expression dataset is small for training a new deep neural network; each database contains only on the order of ten thousand spontaneous samples. The convolutional neural network shows excellent performance and is widely used in many scenes; for instance, it surpasses traditional feature-extraction methods in the accuracy of discriminating the categories of spontaneous expression.

Method: This study proposes a method based on different deep transfer network models for discriminating the categories of spontaneous expression. To preserve the characteristics of the original spontaneous expressions, we do not use data augmentation. Training samples, which comprise three-dimensional images composed of optical-flow and grayscale images, are compared with the original RGB images. The
three-dimensional image contains spatial information and temporal displacement information. In this study, we compare three network models with different samples. The first model is based on AlexNet and changes only the number of output-layer neurons so that it equals the number of spontaneous-expression categories; the network is then fine-tuned, fixing the parameters of different layers several times, to obtain the best training and testing results. The second model is based on Inception V3. Two fully connected layers, whose neuron counts are 512 and the number of spontaneous-expression categories, respectively, are added after the output; thus only the parameters of these two layers need fine-tuning. Network depth increases while the number of parameters is reduced because 3x3 convolution kernels replace the 7x7 convolution kernel. The third model is based on Inception-ResNet-v2; as with the first model, we change only the number of output-layer neurons. Finally, an isomorphic network model is proposed to identify the categories of spontaneous expression. The model comprises two transfer-learning networks of the same type that are trained on different samples; the maximum of their outputs is taken as the final result. The isomorphic network makes decisions with high accuracy because matching outputs of its member networks lie very close to the correct answer; from the perspective of probability, we take the maximum of the different outputs as the prediction.

Result: Experimental results indicate that the proposed method exhibits excellent classification performance on different samples. The single-network output clearly shows that features extracted from RGB images are as effective as features extracted from the three-dimensional optical-flow images. This result indicates that the spatiotemporal features extracted by the optical-flow method can be replaced by features extracted by the deep neural network. Simultaneously, the method shows
that, to a certain degree, features extracted by the neural network can replace the lost information and features, such as the temporal features missing from RGB images or the color features missing from OF+ images. The high average accuracy of a single network indicates good testing performance on each dataset. Networks with high complexity perform well because the spontaneous-expression samples can train the deep transfer-learning network effectively. The proposed models achieve state-of-the-art performance with an average accuracy of over 96%. Analysis of the isomorphic network model shows that it is not always better than a single network: a single network already discriminates the categories of spontaneous expression with high confidence, so the isomorphic network cannot easily improve the average accuracy further. The Titan Xp used for this research was donated by the NVIDIA Corporation.

Conclusion: Compared with traditional expressions, spontaneous expressions change subtly, and their features are difficult to extract. In this study, different transfer-learning networks are applied to discriminate the categories of spontaneous expression, and the testing accuracies of networks trained on different kinds of samples are compared. Experimental results show that, in contrast to traditional methods, deep learning has obvious advantages in spontaneous-expression feature extraction. The findings also show that a deep network can extract complete features from spontaneous expressions and is robust across different databases, as its good testing results demonstrate. In the future, we will extract spontaneous expressions directly from videos and identify their categories with high accuracy by removing distracting occurrences such as blinking.
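The Method section describes training samples built by stacking optical-flow components with a grayscale frame into one three-channel image. A minimal numpy sketch of that stacking, assuming a dense flow field (u, v) has already been computed by some optical-flow estimator (not shown); the per-channel normalization is an illustrative assumption, not the paper's exact preprocessing:

```python
import numpy as np

def make_flow_sample(flow_u, flow_v, gray):
    """Stack horizontal flow, vertical flow, and grayscale intensity
    into one 3-channel image of shape (H, W, 3), analogous to RGB."""
    assert flow_u.shape == flow_v.shape == gray.shape

    def norm(c):
        # Scale each channel to [0, 1] so the channels are comparable.
        c = c.astype(np.float32)
        rng = c.max() - c.min()
        return (c - c.min()) / rng if rng > 0 else np.zeros_like(c)

    return np.stack([norm(flow_u), norm(flow_v), norm(gray)], axis=-1)

# Toy example: flow and grayscale for a 2x2 frame pair.
u = np.array([[0.0, 1.0], [2.0, 3.0]])
v = np.array([[3.0, 2.0], [1.0, 0.0]])
g = np.array([[10.0, 20.0], [30.0, 40.0]])
sample = make_flow_sample(u, v, g)
print(sample.shape)  # (2, 2, 3)
```

Such a sample carries the temporal displacement between frames in its first two channels and appearance in the third, which is why the paper can feed it to the same architectures as RGB images.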
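The transfer-learning recipe above (keep pretrained layers fixed, retrain only a small output head) can be sketched framework-free. Here a fixed random projection stands in for AlexNet's frozen convolutional stack, and a softmax head is fitted by gradient descent on toy data; the data, dimensions, and learning rate are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "pretrained" backbone: in the paper this role is played by
# AlexNet's ImageNet-trained layers; here a fixed random ReLU projection.
W_backbone = rng.normal(size=(8, 4))         # 8-dim input -> 4-dim features

def features(x):
    return np.maximum(x @ W_backbone, 0.0)   # frozen: never updated below

num_classes = 3                              # expression categories (assumed)
W_head = np.zeros((4, num_classes))          # the only layer we fine-tune

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy labelled data standing in for expression samples.
X = rng.normal(size=(30, 8))
y = rng.integers(0, num_classes, size=30)

# Fine-tune only the head with gradient descent on cross-entropy;
# the backbone receives no gradient, mirroring layer freezing.
losses = []
for _ in range(300):
    F = features(X)
    P = softmax(F @ W_head)
    losses.append(-np.log(P[np.arange(len(y)), y]).mean())
    G = P.copy()
    G[np.arange(len(y)), y] -= 1.0           # d(loss)/d(logits)
    W_head -= 0.1 * (F.T @ G) / len(y)

# Initial loss is exactly log(3) (uniform predictions); it drops as
# only the head's weights adapt to the frozen features.
print(round(losses[0], 3), round(losses[-1], 3))
```

The same pattern scales up directly: replace the random projection with a real pretrained network, and the head with the new output layer(s) whose neuron count equals the number of expression categories.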
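The isomorphic model's decision rule (take the maximum over two same-architecture networks trained on different samples) reduces to an element-wise max followed by argmax. A small sketch, with hand-written class scores standing in for the two networks' outputs:

```python
import numpy as np

def isomorphic_predict(probs_a, probs_b):
    """Fuse two networks' class scores by element-wise maximum,
    then argmax: the class either member is most confident in wins."""
    fused = np.maximum(probs_a, probs_b)
    return int(np.argmax(fused)), fused

# Network A (say, trained on RGB samples) vs. network B (optical-flow samples).
p_rgb  = np.array([0.20, 0.70, 0.10])
p_flow = np.array([0.05, 0.15, 0.80])
label, fused = isomorphic_predict(p_rgb, p_flow)
print(label)  # 2 -- network B's high-confidence class wins
```

As the Result section notes, this fusion helps only when the two members disagree; when a single network is already highly confident, the max rule returns the same answer.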
Authors: 付晓峰 吴俊 牛力 (Fu Xiaofeng; Wu Jun; Niu Li), School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
Source: 《中国图象图形学报》 (Journal of Image and Graphics), CSCD, Peking University Core, 2019, Issue 5, pp. 753-761 (9 pages)
Funding: National Natural Science Foundation of China (61672199, 61572161); Zhejiang Provincial Science and Technology Plan, 2018 Key R&D Program (2018C01030); Zhejiang Provincial Natural Science Foundation (Y1110232)
Keywords: spontaneous expression; transfer learning; classification; neural networks; isomorphic network
