To address the complex model structures and large numbers of training parameters in facial expression recognition algorithms, we propose a residual network with a multi-head channel attention (MCA) module. Transfer learning is used to pre-train the convolutional layer parameters and mitigate the overfitting caused by an insufficient number of training samples. The designed MCA module is integrated into the ResNet18 backbone network. The attention mechanism highlights important information and suppresses irrelevant information by assigning different coefficients or weights, and the multi-head structure focuses more on the local features of the images, which improves the efficiency of facial expression recognition. Experimental results demonstrate that the proposed model achieves excellent recognition results on the FER2013, CK+, and JAFFE datasets, with accuracy rates of 72.7%, 98.8%, and 93.33%, respectively.
In computer vision, emotion recognition from facial expression images is considered an important research problem. Advances in deep learning in recent years have improved results on this problem. According to recent studies, facial photographs representing a particular type of emotion may include multiple facial expressions. It is feasible and useful to convert face photos into collections of visual words and carry out global expression recognition. The main contribution of this paper is a facial expression recognition model (FERM) based on an optimized Support Vector Machine (SVM). AffectNet is used to test the performance of the proposed FERM. AffectNet was collected by querying three major search engines with 1250 emotion-related keywords in six different languages, yielding over 1,000,000 facial photos from the web. The FERM is composed of three main phases: (i) data preparation, (ii) grid search for optimization, and (iii) categorization. Linear discriminant analysis (LDA) is used to categorize the data into eight labels (neutral, happy, sad, surprised, fear, disgust, angry, and contempt). With LDA, the categorization performance of the SVM is clearly enhanced. Grid search is used to find the optimal values of the SVM hyperparameters (C and gamma). The proposed optimized SVM algorithm achieves an accuracy of 99% and an F1 score of 98%.
A deep fusion model is proposed for a facial expression-based human-computer interaction system. First, image preprocessing is applied, i.e., the facial region is extracted from the input image. Thereafter, more discriminative and distinctive deep learning features are extracted from the facial regions. To prevent overfitting, in-depth features of facial images are extracted and assigned to the proposed convolutional neural network (CNN) models. Various CNN models are then trained. Finally, the performance of each CNN model is fused to obtain the final decision for the seven basic classes of facial expressions, i.e., fear, disgust, anger, surprise, sadness, happiness, and neutral. For experimental purposes, three benchmark datasets, i.e., SFEW, CK+, and KDEF, are used. The performance of the proposed system is compared with some state-of-the-art methods on each dataset. Extensive performance analysis reveals that the proposed system outperforms the competing methods in terms of various performance metrics. Finally, the proposed deep fusion model is used to control a music player based on the recognized emotions of the users.
Video-oriented facial expression recognition has always been an important issue in emotion perception. At present, the key challenge for most existing methods is how to effectively extract robust features that characterize the changes in facial appearance and geometry caused by facial motions. On this basis, the video in this paper is divided into multiple segments, each of which is simultaneously described by optical flow and facial landmark trajectory. To mine the emotional information in these two representations, we propose a Deep Spatiotemporal Network with Dual-flow Fusion (DSN-DF), which highlights the region and strength of expressions through spatiotemporal appearance features and the speed of change through spatiotemporal geometry features. Finally, experiments are carried out on the CK+ and MMI datasets to demonstrate the superiority of the proposed method.
Facial expression recognition (FER) remains a hot research area among computer vision researchers and is still challenging because of high intraclass variations. Conventional techniques for this problem depend on hand-crafted features, namely LBP, SIFT, and HOG, along with a classifier trained on a database of videos or images. Many perform well on image datasets captured under controlled conditions; however, they do not perform well on more challenging datasets, which contain partial faces and image variation. Recently, many studies have presented an end-to-end structure for facial expression recognition using deep learning (DL) methods. Therefore, this study develops an earthworm optimization with an improved SqueezeNet-based FER (EWOISN-FER) model. The presented EWOISN-FER model first applies the contrast-limited adaptive histogram equalization (CLAHE) technique as a pre-processing step. In addition, the improved SqueezeNet model is used to derive an optimal set of feature vectors, and the hyperparameter tuning process is performed by the stochastic gradient boosting (SGB) model. Finally, EWO with a sparse autoencoder (SAE) is employed for the FER process, and the EWO algorithm chooses the SAE parameters. A wide-ranging experimental analysis is carried out to examine the performance of the proposed model. The experimental outcomes indicate the superiority of the presented EWOISN-FER technique.
Facial Expression Recognition (FER) has been an important field of research for several decades. Extraction of emotional characteristics is crucial to FER but is complex to process because these characteristics have significant intra-class variance. Facial characteristics have not been completely explored in static pictures. Previous studies used Convolutional Neural Networks (CNNs) based on transfer learning and hyperparameter optimization for static facial emotion recognition. Particle Swarm Optimization (PSO) has also been used for tuning hyperparameters. However, these methods achieve only about 92 percent accuracy. The existing algorithms have issues with FER accuracy and precision, so overall FER performance is degraded significantly. To address this issue, this work proposes a combination of CNNs and Long Short-Term Memory networks (LSTMs), called the HCNN-LSTMs (Hybrid CNNs and LSTMs) approach, for FER. The work is evaluated on the benchmark dataset Facial Expression Recog Image Ver (FERC). Viola-Jones (VJ) algorithms detect faces in preprocessed images, followed by HCNN-LSTMs feature extraction and FER classification. Further, the success rate of Deep Learning Techniques (DLTs) has increased with tuning of hyperparameters such as epochs, batch sizes, initial learning rates, regularization parameters, shuffling types, and momentum. The proposed work uses Improved Weight based Whale Optimization Algorithms (IWWOAs) to select near-optimal settings for these parameters using the best fitness values. The experimental findings demonstrate that the proposed HCNN-LSTMs system outperforms the existing methods.
Analyzing human facial expressions with machine vision systems is a challenging yet fascinating problem in computer vision and artificial intelligence. Facial expressions are a primary means through which humans convey emotions, making their automated recognition valuable for applications including human-computer interaction, affective computing, and psychological research. Pre-processing techniques are applied to every image to standardize the images. Frequently used techniques include scaling, blurring, rotating, altering the contour of the image, converting color to grayscale, and normalization. Feature extraction follows, and traditional classifiers are then applied to infer facial expressions. Improving system performance is difficult in the typical machine learning approach because the feature extraction and classification phases are separate. In Deep Neural Networks (DNNs), however, the two phases are combined into a single phase. Therefore, Convolutional Neural Network (CNN) models give better accuracy in facial expression recognition than traditional classifiers. Still, the performance of CNNs is hampered by noisy and deviated images in the dataset. This work uses pre-processing methods such as resizing, grayscale conversion, and normalization. Motivated by these drawbacks, it studies the use of image pre-processing techniques to enhance the performance of deep learning methods for facial expression recognition. It also aims to recognize emotions using deep learning and to show the influence of data pre-processing on further processing of images. The accuracy of each pre-processing method is compared, combinations of the methods are then analyzed, and the appropriate pre-processing techniques are identified and implemented to observe the variability of accuracy in predicting facial expressions.
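The three pre-processing steps this abstract names (resizing, grayscale conversion, normalization) can be sketched in plain Python on nested lists. This is only an illustration under assumptions: the BT.601 luma weights, nearest-neighbour resizing, and division by 255 are common choices, not details given in the abstract, and the function names are hypothetical.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to luma using ITU-R BT.601 weights.

    The weighting is an assumed, commonly used choice.
    """
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in rgb_image
    ]


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2-D list to out_h x out_w."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def normalize(image, max_value=255.0):
    """Scale pixel values from [0, max_value] into [0, 1]."""
    return [[pixel / max_value for pixel in row] for row in image]


def preprocess(rgb_image, out_h, out_w):
    """Grayscale, then resize, then normalize (order is an assumption)."""
    return normalize(resize_nearest(to_grayscale(rgb_image), out_h, out_w))
```

Calling `preprocess(image, 48, 48)` on a list of RGB rows would return a 48 x 48 grid of floats in [0, 1]; the target size is an arbitrary example.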
A novel fuzzy linear discriminant analysis method based on canonical correlation analysis (fuzzy-LDA/CCA) is presented and applied to facial expression recognition. The fuzzy method is used to evaluate the degree of class membership of each training sample. CCA is then used to establish the relationship between each facial image and the corresponding class membership vector, and the class membership vector of a test image is estimated using this relationship. Moreover, the fuzzy-LDA/CCA method is generalized to nonlinear discriminant analysis problems via the kernel method. The performance of the proposed method is demonstrated using real data.
It is unknown whether Portuguese participants identify the expressions in the NimStim data set, which was created in America to provide facial expressions recognizable by untrained people, as well as Americans do. To test this, the performance of Portuguese participants in recognizing the Happiness, Surprise, Sadness, Fear, Disgust, and Anger NimStim facial expressions was compared with that of Americans, and no significant differences were found. In both populations, the easiest emotion to identify was Happiness, while Fear was the most difficult. However, with the exception of Surprise, Portuguese participants tended to show a lower accuracy rate for all the emotions studied. The results highlight some cultural differences.
Facial expression recognition (FER) in video has attracted increasing interest, and many approaches have been proposed. The crucial problem in classifying a given video sequence into one of several basic emotions is how to fuse the facial features of individual frames. In this paper, a frame-level attention module is integrated into an improved VGG-based framework, and a lightweight facial expression recognition method is proposed. The proposed network takes a sub-video cut from an experimental video sequence as its input and generates a fixed-dimension representation. The VGG-based network with an enhanced branch embeds face images into feature vectors. The frame-level attention module learns weights that are used to adaptively aggregate the feature vectors into a single discriminative video representation. Finally, a regression module outputs the classification results. The experimental results on the CK+ and AFEW databases show that the recognition rates of the proposed method reach state-of-the-art performance.
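The aggregation step described in this abstract, where per-frame weights combine frame feature vectors into one fixed-dimension video representation, can be sketched as a weighted sum. The softmax normalization and the function names below are assumptions for illustration; the abstract does not say how the weights are normalized, and in a real network the scores would come from a learned layer rather than being passed in.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_frames(frame_features, frame_scores):
    """Weighted sum of per-frame feature vectors.

    frame_features: equal-length feature vectors, one per frame.
    frame_scores: one raw attention score per frame (hypothetical input;
    a trained attention module would produce these).
    """
    if len(frame_features) != len(frame_scores):
        raise ValueError("need exactly one score per frame")
    weights = softmax(frame_scores)
    dim = len(frame_features[0])
    return [
        sum(w * feat[d] for w, feat in zip(weights, frame_features))
        for d in range(dim)
    ]
```

Whatever the number of frames, the output has the same length as a single frame's feature vector, which is what makes the representation fixed-dimension.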
Facial expression recognition is a hot topic in computer vision, but it remains challenging due to the feature inconsistency caused by person-specific characteristics of facial expressions. To address this challenge, and inspired by the recent success of the deep identity network (DeepID-Net) for face identification, this paper proposes a novel deep learning based framework for recognising human expressions from facial images. Compared with existing deep learning methods, the proposed framework, which is based on multi-scale global images and local facial patches, achieves significantly better performance on facial expression recognition. Finally, we verify the effectiveness of the proposed framework through experiments on the public benchmark datasets JAFFE and extended Cohn-Kanade (CK+).
A facial expression emotion recognition based human-robot interaction (FEER-HRI) system is proposed, for which a four-layer system framework is designed. The FEER-HRI system enables robots not only to recognize human emotions but also to generate facial expressions that adapt to human emotions. A facial emotion recognition method based on 2D-Gabor, the uniform local binary pattern (LBP) operator, and a multiclass extreme learning machine (ELM) classifier is presented and applied to real-time facial expression recognition for robots. Facial expressions of robots are represented by simple cartoon symbols and displayed on an LED screen mounted on the robots, which humans can easily understand. Four scenarios, i.e., guiding, entertainment, home service, and scene simulation, are performed in the human-robot interaction experiment, in which smooth communication is realized by facial expression recognition of humans and facial expression generation of robots within 2 seconds. As prospective applications, the FEER-HRI system can be applied in home service, smart homes, safe driving, and so on.
Functional magnetic resonance imaging was used during emotion recognition to identify changes in functional brain activation in 21 first-episode, treatment-naive major depressive disorder patients before and after antidepressant treatment. Following escitalopram oxalate treatment, patients exhibited decreased activation in the bilateral precentral gyrus, bilateral middle frontal gyrus, left middle temporal gyrus, bilateral postcentral gyrus, left cingulate, and right parahippocampal gyrus, and increased activation in the right superior frontal gyrus, bilateral superior parietal lobule, and left occipital gyrus during sad facial expression recognition. After antidepressant treatment, patients also exhibited decreased activation in the bilateral middle frontal gyrus, bilateral cingulate, and right parahippocampal gyrus, and increased activation in the right inferior frontal gyrus, left fusiform gyrus, and right precuneus during happy facial expression recognition. Our experimental findings indicate that the limbic-cortical network might be a key target region for antidepressant treatment in major depressive disorder.
In this paper, a novel method based on the dual-tree complex wavelet transform (DT-CWT) and the rotation invariant local binary pattern (LBP) is proposed for facial expression recognition. The quarter sample shift (Q-shift) DT-CWT can provide a group delay of 1/4 of a sample period and satisfies the usual 2-band filter bank constraints of no aliasing and perfect reconstruction. To handle illumination variation in expression verification, the low-frequency coefficients produced by the DT-CWT are set to zero, the high-frequency coefficients are used to reconstruct the image, and the basic LBP histogram is mapped onto the reconstructed image by means of histogram specification. LBP is capable of encoding the texture and shape information of the preprocessed images. The histograms built from multi-scale rotation invariant LBPs are combined to serve as the feature for recognition. Template matching is adopted to classify facial expressions because of its simplicity. The experimental results show that the proposed approach performs well in both efficiency and accuracy.
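To make the LBP terminology in this abstract concrete, the sketch below computes a basic 8-neighbour LBP code for one 3x3 patch and then maps it to a rotation invariant value by taking the minimum over all circular bit rotations, which is the standard construction. The neighbour ordering and the `>=` threshold are conventional assumptions, and the function names are hypothetical; the DT-CWT and histogram steps of the paper are not reproduced here.

```python
def lbp_code(patch):
    """Basic 8-neighbour LBP code for the centre of a 3x3 patch.

    Neighbours are read clockwise from the top-left corner; a neighbour
    contributes a 1 bit when it is >= the centre pixel.
    """
    center = patch[1][1]
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2],
        patch[1][2], patch[2][2], patch[2][1],
        patch[2][0], patch[1][0],
    ]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << bit
    return code


def rotation_invariant(code, bits=8):
    """Smallest value among all circular bit rotations of `code`."""
    mask = (1 << bits) - 1
    best = code
    for _ in range(bits - 1):
        # Rotate right by one position within `bits` bits.
        code = ((code >> 1) | (code << (bits - 1))) & mask
        best = min(best, code)
    return best
```

Two patches that differ only by moving the single bright neighbour from one corner to another yield different basic codes but the same rotation invariant value.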
Facial Expression Recognition (FER) has been an interesting area of research wherever there is human-computer interaction. Human psychology, emotions, and behaviors can be analyzed through FER. Classifiers used in FER have performed well on normal faces but have been found to be limited on occluded faces. Recently, Deep Learning Techniques (DLT) have gained popularity in real-world applications, including the recognition of human emotions. The human face reflects emotional states and human intentions. An expression is the most natural and powerful way of communicating non-verbally. Systems that enable communication between humans and machines are termed Human Machine Interaction (HMI) systems. FER can improve HMI systems because human expressions convey useful information to an observer. This paper proposes an FER scheme called EECNN (Enhanced Convolution Neural Network with Attention mechanism) to recognize seven types of human emotions, with satisfactory results in its experiments. The proposed EECNN achieved 89.8% accuracy in classifying the images.
AIM: To conduct a systematic literature review of the influence of gender on the recognition of facial expressions of six basic emotions. METHODS: We carried out a systematic search with the search terms (face OR facial) AND (processing OR recognition OR perception) AND (emotional OR emotion) AND (gender or sex) in the PubMed, PsycINFO, LILACS, and SciELO electronic databases for articles assessing outcomes related to response accuracy, response latency, and emotional intensity. Article selection was performed according to parameters set by Cochrane. The reference lists of the articles found through the database search were checked for additional references of interest. RESULTS: With respect to accuracy, women tend to perform better than men when all emotions are considered as a set. Regarding specific emotions, there seem to be no gender-related differences in the recognition of happiness, whereas results are quite heterogeneous for the remaining emotions, especially sadness, anger, and disgust. Fewer articles dealt with response latency and emotional intensity, which hinders the generalization of their findings, especially given their methodological differences. CONCLUSION: The analysis of the studies conducted to date does not allow definite conclusions concerning the role of the observer's gender in the recognition of facial emotion, mostly because of the absence of standardized methods of investigation.
Herein, a three-stage support vector machine (SVM) for facial expression recognition is proposed. The first stage comprises 21 SVMs, which cover all the binary combinations of seven expressions. If one expression is dominant, then the first stage suffices; if two are dominant, then the second stage is used; and if three are dominant, the third stage is used. These multilevel stages help reduce the possibility of error as much as possible. Different image preprocessing stages are used to ensure that the features obtained from the detected face make a meaningful and proper contribution to the classification stage. Facial expressions are created by muscle movements on the face. These subtle movements are detected by the histogram of oriented gradients (HOG) feature, because it is sensitive to the shapes of objects. The features obtained are then used to train the three-stage SVM. Two different validation methods were used: the leave-one-out and K-fold tests. Experimental results on three databases (Japanese Female Facial Expression, Extended Cohn-Kanade Dataset, and Radboud Faces Database) show that the proposed system is competitive and performs better than other works.
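The figure of 21 first-stage classifiers follows from choosing 2 of 7 expressions, since C(7, 2) = 21. The sketch below enumerates those pairs and tallies pairwise votes. It is a minimal illustration, not the authors' system: the label names are an assumed set (the abstract says only "seven expressions"), and `toy_decide` is a hypothetical stand-in for a trained binary SVM.

```python
from collections import Counter
from itertools import combinations

# Assumed label set; the abstract only says "seven expressions".
EXPRESSIONS = [
    "anger", "disgust", "fear", "happiness",
    "neutral", "sadness", "surprise",
]

# One binary classifier per unordered pair: C(7, 2) = 21.
PAIRS = list(combinations(EXPRESSIONS, 2))


def vote(sample, decide):
    """Run every pairwise classifier and tally the winners.

    `decide(sample, a, b)` is a hypothetical stand-in for a trained
    binary SVM; it must return either `a` or `b`.
    """
    return Counter(decide(sample, a, b) for a, b in PAIRS)


def dominant(tally):
    """Return the labels tied for the highest vote count, sorted."""
    top = max(tally.values())
    return sorted(label for label, count in tally.items() if count == top)


def toy_decide(sample, a, b):
    """Toy rule: pick the label with the larger score in `sample`."""
    return a if sample[a] >= sample[b] else b
```

One plausible reading of the abstract is that the number of labels returned by `dominant` (one, two, or three) decides whether a later stage is needed; that mapping is an interpretation, not something the abstract spells out.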
A facial expression recognition system using AdaBoost based on the Split Rectangle feature is proposed in this paper. This system provides more varied features, with gains in speed and accuracy, than the Haar-like feature of Viola, which is commonly used with the AdaBoost training algorithm. The Split Rectangle feature uses a mask-like shape composed of 2 independent rectangles, instead of the mask-like shape of the Haar-like feature, which is composed of 2 to 4 adjoined rectangles. The Split Rectangle feature has fewer diverged operations than the Haar-like feature. It also requires fewer operations because the sum of pixels requires only two rectangles. The Split Rectangle feature provides varied and fast features to AdaBoost, which produces a strong classifier with increased accuracy and speed. In the experiment, the system achieved a processing speed of 5.92 ms and 84% to 94% accuracy after learning 5 facial expressions (neutral, happiness, sadness, anger, and surprise) with AdaBoost based on the Split Rectangle feature.
As a key link in human-computer interaction, emotion recognition enables robots to correctly perceive user emotions and provide dynamic, adjustable services according to the emotional needs of different users, which is key to improving the cognitive level of robot services. Emotion recognition based on facial expression and electrocardiogram (ECG) has numerous industrial applications. First, a three-dimensional convolutional neural network architecture is used to extract spatial and temporal features from facial expression video data and ECG data, and emotion classification is carried out. Then the two modalities are fused at the data level and at the decision level, respectively, and the emotion recognition results are given. Finally, the emotion recognition results of the single-modality and multi-modality approaches are compared and analyzed. The comparative analysis of the single-modality and multi-modality results under the two fusion methods shows that the accuracy of multi-modal emotion recognition is greatly improved over that of single-modal emotion recognition, and that decision-level fusion is easier to operate and more effective than data-level fusion.
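Decision-level fusion, as contrasted with data-level fusion in this abstract, combines the per-class outputs of separately trained models rather than their raw inputs. The sketch below shows one simple variant, a weighted average of two class-probability dictionaries. The averaging rule, the `face_weight` parameter, and the function name are assumptions for illustration; the abstract does not state which fusion rule the authors used.

```python
def fuse_decisions(prob_face, prob_ecg, face_weight=0.5):
    """Weighted average of two per-class probability dicts.

    Both inputs map class label -> probability and must share the same
    labels. `face_weight` is a hypothetical tuning knob in [0, 1].
    Returns (best_label, fused_probabilities).
    """
    if prob_face.keys() != prob_ecg.keys():
        raise ValueError("both modalities must score the same classes")
    if not 0.0 <= face_weight <= 1.0:
        raise ValueError("face_weight must lie in [0, 1]")
    fused = {
        label: face_weight * prob_face[label]
        + (1.0 - face_weight) * prob_ecg[label]
        for label in prob_face
    }
    best = max(fused, key=fused.get)
    return best, fused
```

Because each modality is classified independently before fusion, either model can be retrained or replaced without touching the other, which is one common reason decision-level fusion is considered easier to operate.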
The recent boom in mass media communication (such as social media and mobile devices) has driven more applications of automatic facial expression recognition (FER). Thus, human facial expressions have to be encoded and recognized by digital devices. However, this process must cope with the recurrent problems of image illumination changes and partial occlusions. Therefore, in this paper, we propose a fully automated FER system based on Local Fourier Coefficients and Facial Fourier Descriptors. The combined power of appearance and geometric features is used to describe the specific facial regions of eyes-eyebrows, nose, and mouth, all based on the attributes of the Fourier Transform and Support Vector Machines. Hence, our proposal addresses FER problems such as illumination changes, partial occlusion, image rotation, redundancy, and dimensionality reduction. Several tests were performed to demonstrate the efficiency of our proposal, which was evaluated using three standard databases: CK+, MUG, and TFEID. In addition, the evaluation results show that the average recognition rate on each database is higher than that of most of the state-of-the-art techniques surveyed in this paper.
Funding for the MCA residual-network study: Anhui Province Quality Engineering Project No. 2021jyxm0801; Natural Science Foundation of Anhui University of Chinese Medicine under Grant Nos. 2020zrzd18 and 2019zrzd11; Humanity Social Science Foundation Grants 2021rwzd20 and 2020rwzd07; Anhui University of Chinese Medicine Quality Engineering Project No. 2021zlgc046.
Funding for the deep fusion model study: Researchers Supporting Project (No. RSP-2021/395), King Saud University, Riyadh, Saudi Arabia.
Funding for the DSN-DF study: Natural Science Foundation of China (Grant No. 61903056); Major Project of Science and Technology Research Program of Chongqing Education Commission of China (Grant No. KJZDM201900601); Chongqing Research Program of Basic Research and Frontier Technology (Grant Nos. cstc2019jcyj-msxmX0681, cstc2021jcyj-msxmX0530, and cstc2021jcyjmsxmX0761); Chongqing Municipal Key Laboratory of Institutions of Higher Education (Grant No. cqupt-mct-201901); Chongqing Key Laboratory of Mobile Communications Technology (Grant No. cqupt-mct-202002); Engineering Research Center of Mobile Communications, Ministry of Education (Grant No. cqupt-mct202006).
Abstract: Video-oriented facial expression recognition has always been an important issue in emotion perception. At present, the key challenge for most existing methods is how to effectively extract robust features that characterize the changes in facial appearance and geometry caused by facial motions. To this end, the video is divided into multiple segments, each of which is described simultaneously by optical flow and facial landmark trajectories. To mine the emotional information in these two representations, we propose a Deep Spatiotemporal Network with Dual-flow Fusion (DSN-DF), which highlights the region and strength of expressions through spatiotemporal appearance features and the speed of change through spatiotemporal geometry features. Finally, experiments on the CK+ and MMI datasets demonstrate the superiority of the proposed method.
Abstract: Facial expression recognition (FER) remains an active research area in computer vision and is still challenging because of high intra-class variation. Conventional techniques depend on hand-crafted features, namely LBP, SIFT, and HOG, together with a classifier trained on a database of videos or images. Many perform well on image datasets captured under controlled conditions but not on more challenging datasets that contain partial faces and image variation. Recently, many studies have presented end-to-end structures for facial expression recognition using deep learning (DL) methods. This study therefore develops an earthworm optimization with improved SqueezeNet-based FER (EWOISN-FER) model. The EWOISN-FER model first applies contrast-limited adaptive histogram equalization (CLAHE) as a pre-processing step. The improved SqueezeNet model is then used to derive an optimal set of feature vectors, and hyperparameter tuning is performed by the stochastic gradient boosting (SGB) model. Finally, EWO with a sparse autoencoder (SAE) is employed for the FER process, with the EWO algorithm choosing the SAE parameters. A wide-ranging experimental analysis is carried out to examine the performance of the proposed model. The experimental outcomes indicate the superiority of the EWOISN-FER technique.
Abstract: Facial Expression Recognition (FER) has been an important field of research for several decades. Extracting emotional characteristics is crucial to FER but is difficult because these characteristics have significant intra-class variance. Facial characteristics have not been completely explored in static pictures. Previous studies used Convolutional Neural Networks (CNNs) based on transfer learning and hyperparameter optimization for static facial emotion recognition. Particle Swarm Optimization (PSO) has also been used for tuning hyperparameters. However, these methods achieve only about 92 percent accuracy. The existing algorithms have issues with FER accuracy and precision, which significantly degrade overall FER performance. To address this issue, this work proposes a combination of CNNs and Long Short-Term Memory networks (LSTMs), called the HCNN-LSTMs (Hybrid CNNs and LSTMs) approach, for FER. The work is evaluated on the benchmark dataset Facial Expression Recog Image Ver (FERC). The Viola-Jones (VJ) algorithm detects faces in the preprocessed images, followed by HCNN-LSTMs feature extraction and FER classification. Further, the success rate of Deep Learning Techniques (DLTs) has increased with the tuning of hyperparameters such as epochs, batch size, initial learning rate, regularization parameters, shuffling type, and momentum. The proposed work uses an Improved Weight based Whale Optimization Algorithm (IWWOA) to select near-optimal settings for these parameters using the best fitness values. The experimental findings demonstrate that the proposed HCNN-LSTMs system outperforms the existing methods.
Abstract: Analyzing human facial expressions with machine vision systems is a challenging yet fascinating problem in computer vision and artificial intelligence. Facial expressions are a primary means through which humans convey emotions, making their automated recognition valuable for applications including human-computer interaction, affective computing, and psychological research. Pre-processing techniques are applied to every image in order to standardize the images. Frequently used techniques include scaling, blurring, rotating, altering the contour of the image, converting color to grayscale, and normalization. Feature extraction follows, and traditional classifiers are then applied to infer facial expressions. Improving system performance is difficult in the typical machine learning approach because the feature extraction and classification phases are separate. In Deep Neural Networks (DNN), however, the two phases are combined into a single phase. Convolutional Neural Network (CNN) models therefore give better accuracy in facial expression recognition than traditional classifiers. Still, the performance of CNNs is hampered by noisy and deviated images in the dataset. Motivated by these drawbacks, this work studies the use of image pre-processing techniques, namely resizing, grayscale conversion, and normalization, to enhance the performance of deep learning methods for facial expression recognition. It also aims to recognize emotions using deep learning and to show the influence of data pre-processing on the further processing of images. The accuracy obtained with each pre-processing method is compared, combinations of the methods are then analysed, and the appropriate pre-processing techniques are identified and implemented to observe the variability of accuracy in predicting facial expressions.
Funding: The National Natural Science Foundation of China (No. 60503023, 60872160); the Natural Science Foundation for Universities of Jiangsu Province (No. 08KJD520009); the Intramural Research Foundation of Nanjing University of Information Science and Technology (No. Y603).
Abstract: A novel fuzzy linear discriminant analysis method based on canonical correlation analysis (fuzzy-LDA/CCA) is presented and applied to facial expression recognition. The fuzzy method is used to evaluate the degree of membership of each training sample in each class. CCA is then used to establish the relationship between each facial image and the corresponding class membership vector, and the class membership vector of a test image is estimated using this relationship. Moreover, the fuzzy-LDA/CCA method is generalized to nonlinear discriminant analysis problems via the kernel method. The performance of the proposed method is demonstrated using real data.
Abstract: It is unknown whether Portuguese participants identify the expressions in the NimStim data set, which was created in America to provide facial expressions recognizable by untrained people, as well as Americans do. To test this, the performance of Portuguese participants in recognizing the Happiness, Surprise, Sadness, Fear, Disgust and Anger NimStim facial expressions was compared with that of Americans, and no significant differences were found. In both populations the easiest emotion to identify was Happiness, while Fear was the most difficult. However, with the exception of Surprise, Portuguese participants tended to show a lower accuracy rate for all the emotions studied. The results highlighted some cultural differences.
Funding: Supported by the Future Network Scientific Research Fund Project of Jiangsu Province (No. FNSRFP2021YB26); the Jiangsu Key R&D Fund on Social Development (No. BE2022789); the Science Foundation of Nanjing Institute of Technology (No. ZKJ202003).
Abstract: Facial expression recognition (FER) in video has attracted increasing interest, and many approaches have been proposed. The crucial problem in classifying a given video sequence into one of several basic emotions is how to fuse the facial features of individual frames. In this paper, a frame-level attention module is integrated into an improved VGG-based framework, and a lightweight facial expression recognition method is proposed. The proposed network takes a sub-video cut from an experimental video sequence as its input and generates a fixed-dimension representation. The VGG-based network with an enhanced branch embeds face images into feature vectors. The frame-level attention module learns weights that are used to adaptively aggregate the feature vectors into a single discriminative video representation. Finally, a regression module outputs the classification results. The experimental results on the CK+ and AFEW databases show that the recognition rates of the proposed method reach state-of-the-art performance.
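The aggregation step described in this abstract, in which per-frame weights combine frame feature vectors into one video representation, can be sketched as follows. This is a minimal sketch under stated assumptions: the abstract does not say how the weights are normalized, so the softmax here is an assumption, and the scores are given directly rather than learned. The names `softmax` and `aggregate_frames` are illustrative.

```python
import math

def softmax(scores):
    """Turn raw per-frame scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_frames(features, scores):
    """Weighted sum of per-frame feature vectors using attention weights."""
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]

# Three frames with 2-dimensional features (illustrative values).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
uniform = aggregate_frames(frames, [0.0, 0.0, 0.0])   # equal weights: plain average
focused = aggregate_frames(frames, [0.0, 0.0, 50.0])  # third frame dominates
```

Whatever the number of frames, the output has the same dimension as a single frame feature, which matches the fixed-dimension representation mentioned in the abstract.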
Funding: Supported by the Academy of Finland (267581) and the D2I SHOK Project from Digile Oy, as well as Nokia Technologies (Tampere, Finland).
Abstract: Facial expression recognition is a hot topic in computer vision, but it remains challenging because of the feature inconsistency caused by person-specific characteristics of facial expressions. To address this challenge, and inspired by the recent success of the deep identity network (DeepID-Net) for face identification, this paper proposes a novel deep learning based framework for recognising human expressions from facial images. Compared with existing deep learning methods, the proposed framework, which is based on multi-scale global images and local facial patches, achieves significantly better performance on facial expression recognition. Finally, we verify the effectiveness of the proposed framework through experiments on the public benchmark datasets JAFFE and extended Cohn-Kanade (CK+).
Funding: Supported by the National Natural Science Foundation of China (61403422, 61273102); the Hubei Provincial Natural Science Foundation of China (2015CFA010); the 111 Project (B17040); the Fundamental Research Funds for National University, China University of Geosciences (Wuhan).
Abstract: A facial expression emotion recognition based human-robot interaction (FEER-HRI) system is proposed, for which a four-layer system framework is designed. The FEER-HRI system enables robots not only to recognize human emotions but also to generate facial expressions that adapt to human emotions. A facial emotion recognition method based on 2D-Gabor, the uniform local binary pattern (LBP) operator, and a multiclass extreme learning machine (ELM) classifier is presented and applied to real-time facial expression recognition for robots. The facial expressions of the robots are represented by simple cartoon symbols and displayed on an LED screen mounted on the robots, so that they can be easily understood by humans. Four scenarios, i.e., guiding, entertainment, home service and scene simulation, are performed in the human-robot interaction experiment, in which smooth communication is realized by facial expression recognition of humans and facial expression generation of robots within 2 seconds. As prospective applications, the FEER-HRI system can be used in home service, smart homes, safe driving, and so on.
Funding: Supported by research grants from the National Natural Science Foundation of China (No. 81071099); the Liaoning Science and Technology Foundation (No. 2008225010-14); the Doctoral Foundation of the First Affiliated Hospital in China Medical University (No. 2010).
Abstract: Functional magnetic resonance imaging was used during emotion recognition to identify changes in functional brain activation in 21 first-episode, treatment-naive major depressive disorder patients before and after antidepressant treatment. Following escitalopram oxalate treatment, patients exhibited decreased activation in bilateral precentral gyrus, bilateral middle frontal gyrus, left middle temporal gyrus, bilateral postcentral gyrus, left cingulate and right parahippocampal gyrus, and increased activation in right superior frontal gyrus, bilateral superior parietal lobule and left occipital gyrus during sad facial expression recognition. After antidepressant treatment, patients also exhibited decreased activation in the bilateral middle frontal gyrus, bilateral cingulate and right parahippocampal gyrus, and increased activation in the right inferior frontal gyrus, left fusiform gyrus and right precuneus during happy facial expression recognition. Our experimental findings indicate that the limbic-cortical network might be a key target region for antidepressant treatment in major depressive disorder.
Abstract: In this paper, a novel method based on the dual-tree complex wavelet transform (DT-CWT) and the rotation invariant local binary pattern (LBP) is proposed for facial expression recognition. The quarter sample shift (Q-shift) DT-CWT provides a group delay of 1/4 of a sample period and satisfies the usual 2-band filter bank constraints of no aliasing and perfect reconstruction. To handle illumination variation in expression verification, the low-frequency coefficients produced by the DT-CWT are set to zero, the high-frequency coefficients are used to reconstruct the image, and the basic LBP histogram is mapped onto the reconstructed image by means of histogram specification. LBP is capable of encoding the texture and shape information of the preprocessed images. The histograms built from multi-scale rotation invariant LBPs are combined to serve as the feature for recognition. Template matching is adopted to classify facial expressions because of its simplicity. The experimental results show that the proposed approach performs well in both efficiency and accuracy.
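As a brief illustration of the LBP encoding mentioned in this abstract, the sketch below computes the basic 8-neighbour LBP code of a 3x3 patch and maps it to a rotation invariant code by taking the minimum over all circular bit rotations. This is the textbook formulation, given as an assumption; the abstract does not specify the exact variant, radius, or neighbour ordering the authors used.

```python
def lbp_code(patch):
    """Basic LBP code of a 3x3 patch: a neighbour >= centre contributes a 1 bit."""
    centre = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left pixel.
    coords = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (row, col) in enumerate(coords):
        if patch[row][col] >= centre:
            code |= 1 << bit
    return code

def rotation_invariant(code, bits=8):
    """Smallest value among all circular rotations of the bit pattern."""
    mask = (1 << bits) - 1
    return min(((code >> k) | (code << (bits - k))) & mask for k in range(bits))

# The same bright spot at two different positions around the centre.
patch_a = [[9, 1, 1], [1, 5, 1], [1, 1, 1]]
patch_b = [[1, 1, 9], [1, 5, 1], [1, 1, 1]]
code_a, code_b = lbp_code(patch_a), lbp_code(patch_b)
```

The two patches yield different basic codes but the same rotation invariant code, which is the property that makes the descriptor insensitive to in-plane rotation of the local texture.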
Abstract: Facial Expression Recognition (FER) has been an interesting area of research wherever there is human-computer interaction. Human psychology, emotions and behaviors can be analyzed in FER. Classifiers used in FER perform well on unoccluded faces but have been found to be limited on occluded faces. Recently, Deep Learning Techniques (DLT) have gained popularity in applications to real-world problems, including the recognition of human emotions. The human face reflects emotional states and human intentions. An expression is the most natural and powerful way of communicating non-verbally. Systems that establish communication between humans and machines are termed Human Machine Interaction (HMI) systems. FER can improve HMI systems because human expressions convey useful information to an observer. This paper proposes a FER scheme called EECNN (Enhanced Convolution Neural Network with Attention mechanism) to recognize seven types of human emotions, with satisfying results in its experiments. The proposed EECNN achieved 89.8% accuracy in classifying the images.
Funding: Supported by FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo), No. 2012/02260-7.
Abstract: AIM: To conduct a systematic literature review of the influence of gender on the recognition of facial expressions of six basic emotions. METHODS: We made a systematic search with the search terms (face OR facial) AND (processing OR recognition OR perception) AND (emotional OR emotion) AND (gender OR sex) in the PubMed, PsycINFO, LILACS, and SciELO electronic databases for articles assessing outcomes related to response accuracy and latency and emotional intensity. Article selection was performed according to parameters set by COCHRANE. The reference lists of the articles found through the database search were checked for additional references of interest. RESULTS: With respect to accuracy, women tend to perform better than men when all emotions are considered as a set. Regarding specific emotions, there seem to be no gender-related differences in the recognition of happiness, whereas results are quite heterogeneous for the remaining emotions, especially sadness, anger, and disgust. Fewer articles dealt with the parameters of response latency and emotional intensity, which hinders the generalization of their findings, especially in the face of their methodological differences. CONCLUSION: The analysis of the studies conducted to date does not allow for definite conclusions concerning the role of the observer's gender in the recognition of facial emotion, mostly because of the absence of standardized methods of investigation.
Abstract: Herein, a three-stage support vector machine (SVM) for facial expression recognition is proposed. The first stage comprises 21 SVMs, one for each binary combination of the seven expressions. If one expression is dominant, the first stage suffices; if two are dominant, the second stage is used; and if three are dominant, the third stage is used. These multilevel stages help reduce the possibility of error as much as possible. Different image preprocessing stages are used to ensure that the features obtained from the detected face contribute meaningfully to the classification stage. Facial expressions are created by muscle movements of the face. These subtle movements are detected by the histogram of oriented gradients (HOG) feature, because it is sensitive to the shapes of objects. The features obtained are then used to train the three-stage SVM. Two validation methods were used: the leave-one-out and K-fold tests. Experimental results on three databases (Japanese Female Facial Expression, Extended Cohn-Kanade Dataset, and Radboud Faces Database) show that the proposed system is competitive and performs better than other works.
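The count of 21 first-stage SVMs follows from pairing seven expressions: C(7, 2) = 21. The sketch below enumerates those pairs and tallies pairwise votes. It rests on several assumptions: the abstract does not list the seven labels, does not define "dominant" (taken here to mean tied for the most votes), and does not describe the later stages; `toy_winner` is a hypothetical stand-in for the trained binary classifiers.

```python
from collections import Counter
from itertools import combinations

# Assumed label set; the abstract only says there are seven expressions.
EXPRESSIONS = ["anger", "disgust", "fear", "happiness",
               "neutral", "sadness", "surprise"]

# One binary classifier per unordered pair of expressions.
PAIRS = list(combinations(EXPRESSIONS, 2))

def dominant_labels(pairwise_winner):
    """Run every pairwise decision and return the labels tied for most votes."""
    votes = Counter(pairwise_winner(a, b) for a, b in PAIRS)
    top = max(votes.values())
    return sorted(label for label, count in votes.items() if count == top)

# Hypothetical decision rule: "happiness" wins every pair it appears in,
# otherwise the first label of the pair wins.
def toy_winner(a, b):
    return "happiness" if "happiness" in (a, b) else a

result = dominant_labels(toy_winner)
print(len(PAIRS), result)
```

Under this reading, a result with a single label would end at the first stage, while a tie between two or three labels would be passed on to a later stage.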
Funding: Supported by the Brain Korea 21 Project in 2010; the MKE (The Ministry of Knowledge Economy), Korea; the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2010-(C1090-1021-0010)).
Abstract: A facial expression recognition system using Adaboost based on the Split Rectangle feature is proposed in this paper. This system provides more varied features, with increased speed and accuracy, than the Haar-like feature of Viola, which is commonly used with the Adaboost training algorithm. The Split Rectangle feature uses a mask-like shape composed of 2 independent rectangles, instead of the mask-like shape of Viola's Haar-like feature, which is composed of 2 to 4 adjacent rectangles. The Split Rectangle feature has fewer diverged operations than the Haar-like feature. It also requires fewer operations because the sum of pixels requires only two rectangles. The Split Rectangle feature provides varied and fast features to Adaboost, which produces a strong classifier with increased accuracy and speed. In the experiment, the system achieved a processing time of 5.92 ms and 84% to 94% accuracy after learning 5 facial expressions (neutral, happiness, sadness, anger and surprise) with Adaboost based on the Split Rectangle feature.
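Rectangle features of this kind are usually evaluated with an integral image, which makes the pixel sum of any rectangle a four-lookup operation. The sketch below is an assumption-laden illustration rather than the authors' feature: it assumes the feature value is the difference between the pixel sums of two independent rectangles, which the abstract does not state explicitly. The function names are hypothetical.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    height, width = len(img), len(img[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        row_sum = 0
        for c in range(width):
            row_sum += img[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table

def rect_sum(table, top, left, height, width):
    """Pixel sum of a rectangle using four lookups in the integral image."""
    bottom, right = top + height, left + width
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])

def two_rect_feature(table, rect_a, rect_b):
    """Difference of the pixel sums of two rectangles that need not touch."""
    return rect_sum(table, *rect_a) - rect_sum(table, *rect_b)

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
table = integral_image(img)
# Rectangles are (top, left, height, width); these two are not adjacent.
value = two_rect_feature(table, (0, 0, 1, 1), (2, 2, 1, 1))
```

Because each rectangle costs four lookups regardless of its size, a two-rectangle feature needs at most eight lookups, which is consistent with the abstract's claim that fewer rectangles mean fewer operations.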
Funding: Supported by the Open Funding Project of National Key Laboratory of Human Factors Engineering (Grant No. 6142222190309).
Abstract: As a key link in human-computer interaction, emotion recognition enables robots to correctly perceive user emotions and to provide dynamic, adjustable services according to the emotional needs of different users, which is key to improving the cognitive level of robot service. Emotion recognition based on facial expression and electrocardiogram has numerous industrial applications. First, a three-dimensional convolutional neural network deep learning architecture is used to extract spatial and temporal features from facial expression video data and electrocardiogram (ECG) data, and emotion classification is carried out. The two modalities are then fused at the data level and at the decision level, respectively, and the emotion recognition results are given. Finally, the emotion recognition results of single-modality and multi-modality approaches are compared and analyzed. The comparative analysis of the single-modality and multi-modality experimental results under the two fusion methods shows that the accuracy of multi-modal emotion recognition is greatly improved compared with that of single-modal emotion recognition, and that decision-level fusion is easier to operate and more effective than data-level fusion.
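Decision-level fusion, as contrasted with data-level fusion in this abstract, combines the outputs of separately trained single-modality classifiers. A minimal sketch follows, assuming a weighted average of class probabilities; the abstract does not state the actual fusion rule, and the probability values and the `weight_a` parameter are illustrative.

```python
def fuse_decisions(prob_a, prob_b, weight_a=0.5):
    """Weighted average of two class-probability vectors.

    Returns the index of the winning class and the fused probabilities.
    """
    fused = [weight_a * a + (1 - weight_a) * b for a, b in zip(prob_a, prob_b)]
    return fused.index(max(fused)), fused

# Illustrative outputs of a facial expression model and an ECG model
# over the same three emotion classes.
face_probs = [0.6, 0.3, 0.1]
ecg_probs = [0.2, 0.7, 0.1]
label, fused = fuse_decisions(face_probs, ecg_probs)
```

Here the face model alone would pick class 0, while the fused decision picks class 1. Each modality keeps its own model and input format, which is one reason decision-level fusion tends to be simpler to operate than merging raw data.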
Abstract: The recent boom in mass media communication (such as social media and mobile devices) has led to more applications of automatic facial expression recognition (FER). Human facial expressions therefore have to be encoded and recognized by digital devices. However, this has to be done under the recurrent problems of image illumination changes and partial occlusions. In this paper, we propose a fully automated FER system based on Local Fourier Coefficients and Facial Fourier Descriptors. The combined power of appearance and geometric features is used to describe the specific facial regions of the eyes-eyebrows, nose and mouth. The whole system is based on the attributes of the Fourier Transform and on Support Vector Machines. Our proposal thereby addresses FER problems such as illumination changes, partial occlusion, image rotation, redundancy and dimensionality reduction. Several tests were performed to demonstrate the efficiency of our proposal, which was evaluated using three standard databases: CK+, MUG and TFEID. In addition, the evaluation results showed that the average recognition rate on each database is higher than that of most of the state-of-the-art techniques surveyed in this paper.