For the problems of complex model structure and too many training parameters in facial expression recognition algorithms,we proposed a residual network structure with a multi-headed channel attention(MCA)module.The mi...For the problems of complex model structure and too many training parameters in facial expression recognition algorithms,we proposed a residual network structure with a multi-headed channel attention(MCA)module.The migration learning algorithm is used to pre-train the convolutional layer parameters and mitigate the overfitting caused by the insufficient number of training samples.The designed MCA module is integrated into the ResNet18 backbone network.The attention mechanism highlights important information and suppresses irrelevant information by assigning different coefficients or weights,and the multi-head structure focuses more on the local features of the pictures,which improves the efficiency of facial expression recognition.Experimental results demonstrate that the model proposed in this paper achieves excellent recognition results in Fer2013,CK+and Jaffe datasets,with accuracy rates of 72.7%,98.8%and 93.33%,respectively.展开更多
In computer vision,emotion recognition using facial expression images is considered an important research issue.Deep learning advances in recent years have aided in attaining improved results in this issue.According t...In computer vision,emotion recognition using facial expression images is considered an important research issue.Deep learning advances in recent years have aided in attaining improved results in this issue.According to recent studies,multiple facial expressions may be included in facial photographs representing a particular type of emotion.It is feasible and useful to convert face photos into collections of visual words and carry out global expression recognition.The main contribution of this paper is to propose a facial expression recognitionmodel(FERM)depending on an optimized Support Vector Machine(SVM).To test the performance of the proposed model(FERM),AffectNet is used.AffectNet uses 1250 emotion-related keywords in six different languages to search three major search engines and get over 1,000,000 facial photos online.The FERM is composed of three main phases:(i)the Data preparation phase,(ii)Applying grid search for optimization,and(iii)the categorization phase.Linear discriminant analysis(LDA)is used to categorize the data into eight labels(neutral,happy,sad,surprised,fear,disgust,angry,and contempt).Due to using LDA,the performance of categorization via SVM has been obviously enhanced.Grid search is used to find the optimal values for hyperparameters of SVM(C and gamma).The proposed optimized SVM algorithm has achieved an accuracy of 99%and a 98%F1 score.展开更多
The video-oriented facial expression recognition has always been an important issue in emotion perception.At present,the key challenge in most existing methods is how to effectively extract robust features to characte...The video-oriented facial expression recognition has always been an important issue in emotion perception.At present,the key challenge in most existing methods is how to effectively extract robust features to characterize facial appearance and geometry changes caused by facial motions.On this basis,the video in this paper is divided into multiple segments,each of which is simultaneously described by optical flow and facial landmark trajectory.To deeply delve the emotional information of these two representations,we propose a Deep Spatiotemporal Network with Dual-flow Fusion(defined as DSN-DF),which highlights the region and strength of expressions by spatiotemporal appearance features and the speed of change by spatiotemporal geometry features.Finally,experiments are implemented on CKþand MMI datasets to demonstrate the superiority of the proposed method.展开更多
Facial expression recognition(FER)remains a hot research area among computer vision researchers and still becomes a challenge because of high intraclass variations.Conventional techniques for this problem depend on ha...Facial expression recognition(FER)remains a hot research area among computer vision researchers and still becomes a challenge because of high intraclass variations.Conventional techniques for this problem depend on hand-crafted features,namely,LBP,SIFT,and HOG,along with that a classifier trained on a database of videos or images.Many execute perform well on image datasets captured in a controlled condition;however not perform well in the more challenging dataset,which has partial faces and image variation.Recently,many studies presented an endwise structure for facial expression recognition by utilizing DL methods.Therefore,this study develops an earthworm optimization with an improved SqueezeNet-based FER(EWOISN-FER)model.The presented EWOISN-FER model primarily applies the contrast-limited adaptive histogram equalization(CLAHE)technique as a pre-processing step.In addition,the improved SqueezeNet model is exploited to derive an optimal set of feature vectors,and the hyperparameter tuning process is performed by the stochastic gradient boosting(SGB)model.Finally,EWO with sparse autoencoder(SAE)is employed for the FER process,and the EWO algorithm appropriately chooses the SAE parameters.Awide-ranging experimental analysis is carried out to examine the performance of the proposed model.The experimental outcomes indicate the supremacy of the presented EWOISN-FER technique.展开更多
Facial Expression Recognition(FER)has been an importantfield of research for several decades.Extraction of emotional characteristics is crucial to FERs,but is complex to process as they have significant intra-class va...Facial Expression Recognition(FER)has been an importantfield of research for several decades.Extraction of emotional characteristics is crucial to FERs,but is complex to process as they have significant intra-class variances.Facial characteristics have not been completely explored in static pictures.Previous studies used Convolution Neural Networks(CNNs)based on transfer learning and hyperparameter optimizations for static facial emotional recognitions.Particle Swarm Optimizations(PSOs)have also been used for tuning hyperparameters.However,these methods achieve about 92 percent in terms of accuracy.The existing algorithms have issues with FER accuracy and precision.Hence,the overall FER performance is degraded significantly.To address this issue,this work proposes a combination of CNNs and Long Short-Term Memories(LSTMs)called the HCNN-LSTMs(Hybrid CNNs and LSTMs)approach for FERs.The work is evaluated on the benchmark dataset,Facial Expression Recog Image Ver(FERC).Viola-Jones(VJ)algorithms recognize faces from preprocessed images followed by HCNN-LSTMs feature extractions and FER classifications.Further,the success rate of Deep Learning Techniques(DLTs)has increased with hyperparameter tunings like epochs,batch sizes,initial learning rates,regularization parameters,shuffling types,and momentum.This proposed work uses Improved Weight based Whale Optimization Algorithms(IWWOAs)to select near-optimal settings for these parameters using bestfitness values.The experi-mentalfindings demonstrated that the proposed HCNN-LSTMs system outper-forms the existing methods.展开更多
Analyzing human facial expressions using machine vision systems is indeed a challenging yet fascinating problem in the field of computer vision and artificial intelligence. Facial expressions are a primary means throu...Analyzing human facial expressions using machine vision systems is indeed a challenging yet fascinating problem in the field of computer vision and artificial intelligence. Facial expressions are a primary means through which humans convey emotions, making their automated recognition valuable for various applications including man-computer interaction, affective computing, and psychological research. Pre-processing techniques are applied to every image with the aim of standardizing the images. Frequently used techniques include scaling, blurring, rotating, altering the contour of the image, changing the color to grayscale and normalization. Followed by feature extraction and then the traditional classifiers are applied to infer facial expressions. Increasing the performance of the system is difficult in the typical machine learning approach because feature extraction and classification phases are separate. But in Deep Neural Networks (DNN), the two phases are combined into a single phase. Therefore, the Convolutional Neural Network (CNN) models give better accuracy in Facial Expression Recognition than the traditional classifiers. But still the performance of CNN is hampered by noisy and deviated images in the dataset. This work utilized the preprocessing methods such as resizing, gray-scale conversion and normalization. Also, this research work is motivated by these drawbacks to study the use of image pre-processing techniques to enhance the performance of deep learning methods to implement facial expression recognition. Also, this research aims to recognize emotions using deep learning and show the influences of data pre-processing for further processing of images. The accuracy of each pre-processing methods is compared, then combination between them is analysed and the appropriate preprocessing techniques are identified and implemented to see the variability of accuracies in predicting facial expressions. .展开更多
A novel fuzzy linear discriminant analysis method by the canonical correlation analysis (fuzzy-LDA/CCA)is presented and applied to the facial expression recognition. The fuzzy method is used to evaluate the degree o...A novel fuzzy linear discriminant analysis method by the canonical correlation analysis (fuzzy-LDA/CCA)is presented and applied to the facial expression recognition. The fuzzy method is used to evaluate the degree of the class membership to which each training sample belongs. CCA is then used to establish the relationship between each facial image and the corresponding class membership vector, and the class membership vector of a test image is estimated using this relationship. Moreover, the fuzzy-LDA/CCA method is also generalized to deal with nonlinear discriminant analysis problems via kernel method. The performance of the proposed method is demonstrated using real data.展开更多
Facial expression recognition is a hot topic in computer vision, but it remains challenging due to the feature inconsistency caused by person-specific 'characteristics of facial expressions. To address such a chal...Facial expression recognition is a hot topic in computer vision, but it remains challenging due to the feature inconsistency caused by person-specific 'characteristics of facial expressions. To address such a challenge, and inspired by the recent success of deep identity network (DeepID-Net) for face identification, this paper proposes a novel deep learning based framework for recognising human expressions with facial images. Compared to the existing deep learning methods, our proposed framework, which is based on multi-scale global images and local facial patches, can significantly achieve a better performance on facial expression recognition. Finally, we verify the effectiveness of our proposed framework through experiments on the public benchmarking datasets JAFFE and extended Cohn-Kanade (CK+).展开更多
Functional magnetic resonance imaging was used during emotion recognition to identify changes in functional brain activation in 21 first-episode, treatment-naive major depressive disorder patients before and after ant...Functional magnetic resonance imaging was used during emotion recognition to identify changes in functional brain activation in 21 first-episode, treatment-naive major depressive disorder patients before and after antidepressant treatment. Following escitalopram oxalate treatment, patients exhibited decreased activation in bilateral precentral gyrus, bilateral middle frontal gyrus, left middle temporal gyrus, bilateral postcentral gyrus, left cingulate and right parahippocampal gyrus, and increased activation in right superior frontal gyrus, bilateral superior parietal Iobule and left occipital gyrus during sad facial expression recognition. After antidepressant treatment, patients also exhibited decreased activation in the bilateral middle frontal gyrus, bilateral cingulate and right parahippocampal gyrus, and increased activation in the right inferior frontal gyrus, left fusiform gyrus and right precuneus during happy facial expression recognition. Our experimental findings indicate that the limbic-cortical network might be a key target region for antidepressant treatment in major depressive disorder.展开更多
In this paper, a novel method based on dual-tree complex wavelet transform(DT-CWT) and rotation invariant local binary pattern(LBP) for facial expression recognition is proposed. The quarter sample shift (Q-shift) DT-...In this paper, a novel method based on dual-tree complex wavelet transform(DT-CWT) and rotation invariant local binary pattern(LBP) for facial expression recognition is proposed. The quarter sample shift (Q-shift) DT-CWT can provide a group delay of 1/4 of a sample period, and satisfy the usual 2-band filter bank constraints of no aliasing and perfect reconstruction. To resolve illumination variation in expression verification, low-frequency coefficients produced by DT-CWT are set zeroes, high-frequency coefficients are used for reconstructing the image, and basic LBP histogram is mapped on the reconstructed image by means of histogram specification. LBP is capable of encoding texture and shape information of the preprocessed images. The histogram graphs built from multi-scale rotation invariant LBPs are combined to serve as feature for further recognition. Template matching is adopted to classify facial expressions for its simplicity. The experimental results show that the proposed approach has good performance in efficiency and accuracy.展开更多
Facial Expression Recognition(FER)has been an interesting area of research in places where there is human-computer interaction.Human psychol-ogy,emotions and behaviors can be analyzed in FER.Classifiers used in FER hav...Facial Expression Recognition(FER)has been an interesting area of research in places where there is human-computer interaction.Human psychol-ogy,emotions and behaviors can be analyzed in FER.Classifiers used in FER have been perfect on normal faces but have been found to be constrained in occluded faces.Recently,Deep Learning Techniques(DLT)have gained popular-ity in applications of real-world problems including recognition of human emo-tions.The human face reflects emotional states and human intentions.An expression is the most natural and powerful way of communicating non-verbally.Systems which form communications between the two are termed Human Machine Interaction(HMI)systems.FER can improve HMI systems as human expressions convey useful information to an observer.This paper proposes a FER scheme called EECNN(Enhanced Convolution Neural Network with Atten-tion mechanism)to recognize seven types of human emotions with satisfying results in its experiments.Proposed EECNN achieved 89.8%accuracy in classi-fying the images.展开更多
Herein,a three-stage support vector machine(SVM)for facial expression recognition is proposed.The first stage comprises 21 SVMs,which are all the binary combinations of seven expressions.If one expression is dominant,...Herein,a three-stage support vector machine(SVM)for facial expression recognition is proposed.The first stage comprises 21 SVMs,which are all the binary combinations of seven expressions.If one expression is dominant,then the first stage will suffice;if two are dominant,then the second stage is used;and,if three are dominant,the third stage is used.These multilevel stages help reduce the possibility of experiencing an error as much as possible.Different image preprocessing stages are used to ensure that the features attained from the face detected have a meaningful and proper contribution to the classification stage.Facial expressions are created as a result of muscle movements on the face.These subtle movements are detected by the histogram-oriented gradient feature,because it is sensitive to the shapes of objects.The features attained are then used to train the three-stage SVM.Two different validation methods were used:the leave-one-out and K-fold tests.Experimental results on three databases(Japanese Female Facial Expression,Extended Cohn-Kanade Dataset,and Radboud Faces Database)show that the proposed system is competitive and has better performance compared with other works.展开更多
The facial expression recognition systn using the Ariaboost based on the Split Rectangle feature is proposed in this paper. This system provides more various featmes in increasing speed and accuracy than the Haarolike...The facial expression recognition systn using the Ariaboost based on the Split Rectangle feature is proposed in this paper. This system provides more various featmes in increasing speed and accuracy than the Haarolike featrue of Viola, which is commonly used for the Adaboost training algorithm. The Split Rectangle feature uses the nmsk-like shape composed with 2 independent rectangles, instead of using mask-like shape of Haar-like feature, which is composed of 2 --4 adhered rectangles of Viola. Split Rectangle feature has less di- verged operation than the Haar-like feaze. It also requires less oper- ation because the stun of pixels requires ordy two rectangles. Split Rectangle feature provides various and fast features to the Adaboost, which produrces the strong classifier with increased accuracy and speed. In the experiment, the system had 5.92 ms performance speed and 84 %--94 % accuracy by leaming 5 facial expressions, neutral, happiness, sadness, anger and surprise with the use of the Adaboost based on the Split Rectangle feature.展开更多
A new algorithm taking the spatial context of local features into account by utilizing contextualized histograms was proposed to recognize facial expression. The contextualized histograms were extracted fromtwo widely...A new algorithm taking the spatial context of local features into account by utilizing contextualized histograms was proposed to recognize facial expression. The contextualized histograms were extracted fromtwo widely used descriptors—the local binary pattern( LBP) and weber local descriptor( WLD). The LBP and WLD feature histograms were extracted separately fromeach facial image,and contextualized histogram was generated as feature vectors to feed the classifier. In addition,the human face was divided into sub-blocks and each sub-block was assigned different weights by their different contributions to the intensity of facial expressions to improve the recognition rate. With the support vector machine(SVM) as classifier,the experimental results on the 2D texture images fromthe 3D-BU FE dataset indicated that contextualized histograms improved facial expression recognition performance when local features were employed.展开更多
It is unknown if the ability of Portuguese in the identification of NimStim data set,which was created in America to provide facial expressions that could be recognized by untrained people,is(or not)similar to the Ame...It is unknown if the ability of Portuguese in the identification of NimStim data set,which was created in America to provide facial expressions that could be recognized by untrained people,is(or not)similar to the Americans.To test this hypothesis the performance of Portuguese in the recognition of Happiness,Surprise,Sadness,Fear,Disgust and Anger NimStim facial expressions was compared with the Americans,but no significant differences were found.In both populations the easiest emotion to identify was Happiness while Fear was the most difficult one.However,with exception for Surprise,Portuguese tend to show a lower accuracy rate for all the emotions studied.Results highlighted some cultural differences.展开更多
In expression recognition, feature representation is critical for successful recognition since it contains distinctive information of expressions. In this paper, a new approach for representing facial expression featu...In expression recognition, feature representation is critical for successful recognition since it contains distinctive information of expressions. In this paper, a new approach for representing facial expression features is proposed with its objective to describe features in an effective and efficient way in order to improve the recognition performance. The method combines the facial action coding system(FACS) and 'uniform' local binary patterns(LBP) to represent facial expression features from coarse to fine. The facial feature regions are extracted by active shape models(ASM) based on FACS to obtain the gray-level texture. Then, LBP is used to represent expression features for enhancing the discriminant. A facial expression recognition system is developed based on this feature extraction method by using K nearest neighborhood(K-NN) classifier to recognize facial expressions. Finally, experiments are carried out to evaluate this feature extraction method. The significance of removing the unrelated facial regions and enhancing the discrimination ability of expression features in the recognition process is indicated by the results, in addition to its convenience.展开更多
To overcome the disadvantage of classical recognition model that cannot perform well enough when there are some noises or lost frames in expression image sequences, a novel model called fuzzy buried Markov model (FBM...To overcome the disadvantage of classical recognition model that cannot perform well enough when there are some noises or lost frames in expression image sequences, a novel model called fuzzy buried Markov model (FBMM) is presented in this paper. FBMM relaxes conditional independence assumptions for classical hidden Markov model (HMM) by adding the specific cross-observation dependencies between observation elements. Compared with buried Markov model (BMM), FBMM utilizes cloud distribution to replace probability distribution to describe state transition and observation symbol generation and adopts maximum mutual information (MMI) method to replace maximum likelihood (ML) method to estimate parameters. Theoretical justifications and experimental results verify higher recognition rate and stronger robustness of facial expression recognition for image sequences based on FBMM than those of HMM and BMM.展开更多
Automatic facial expression recognition (FER) from non-frontal views is a challenging research topic which has recently started to attract the attention of the research community. Pose variations are difficult to ta...Automatic facial expression recognition (FER) from non-frontal views is a challenging research topic which has recently started to attract the attention of the research community. Pose variations are difficult to tackle and many face analysis methods require the use of sophisticated nor- malization and initialization procedures. Thus head-pose in- variant facial expression recognition continues to be an is- sue to traditional methods. In this paper, we propose a novel approach for pose-invariant FER based on pose-robust fea- tures which are learned by deep learning methods -- prin- cipal component analysis network (PCANet) and convolu- tional neural networks (CNN) (PRP-CNN). In the first stage, unlabeled frontal face images are used to learn features by PCANet. The features, in the second stage, are used as the tar- get of CNN to learn a feature mapping between frontal faces and non-frontal faces. We then describe the non-frontal face images using the novel descriptions generated by the maps, and get unified descriptors for arbitrary face images. Finally, the pose-robust features are used to train a single classifier for FER instead of training multiple models for each spe- cific pose. Our method, on the whole, does not require pose/ landmark annotation and can recognize facial expression in a wide range of orientations. Extensive experiments on two public databases show that our framework yields dramatic improvements in facial expression analysis.展开更多
Facial expression recognition(FER) in video has attracted the increasing interest and many approaches have been made.The crucial problem of classifying a given video sequence into several basic emotions is how to fuse...Facial expression recognition(FER) in video has attracted the increasing interest and many approaches have been made.The crucial problem of classifying a given video sequence into several basic emotions is how to fuse facial features of individual frames.In this paper, a frame-level attention module is integrated into an improved VGG-based frame work and a lightweight facial expression recognition method is proposed.The proposed network takes a sub video cut from an experimental video sequence as its input and generates a fixed-dimension representation.The VGG-based network with an enhanced branch embeds face images into feature vectors.The frame-level attention module learns weights which are used to adaptively aggregate the feature vectors to form a single discriminative video representation.Finally, a regression module outputs the classification results.The experimental results on CK+and AFEW databases show that the recognition rates of the proposed method can achieve the state-of-the-art performance.展开更多
Most present research into facial expression recognition focuses on the visible spectrum, which is sen- sitive to illumination change. In this paper, we focus on in- tegrating thermal infrared data with visible spectr...Most present research into facial expression recognition focuses on the visible spectrum, which is sen- sitive to illumination change. In this paper, we focus on in- tegrating thermal infrared data with visible spectrum images for spontaneous facial expression recognition. First, the ac- tive appearance model AAM parameters and three defined head motion features are extracted from visible spectrum im- ages, and several thermal statistical features are extracted from infrared (IR) images. Second, feature selection is per- formed using the F-test statistic. Third, Bayesian networks BNs and support vector machines SVMs are proposed for both decision-level and feature-level fusion. Experiments on the natural visible and infrared facial expression (NVIE) spontaneous database show the effectiveness of the proposed methods, and demonstrate thermal 1R images' supplementary role for visible facial expression recognition.展开更多
基金funded by Anhui Province Quality Engineering Project No.2021jyxm0801Natural Science Foundation of Anhui University of Chinese Medicine under Grant Nos.2020zrzd18,2019zrzd11+1 种基金Humanity Social Science foundation Grants 2021rwzd20,2020rwzd07Anhui University of Chinese Medicine Quality Engineering Projects No.2021zlgc046.
文摘For the problems of complex model structure and too many training parameters in facial expression recognition algorithms,we proposed a residual network structure with a multi-headed channel attention(MCA)module.The migration learning algorithm is used to pre-train the convolutional layer parameters and mitigate the overfitting caused by the insufficient number of training samples.The designed MCA module is integrated into the ResNet18 backbone network.The attention mechanism highlights important information and suppresses irrelevant information by assigning different coefficients or weights,and the multi-head structure focuses more on the local features of the pictures,which improves the efficiency of facial expression recognition.Experimental results demonstrate that the model proposed in this paper achieves excellent recognition results in Fer2013,CK+and Jaffe datasets,with accuracy rates of 72.7%,98.8%and 93.33%,respectively.
文摘In computer vision,emotion recognition using facial expression images is considered an important research issue.Deep learning advances in recent years have aided in attaining improved results in this issue.According to recent studies,multiple facial expressions may be included in facial photographs representing a particular type of emotion.It is feasible and useful to convert face photos into collections of visual words and carry out global expression recognition.The main contribution of this paper is to propose a facial expression recognitionmodel(FERM)depending on an optimized Support Vector Machine(SVM).To test the performance of the proposed model(FERM),AffectNet is used.AffectNet uses 1250 emotion-related keywords in six different languages to search three major search engines and get over 1,000,000 facial photos online.The FERM is composed of three main phases:(i)the Data preparation phase,(ii)Applying grid search for optimization,and(iii)the categorization phase.Linear discriminant analysis(LDA)is used to categorize the data into eight labels(neutral,happy,sad,surprised,fear,disgust,angry,and contempt).Due to using LDA,the performance of categorization via SVM has been obviously enhanced.Grid search is used to find the optimal values for hyperparameters of SVM(C and gamma).The proposed optimized SVM algorithm has achieved an accuracy of 99%and a 98%F1 score.
基金This work is supported by Natural Science Foundation of China(Grant No.61903056)Major Project of Science and Technology Research Program of Chongqing Education Commission of China(Grant No.KJZDM201900601)+3 种基金Chongqing Research Program of Basic Research and Frontier Technology(Grant Nos.cstc2019jcyj-msxmX0681,cstc2021jcyj-msxmX0530,and cstc2021jcyjmsxmX0761)Project Supported by Chongqing Municipal Key Laboratory of Institutions of Higher Education(Grant No.cqupt-mct-201901)Project Supported by Chongqing Key Laboratory of Mobile Communications Technology(Grant No.cqupt-mct-202002)Project Supported by Engineering Research Center of Mobile Communications,Ministry of Education(Grant No.cqupt-mct202006)。
文摘The video-oriented facial expression recognition has always been an important issue in emotion perception.At present,the key challenge in most existing methods is how to effectively extract robust features to characterize facial appearance and geometry changes caused by facial motions.On this basis,the video in this paper is divided into multiple segments,each of which is simultaneously described by optical flow and facial landmark trajectory.To deeply delve the emotional information of these two representations,we propose a Deep Spatiotemporal Network with Dual-flow Fusion(defined as DSN-DF),which highlights the region and strength of expressions by spatiotemporal appearance features and the speed of change by spatiotemporal geometry features.Finally,experiments are implemented on CKþand MMI datasets to demonstrate the superiority of the proposed method.
文摘Facial expression recognition(FER)remains a hot research area among computer vision researchers and still becomes a challenge because of high intraclass variations.Conventional techniques for this problem depend on hand-crafted features,namely,LBP,SIFT,and HOG,along with that a classifier trained on a database of videos or images.Many execute perform well on image datasets captured in a controlled condition;however not perform well in the more challenging dataset,which has partial faces and image variation.Recently,many studies presented an endwise structure for facial expression recognition by utilizing DL methods.Therefore,this study develops an earthworm optimization with an improved SqueezeNet-based FER(EWOISN-FER)model.The presented EWOISN-FER model primarily applies the contrast-limited adaptive histogram equalization(CLAHE)technique as a pre-processing step.In addition,the improved SqueezeNet model is exploited to derive an optimal set of feature vectors,and the hyperparameter tuning process is performed by the stochastic gradient boosting(SGB)model.Finally,EWO with sparse autoencoder(SAE)is employed for the FER process,and the EWO algorithm appropriately chooses the SAE parameters.Awide-ranging experimental analysis is carried out to examine the performance of the proposed model.The experimental outcomes indicate the supremacy of the presented EWOISN-FER technique.
文摘Facial Expression Recognition(FER)has been an importantfield of research for several decades.Extraction of emotional characteristics is crucial to FERs,but is complex to process as they have significant intra-class variances.Facial characteristics have not been completely explored in static pictures.Previous studies used Convolution Neural Networks(CNNs)based on transfer learning and hyperparameter optimizations for static facial emotional recognitions.Particle Swarm Optimizations(PSOs)have also been used for tuning hyperparameters.However,these methods achieve about 92 percent in terms of accuracy.The existing algorithms have issues with FER accuracy and precision.Hence,the overall FER performance is degraded significantly.To address this issue,this work proposes a combination of CNNs and Long Short-Term Memories(LSTMs)called the HCNN-LSTMs(Hybrid CNNs and LSTMs)approach for FERs.The work is evaluated on the benchmark dataset,Facial Expression Recog Image Ver(FERC).Viola-Jones(VJ)algorithms recognize faces from preprocessed images followed by HCNN-LSTMs feature extractions and FER classifications.Further,the success rate of Deep Learning Techniques(DLTs)has increased with hyperparameter tunings like epochs,batch sizes,initial learning rates,regularization parameters,shuffling types,and momentum.This proposed work uses Improved Weight based Whale Optimization Algorithms(IWWOAs)to select near-optimal settings for these parameters using bestfitness values.The experi-mentalfindings demonstrated that the proposed HCNN-LSTMs system outper-forms the existing methods.
文摘Analyzing human facial expressions using machine vision systems is indeed a challenging yet fascinating problem in the field of computer vision and artificial intelligence. Facial expressions are a primary means through which humans convey emotions, making their automated recognition valuable for various applications including man-computer interaction, affective computing, and psychological research. Pre-processing techniques are applied to every image with the aim of standardizing the images. Frequently used techniques include scaling, blurring, rotating, altering the contour of the image, changing the color to grayscale and normalization. Followed by feature extraction and then the traditional classifiers are applied to infer facial expressions. Increasing the performance of the system is difficult in the typical machine learning approach because feature extraction and classification phases are separate. But in Deep Neural Networks (DNN), the two phases are combined into a single phase. Therefore, the Convolutional Neural Network (CNN) models give better accuracy in Facial Expression Recognition than the traditional classifiers. But still the performance of CNN is hampered by noisy and deviated images in the dataset. This work utilized the preprocessing methods such as resizing, gray-scale conversion and normalization. Also, this research work is motivated by these drawbacks to study the use of image pre-processing techniques to enhance the performance of deep learning methods to implement facial expression recognition. Also, this research aims to recognize emotions using deep learning and show the influences of data pre-processing for further processing of images. The accuracy of each pre-processing methods is compared, then combination between them is analysed and the appropriate preprocessing techniques are identified and implemented to see the variability of accuracies in predicting facial expressions. .
基金The National Natural Science Foundation of China (No.60503023,60872160)the Natural Science Foundation for Universities ofJiangsu Province (No.08KJD520009)the Intramural Research Foundationof Nanjing University of Information Science and Technology(No.Y603)
文摘A novel fuzzy linear discriminant analysis method by the canonical correlation analysis (fuzzy-LDA/CCA)is presented and applied to the facial expression recognition. The fuzzy method is used to evaluate the degree of the class membership to which each training sample belongs. CCA is then used to establish the relationship between each facial image and the corresponding class membership vector, and the class membership vector of a test image is estimated using this relationship. Moreover, the fuzzy-LDA/CCA method is also generalized to deal with nonlinear discriminant analysis problems via kernel method. The performance of the proposed method is demonstrated using real data.
基金supported by the Academy of Finland(267581)the D2I SHOK Project from Digile Oy as well as Nokia Technologies(Tampere,Finland)
文摘Facial expression recognition is a hot topic in computer vision, but it remains challenging due to the feature inconsistency caused by person-specific 'characteristics of facial expressions. To address such a challenge, and inspired by the recent success of deep identity network (DeepID-Net) for face identification, this paper proposes a novel deep learning based framework for recognising human expressions with facial images. Compared to the existing deep learning methods, our proposed framework, which is based on multi-scale global images and local facial patches, can significantly achieve a better performance on facial expression recognition. Finally, we verify the effectiveness of our proposed framework through experiments on the public benchmarking datasets JAFFE and extended Cohn-Kanade (CK+).
基金supported by research grants from the National Natural Science Foundation of China (No. 81071099)the Liaoning Science and Technology Foundation (No. 2008225010-14)Doctoral Foundation of the First Affiliated Hospital in China Medical University (No. 2010)
文摘Functional magnetic resonance imaging was used during emotion recognition to identify changes in functional brain activation in 21 first-episode, treatment-naive major depressive disorder patients before and after antidepressant treatment. Following escitalopram oxalate treatment, patients exhibited decreased activation in bilateral precentral gyrus, bilateral middle frontal gyrus, left middle temporal gyrus, bilateral postcentral gyrus, left cingulate and right parahippocampal gyrus, and increased activation in right superior frontal gyrus, bilateral superior parietal Iobule and left occipital gyrus during sad facial expression recognition. After antidepressant treatment, patients also exhibited decreased activation in the bilateral middle frontal gyrus, bilateral cingulate and right parahippocampal gyrus, and increased activation in the right inferior frontal gyrus, left fusiform gyrus and right precuneus during happy facial expression recognition. Our experimental findings indicate that the limbic-cortical network might be a key target region for antidepressant treatment in major depressive disorder.
文摘In this paper, a novel method based on dual-tree complex wavelet transform(DT-CWT) and rotation invariant local binary pattern(LBP) for facial expression recognition is proposed. The quarter sample shift (Q-shift) DT-CWT can provide a group delay of 1/4 of a sample period, and satisfy the usual 2-band filter bank constraints of no aliasing and perfect reconstruction. To resolve illumination variation in expression verification, low-frequency coefficients produced by DT-CWT are set zeroes, high-frequency coefficients are used for reconstructing the image, and basic LBP histogram is mapped on the reconstructed image by means of histogram specification. LBP is capable of encoding texture and shape information of the preprocessed images. The histogram graphs built from multi-scale rotation invariant LBPs are combined to serve as feature for further recognition. Template matching is adopted to classify facial expressions for its simplicity. The experimental results show that the proposed approach has good performance in efficiency and accuracy.
文摘Facial Expression Recognition(FER)has been an interesting area of research in places where there is human-computer interaction.Human psychol-ogy,emotions and behaviors can be analyzed in FER.Classifiers used in FER have been perfect on normal faces but have been found to be constrained in occluded faces.Recently,Deep Learning Techniques(DLT)have gained popular-ity in applications of real-world problems including recognition of human emo-tions.The human face reflects emotional states and human intentions.An expression is the most natural and powerful way of communicating non-verbally.Systems which form communications between the two are termed Human Machine Interaction(HMI)systems.FER can improve HMI systems as human expressions convey useful information to an observer.This paper proposes a FER scheme called EECNN(Enhanced Convolution Neural Network with Atten-tion mechanism)to recognize seven types of human emotions with satisfying results in its experiments.Proposed EECNN achieved 89.8%accuracy in classi-fying the images.
文摘Herein,a three-stage support vector machine(SVM)for facial expression recognition is proposed.The first stage comprises 21 SVMs,which are all the binary combinations of seven expressions.If one expression is dominant,then the first stage will suffice;if two are dominant,then the second stage is used;and,if three are dominant,the third stage is used.These multilevel stages help reduce the possibility of experiencing an error as much as possible.Different image preprocessing stages are used to ensure that the features attained from the face detected have a meaningful and proper contribution to the classification stage.Facial expressions are created as a result of muscle movements on the face.These subtle movements are detected by the histogram-oriented gradient feature,because it is sensitive to the shapes of objects.The features attained are then used to train the three-stage SVM.Two different validation methods were used:the leave-one-out and K-fold tests.Experimental results on three databases(Japanese Female Facial Expression,Extended Cohn-Kanade Dataset,and Radboud Faces Database)show that the proposed system is competitive and has better performance compared with other works.
基金supported by the Brain Korea 21 Project in2010,the MKE(The Ministry of Knowledge Economy),Koreathe ITRC(Information Technology Research Center)support programsupervised by the NIPA(National ITIndustry Promotion Agency)(NI-PA-2010-(C1090-1021-0010))
文摘The facial expression recognition systn using the Ariaboost based on the Split Rectangle feature is proposed in this paper. This system provides more various featmes in increasing speed and accuracy than the Haarolike featrue of Viola, which is commonly used for the Adaboost training algorithm. The Split Rectangle feature uses the nmsk-like shape composed with 2 independent rectangles, instead of using mask-like shape of Haar-like feature, which is composed of 2 --4 adhered rectangles of Viola. Split Rectangle feature has less di- verged operation than the Haar-like feaze. It also requires less oper- ation because the stun of pixels requires ordy two rectangles. Split Rectangle feature provides various and fast features to the Adaboost, which produrces the strong classifier with increased accuracy and speed. In the experiment, the system had 5.92 ms performance speed and 84 %--94 % accuracy by leaming 5 facial expressions, neutral, happiness, sadness, anger and surprise with the use of the Adaboost based on the Split Rectangle feature.
基金Supported by the National Natural Science Foundation of China(60772066)
文摘A new algorithm taking the spatial context of local features into account by utilizing contextualized histograms was proposed to recognize facial expression. The contextualized histograms were extracted fromtwo widely used descriptors—the local binary pattern( LBP) and weber local descriptor( WLD). The LBP and WLD feature histograms were extracted separately fromeach facial image,and contextualized histogram was generated as feature vectors to feed the classifier. In addition,the human face was divided into sub-blocks and each sub-block was assigned different weights by their different contributions to the intensity of facial expressions to improve the recognition rate. With the support vector machine(SVM) as classifier,the experimental results on the 2D texture images fromthe 3D-BU FE dataset indicated that contextualized histograms improved facial expression recognition performance when local features were employed.
文摘It is unknown if the ability of Portuguese in the identification of NimStim data set,which was created in America to provide facial expressions that could be recognized by untrained people,is(or not)similar to the Americans.To test this hypothesis the performance of Portuguese in the recognition of Happiness,Surprise,Sadness,Fear,Disgust and Anger NimStim facial expressions was compared with the Americans,but no significant differences were found.In both populations the easiest emotion to identify was Happiness while Fear was the most difficult one.However,with exception for Surprise,Portuguese tend to show a lower accuracy rate for all the emotions studied.Results highlighted some cultural differences.
基金supported by National Natural Science Foundation of China(No.61273339)
文摘In expression recognition, feature representation is critical for successful recognition since it contains distinctive information of expressions. In this paper, a new approach for representing facial expression features is proposed with its objective to describe features in an effective and efficient way in order to improve the recognition performance. The method combines the facial action coding system(FACS) and 'uniform' local binary patterns(LBP) to represent facial expression features from coarse to fine. The facial feature regions are extracted by active shape models(ASM) based on FACS to obtain the gray-level texture. Then, LBP is used to represent expression features for enhancing the discriminant. A facial expression recognition system is developed based on this feature extraction method by using K nearest neighborhood(K-NN) classifier to recognize facial expressions. Finally, experiments are carried out to evaluate this feature extraction method. The significance of removing the unrelated facial regions and enhancing the discrimination ability of expression features in the recognition process is indicated by the results, in addition to its convenience.
基金supported by the National Natural Science Foundation of China under Grant No. 60673190
文摘To overcome the disadvantage of classical recognition model that cannot perform well enough when there are some noises or lost frames in expression image sequences, a novel model called fuzzy buried Markov model (FBMM) is presented in this paper. FBMM relaxes conditional independence assumptions for classical hidden Markov model (HMM) by adding the specific cross-observation dependencies between observation elements. Compared with buried Markov model (BMM), FBMM utilizes cloud distribution to replace probability distribution to describe state transition and observation symbol generation and adopts maximum mutual information (MMI) method to replace maximum likelihood (ML) method to estimate parameters. Theoretical justifications and experimental results verify higher recognition rate and stronger robustness of facial expression recognition for image sequences based on FBMM than those of HMM and BMM.
文摘Automatic facial expression recognition (FER) from non-frontal views is a challenging research topic which has recently started to attract the attention of the research community. Pose variations are difficult to tackle and many face analysis methods require the use of sophisticated nor- malization and initialization procedures. Thus head-pose in- variant facial expression recognition continues to be an is- sue to traditional methods. In this paper, we propose a novel approach for pose-invariant FER based on pose-robust fea- tures which are learned by deep learning methods -- prin- cipal component analysis network (PCANet) and convolu- tional neural networks (CNN) (PRP-CNN). In the first stage, unlabeled frontal face images are used to learn features by PCANet. The features, in the second stage, are used as the tar- get of CNN to learn a feature mapping between frontal faces and non-frontal faces. We then describe the non-frontal face images using the novel descriptions generated by the maps, and get unified descriptors for arbitrary face images. Finally, the pose-robust features are used to train a single classifier for FER instead of training multiple models for each spe- cific pose. Our method, on the whole, does not require pose/ landmark annotation and can recognize facial expression in a wide range of orientations. Extensive experiments on two public databases show that our framework yields dramatic improvements in facial expression analysis.
基金Supported by the Future Network Scientific Research Fund Project of Jiangsu Province (No. FNSRFP2021YB26)the Jiangsu Key R&D Fund on Social Development (No. BE2022789)the Science Foundation of Nanjing Institute of Technology (No. ZKJ202003)。
文摘Facial expression recognition(FER) in video has attracted the increasing interest and many approaches have been made.The crucial problem of classifying a given video sequence into several basic emotions is how to fuse facial features of individual frames.In this paper, a frame-level attention module is integrated into an improved VGG-based frame work and a lightweight facial expression recognition method is proposed.The proposed network takes a sub video cut from an experimental video sequence as its input and generates a fixed-dimension representation.The VGG-based network with an enhanced branch embeds face images into feature vectors.The frame-level attention module learns weights which are used to adaptively aggregate the feature vectors to form a single discriminative video representation.Finally, a regression module outputs the classification results.The experimental results on CK+and AFEW databases show that the recognition rates of the proposed method can achieve the state-of-the-art performance.
文摘Most present research into facial expression recognition focuses on the visible spectrum, which is sen- sitive to illumination change. In this paper, we focus on in- tegrating thermal infrared data with visible spectrum images for spontaneous facial expression recognition. First, the ac- tive appearance model AAM parameters and three defined head motion features are extracted from visible spectrum im- ages, and several thermal statistical features are extracted from infrared (IR) images. Second, feature selection is per- formed using the F-test statistic. Third, Bayesian networks BNs and support vector machines SVMs are proposed for both decision-level and feature-level fusion. Experiments on the natural visible and infrared facial expression (NVIE) spontaneous database show the effectiveness of the proposed methods, and demonstrate thermal 1R images' supplementary role for visible facial expression recognition.