Pulse rate is an important characteristic in traditional Chinese medicine pulse diagnosis, and it is of great significance for determining the cold or heat nature of diseases. Predicting pulse rate from facial video is a promising research field because it obtains palpation information through observation diagnosis. However, most studies focus on optimizing the algorithm on a small sample of participants without systematically investigating multiple influencing factors. A total of 209 participants and 2,435 facial videos, drawn from our self-constructed Multi-Scene Sign Dataset and public datasets, were used to perform a multi-level, multi-factor comprehensive comparison. The effects of different datasets, blood volume pulse signal extraction algorithms, regions of interest, time windows, color spaces, pulse rate calculation methods, and video recording scenes were analyzed. Furthermore, we proposed a blood volume pulse signal quality optimization strategy based on the inverse Fourier transform and an improvement strategy for pulse rate estimation based on signal-to-noise ratio threshold sliding. We found that video-based pulse rate estimation performed better on the Multi-Scene Sign Dataset and the Pulse Rate Detection Dataset than on other datasets. Compared with the fast independent component analysis and Single Channel algorithms, the chrominance-based and plane-orthogonal-to-skin algorithms showed stronger anti-interference ability and higher robustness. The five-organs fusion area and the full-face area performed better than single sub-regions, and fewer motion artifacts and better lighting improved the precision of pulse rate estimation.
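The abstract does not spell out how pulse rate is computed from the blood volume pulse (BVP) signal, so the following is only a minimal sketch of one common frequency-domain approach: pick the strongest spectral peak in a plausible heart-rate band and report a signal-to-noise ratio around it. The function name `estimate_pulse_rate`, the 0.7 to 4.0 Hz band, the 0.01 Hz grid, and the 0.1 Hz SNR window are all assumptions for illustration, not the authors' settings, and this is not their SNR threshold sliding strategy.

```python
import math

def estimate_pulse_rate(bvp, fs, lo_hz=0.7, hi_hz=4.0, step_hz=0.01):
    """Return (bpm, snr_db) for a BVP trace sampled at fs Hz.

    Hypothetical helper: band limits and grid step are assumed values.
    """
    n = len(bvp)
    mean = sum(bvp) / n
    x = [v - mean for v in bvp]  # remove the DC offset

    def power(freq):
        # Squared magnitude of the discrete-time Fourier transform at freq.
        re = sum(v * math.cos(2 * math.pi * freq * i / fs) for i, v in enumerate(x))
        im = sum(v * math.sin(2 * math.pi * freq * i / fs) for i, v in enumerate(x))
        return re * re + im * im

    steps = int(round((hi_hz - lo_hz) / step_hz))
    freqs = [lo_hz + k * step_hz for k in range(steps + 1)]
    spectrum = [power(f) for f in freqs]

    # The pulse frequency is taken as the strongest component in the band.
    peak = max(range(len(freqs)), key=spectrum.__getitem__)
    peak_hz = freqs[peak]

    # SNR: power within 0.1 Hz of the peak versus the rest of the band.
    signal = sum(p for f, p in zip(freqs, spectrum) if abs(f - peak_hz) <= 0.1)
    noise = sum(spectrum) - signal
    snr_db = 10 * math.log10(signal / noise) if noise > 0 else float("inf")
    return peak_hz * 60.0, snr_db
```

A window whose SNR falls below a chosen threshold could then be discarded or re-estimated, which is the general idea behind SNR-gated estimation; the actual threshold and sliding rule would have to come from the paper itself.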
Facial emotion recognition (FER) has become a focal point of research due to its widespread applications, ranging from human-computer interaction to affective computing. While traditional FER techniques have relied on handcrafted features and classification models trained on image or video datasets, recent strides in artificial intelligence and deep learning (DL) have ushered in more sophisticated approaches. The research aims to develop a FER system using a Faster Region Convolutional Neural Network (FRCNN) and to design a specialized FRCNN architecture tailored for facial emotion recognition, leveraging its ability to capture spatial hierarchies within localized regions of facial features. The proposed work enhances the accuracy and efficiency of facial emotion recognition. It comprises two major components: Inception V3-based feature extraction and FRCNN-based emotion categorization. Extensive experimentation on Kaggle datasets validates the effectiveness of the proposed strategy, showcasing the FRCNN approach's resilience and accuracy in identifying and categorizing facial expressions. The model achieves an accuracy of 98.4%, precision of 97.2%, and recall of 96.31%. This work introduces a deep learning-based FER method, contributing to the evolving landscape of emotion recognition technologies. The high accuracy and resilience demonstrated by the FRCNN approach underscore its potential for real-world applications. This research advances the field of FER and presents a compelling case for the practicality and efficacy of deep learning models in automating the understanding of facial emotions.
Background: The ear and face are indispensable and distinctive features for hearing and identification. Objectives: This study was designed to generate anthropometric data on the ear and facial indices of female Efik and Ibibio children in Cross River and Akwa Ibom States, and to show morphological, aesthetic, and ethnic differences. Methods: A total of 600 female children (300 Efik and 300 Ibibio) aged 2 to 10 years who met the inclusion criteria were chosen from selected primary schools in Calabar Municipality and Calabar South of Cross River State and from Uyo and Itu of Akwa Ibom State, Nigeria. Standardized measurements of face length, face width, ear length, and ear width were taken with a spreading caliper; the facial (prosopic) and ear (auricular) indices were determined. Results: Efik subjects presented a mean face length of 8.36 ± 0.06 cm, face width of 11.04 ± 0.04 cm, ear length of 4.92 ± 0.02 cm, and ear width of 3.06 ± 0.01 cm. Ibibio subjects had mean face length, face width, ear length, and ear width of 8.17 ± 0.05 cm, 10.75 ± 0.05 cm, 4.77 ± 0.03 cm, and 2.94 ± 0.02 cm respectively. The mean facial index and ear index for Efik subjects were 75.68 ± 0.31 and 62.16 ± 0.27 respectively, while the mean facial and ear indices for Ibibio subjects were 74.79 ± 0.36 and 61.80 ± 0.34 respectively. Statistical analysis demonstrated significant differences in face length, ear length, ear width, and facial index, with the Efik subjects having higher values than Ibibio subjects (p [value missing]). Conclusion: The results showed the hypereuryprosopic face as the prevalent face type among females of both ethnic groups; the data can therefore be of importance in sex, ethnic, and racial differentiation, and in clinical practice, aesthetics, and forensic medicine.
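The indices reported above follow the standard anthropometric definitions: facial index = face length / face width × 100, and ear index = ear width / ear length × 100. The sketch below applies them to the reported group means. The function names are hypothetical, and the face-type cutoffs are the commonly cited Banister ranges, which are an assumption here because the abstract does not state which classification the authors used.

```python
def facial_index(face_length_cm, face_width_cm):
    """Facial (prosopic) index: face length as a percentage of face width."""
    return 100.0 * face_length_cm / face_width_cm

def ear_index(ear_length_cm, ear_width_cm):
    """Ear (auricular) index: ear width as a percentage of ear length."""
    return 100.0 * ear_width_cm / ear_length_cm

def face_type(index):
    """Classify a facial index using Banister-style cutoffs (assumed)."""
    if index < 80.0:
        return "hypereuryprosopic"
    if index < 85.0:
        return "euryprosopic"
    if index < 90.0:
        return "mesoprosopic"
    if index < 95.0:
        return "leptoprosopic"
    return "hyperleptoprosopic"

# Reported Efik group means: face 8.36 x 11.04 cm, ear 4.92 x 3.06 cm.
efik_facial = facial_index(8.36, 11.04)
efik_ear = ear_index(4.92, 3.06)
print(round(efik_facial, 2), round(efik_ear, 2), face_type(efik_facial))
```

An index computed from group means is not identical to the mean of per-subject indices, which is why this value for the Efik group differs slightly from the reported 75.68; both fall in the same face-type range under the assumed cutoffs.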
The estimation of pain intensity is critical for the medical diagnosis and treatment of patients. With the development of image monitoring technology and artificial intelligence, automatic pain assessment based on facial expression and behavioral analysis shows potential value in clinical applications. This paper reports a convolutional neural network framework with global and local attention mechanisms (GLA-CNN) for the effective detection of pain intensity at four-level thresholds using facial expression images. GLA-CNN includes two modules, namely a global attention network (GANet) and a local attention network (LANet). LANet is responsible for extracting representative local patch features of faces, while GANet extracts whole-face features to compensate for the correlative features between patches that LANet ignores. Finally, the global correlational and local subtle features are fused for the final estimation of pain intensity. Experiments on the UNBC-McMaster Shoulder Pain database demonstrate that GLA-CNN outperforms other state-of-the-art methods. Additionally, a visualization analysis presents the feature map of GLA-CNN, intuitively showing that it can extract not only local pain features but also global correlative facial ones. Our study demonstrates that pain assessment based on facial expression is a non-invasive and feasible method, and can be employed as an auxiliary pain assessment tool in clinical practice.
Background: Maxillofacial trauma predominantly affects young adults. The injury assessment is difficult to establish in low-income countries because imaging, particularly CT scanning, is poorly available and less financially accessible. The aim of this study is to describe the epidemiological profile and the various computed tomography aspects of traumatic lesions of the face in patients received in the Radiology department of Kira Hospital. Patients and methods: This is a descriptive retrospective study involving 104 patients of all ages over a period of 2 years from December 2018 to November 2019 in the medical imaging department of Kira Hospital. We included any patient who had undergone a CT scan of the head and presented at least one lesion of the facial mass, whether or not associated with other cranioencephalic lesions. Results: Among the 384 patients received for head trauma, 104 patients (27.1% of cases) presented facial damage. The average age of our patients was 32.02 years with extremes of 8 months and 79 years. In our study, 87 of the patients (83.6%) were male. Road accidents were the circumstance in which facial trauma occurred in 79 patients (76% of cases). These injuries were accompanied by at least one bone fracture in 97 patients (93.3%). Patients with fractures of more than 3 facial bones accounted for 40.2% of cases and those with fractures of 2 to 3 bones accounted for 44.6% of cases. The midface was the site of the fracture in 85 patients (87.6% of cases). Orbital wall fractures were noted in 57 patients (58.8% of cases) and the jawbone was the site of a fracture in 50 patients (51.5% of cases). In the vault, the fractures involved the extra-facial frontal bone (36.1% of cases) and temporal bone (18.6% of cases). Cerebral contusion was noted in 41.2% of patients and pneumoencephaly in 15.5% of patients. Extradural hematoma was present in 16 patients and subdural hematoma affected 13 patients. Conclusion: Computed tomography is a diagnostic tool of choice in facial trauma patients. Most of these young patients present with multiple fractures localized to the midface with concomitant involvement of the brain.
Bipolar disorder is a serious mental condition that may be caused by any kind of stress or emotional upset experienced by the patient. It affects a large percentage of people globally, who fluctuate between depression and mania, or vice versa. A pleasant or unpleasant mood is more than a reflection of a state of mind. It is normally difficult to assess through physical examination because of the large patient-to-psychiatrist ratio, so automated procedures are the best option to diagnose bipolar disorder and verify its severity. In this research work, facial micro-expressions are used for bipolar disorder detection with the proposed Convolutional Neural Network (CNN)-based model. The Facial Action Coding System (FACS) is used to extract micro-expressions called Action Units (AUs) connected with sad, happy, and angry emotions. Experiments were conducted on a dataset collected from Bahawal Victoria Hospital, Bahawalpur, Pakistan, using the Patient Health Questionnaire-15 (PHQ-15) to infer a patient's mental state. The experimental results showed a validation accuracy of 98.99% for the proposed CNN model, while classification on the extracted features using Support Vector Machines (SVM), K-Nearest Neighbour (KNN), and Decision Tree (DT) obtained 99.9%, 98.7%, and 98.9% accuracy, respectively. Overall, the outcomes demonstrated the stated method's superiority over the current best practices.
Autism Spectrum Disorder (ASD) is a neurodevelopmental condition characterized by significant challenges in social interaction, communication, and repetitive behaviors. Timely and precise ASD detection is crucial, particularly in regions with limited diagnostic resources like Pakistan. This study aims to conduct an extensive comparative analysis of various machine learning classifiers for ASD detection using facial images to identify an accurate and cost-effective solution tailored to the local context. The research involves experimentation with VGG16 and MobileNet models, exploring different batch sizes, optimizers, and learning rate schedulers. In addition, the "Orange" machine learning tool is employed to evaluate classifier performance, and automated image processing capabilities are utilized within the tool. The findings establish VGG16 as the most effective classifier with a 5-fold cross-validation approach. Specifically, VGG16, with a batch size of 2 and the Adam optimizer, trained for 100 epochs, achieves a validation accuracy of 99% and a testing accuracy of 87%. Furthermore, the model achieves an F1 score of 88%, precision of 85%, and recall of 90% on test images. To validate the practical applicability of the VGG16 model with 5-fold cross-validation, the study conducts further testing on a dataset sourced from autism centers in Pakistan, resulting in an accuracy rate of 85%. This reaffirms the model's suitability for real-world ASD detection. This research offers valuable insights into classifier performance, emphasizing the potential of machine learning to deliver precise and accessible ASD diagnoses via facial image analysis.
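As a quick consistency check on the reported test metrics, the F1 score is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded figures in the abstract; the function name `f1_score` is a hypothetical helper, not something taken from the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rounded values reported for the VGG16 model on test images.
f1 = f1_score(0.85, 0.90)
print(f"{f1:.3f}")
```

With precision 0.85 and recall 0.90 this gives roughly 0.874, close to the reported 88%; the small gap can be accounted for by rounding in the published precision and recall.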
BACKGROUND Facial herpes is a common form of herpes simplex virus-1 infection and usually presents as vesicles near the mouth, nose, and periocular sites. In contrast, we observed a new facial symptom of herpes on the entire face without vesicles. CASE SUMMARY A 33-year-old woman with a history of varicella infection and shingles since an early age presented with sarcoidosis of the entire face and neuralgia without oral lesions. The patient was prescribed antiviral treatment with valacyclovir and acyclovir cream. One day after drug administration, the facial skin lesions and neurological pain improved. Herpes simplex without oral blisters can easily be misdiagnosed as pimples upon visual examination in an outpatient clinic. CONCLUSION As acute herpes simplex is accompanied by neuralgia, prompt diagnosis and prescription are necessary, considering the pathological history and health conditions.
Introduction: Peripheral facial palsy (PFP) is a frequent reason for ENT consultations. It is a common complication of human immunodeficiency virus (HIV) infection. The aim of this study was to describe the diagnostic and therapeutic aspects and to establish the correlation between PFP and HIV in our context. Patients and Method: This was a retrospective descriptive study conducted in the ENT and CFS department of the HIAOBO, covering the medical records of patients hospitalized for management of PFP in the setting of HIV infection from January 1, 2016 to December 31, 2020. Results: The study involved 17 patients, 10 men (59%) and 7 women (41%), a sex ratio of 1.4. The average age was 39 years with extremes of 11 and 69 years. Shopkeepers accounted for 9 cases (53%). The reason for consultation was facial asymmetry in 11 cases (100%). Consultation took place during the first week in 82.4% of cases. Clinical signs were unilateral facial asymmetry, widening of the palpebral fissure, and lacrimation. All patients received medical treatment for PFP and HIV. Evolution was favorable, with complete recovery and no sequelae in 82.4% of cases. Surgery was performed in one case. Conclusion: PFP is common in HIV infection. Diagnosis is clinical and management is multidisciplinary. Progression depends on the length of time taken to treat the disease.
To understand the market opportunity for freeze-dried facial masks and gain insight into consumers' usage behavior and needs, the sensory attributes of 10 screened commercial freeze-dried facial mask products were evaluated. The products were grouped according to differences in sensory attributes via Principal Component Analysis (PCA) and Agglomerative Hierarchical Clustering (AHC), and representative products were selected. Freeze-dried facial mask users then rated their satisfaction with the selected products and completed a survey on usage behavior and cognition. The consumer data were analyzed by AHC to obtain consumer segments and their profiles. The results show that the sensory data and the consumer data, the latter obtained from consumer tests of the representative products selected by performing PCA and AHC on the sensory data, can verify each other. Combining sensory data with consumer tests helps in understanding the needs of consumer segments and their reasons to buy.
To address the complex model structures and excessive training parameters of facial expression recognition algorithms, we propose a residual network structure with a multi-headed channel attention (MCA) module. Transfer learning is used to pre-train the convolutional layer parameters and mitigate the overfitting caused by an insufficient number of training samples. The designed MCA module is integrated into the ResNet18 backbone network. The attention mechanism highlights important information and suppresses irrelevant information by assigning different coefficients or weights, and the multi-head structure focuses more on the local features of the images, which improves the efficiency of facial expression recognition. Experimental results demonstrate that the proposed model achieves excellent recognition results on the Fer2013, CK+, and Jaffe datasets, with accuracy rates of 72.7%, 98.8%, and 93.33%, respectively.
Accurately recognizing facial expressions is essential for effective social interactions. Non-human primates (NHPs) are widely used in the study of the neural mechanisms underpinning facial expression processing, yet it remains unclear how well monkeys can recognize the facial expressions of other species such as humans. In this study, we systematically investigated how monkeys process the facial expressions of conspecifics and humans using eye-tracking technology and sophisticated behavioral tasks, namely the temporal discrimination task (TDT) and face scan task (FST). We found that monkeys showed prolonged subjective time perception in response to negative facial expressions in monkeys while showing longer reaction times to negative facial expressions in humans. Monkey faces also reliably induced divergent pupil contraction in response to different expressions, while human faces and scrambled monkey faces did not. Furthermore, viewing patterns in the FST indicated that monkeys only showed bias toward emotional expressions upon observing monkey faces. Finally, masking the eye region marginally decreased the viewing duration for monkey faces but not for human faces. By probing facial expression processing in monkeys, our study demonstrates that monkeys are more sensitive to the facial expressions of conspecifics than to those of humans, thus shedding new light on inter-species communication through facial expressions between NHPs and humans.
Accurate localization of cranial nerves and responsible blood vessels is important for diagnosing trigeminal neuralgia (TN) and hemifacial spasm (HFS). Manual delineation of the nerves and vessels on medical images is time-consuming and labor-intensive. The development of convolutional neural networks (CNNs) has improved the performance of medical image segmentation. In this work, we investigate training plans for automated segmentation of cranial nerves and responsible vessels for TN and HFS, which has not been comprehensively studied before. Different inputs, varying in image modality and number of segmentation targets, are given to the CNN to find the best training configuration for segmenting trigeminal nerves, facial nerves, responsible vessels, and the brainstem. Based on multiple experiments with seven training plans, we suggest training with the combination of three-dimensional fast imaging employing steady-state acquisition (3D-FIESTA) and three-dimensional time-of-flight magnetic resonance angiography (3DTOF-MRA), and segmenting cranial nerves and vessels separately.
Objective This study aims to construct and validate a predictive deep learning model that combines clinical data and multi-sequence magnetic resonance imaging (MRI) for short-term postoperative facial nerve function in patients with acoustic neuroma. Methods A total of 110 patients with acoustic neuroma who underwent surgery through the retrosigmoid sinus approach were included. Clinical data and raw features from four MRI sequences (T1-weighted, T2-weighted, T1-weighted contrast enhancement, and T2-weighted-Flair images) were analyzed. Spearman correlation analysis along with least absolute shrinkage and selection operator regression were used to screen combined clinical and radiomic features. Nomogram, machine learning, and convolutional neural network (CNN) models were constructed to predict the prognosis of facial nerve function on the seventh day after surgery. Receiver operating characteristic (ROC) curve and decision curve analysis (DCA) were used to evaluate model performance. A total of 1050 radiomic parameters were extracted, from which 13 radiomic and 3 clinical features were selected. Results The CNN model performed best among all prediction models in the test set with an area under the curve (AUC) of 0.89 (95% CI, 0.84–0.91). Conclusion CNN modeling that combines clinical and multi-sequence MRI radiomic features provides excellent performance for predicting short-term facial nerve function after surgery in patients with acoustic neuroma. As such, CNN modeling may serve as a potential decision-making tool for neurosurgery.
In computer vision, emotion recognition using facial expression images is considered an important research issue. Deep learning advances in recent years have aided in attaining improved results on this issue. According to recent studies, multiple facial expressions may be included in facial photographs representing a particular type of emotion. It is feasible and useful to convert face photos into collections of visual words and carry out global expression recognition. The main contribution of this paper is a facial expression recognition model (FERM) based on an optimized Support Vector Machine (SVM). AffectNet is used to test the performance of the proposed model. AffectNet uses 1250 emotion-related keywords in six different languages to search three major search engines and collect over 1,000,000 facial photos online. The FERM is composed of three main phases: (i) data preparation, (ii) grid search for optimization, and (iii) categorization. Linear discriminant analysis (LDA) is used to categorize the data into eight labels (neutral, happy, sad, surprised, fear, disgust, angry, and contempt). Using LDA clearly enhances the performance of categorization via SVM. Grid search is used to find the optimal values for the SVM hyperparameters (C and gamma). The proposed optimized SVM algorithm achieves an accuracy of 99% and a 98% F1 score.
Given the current expansion of the computer vision field, applications that rely on extracting biometric information such as facial gender for access control, security, or marketing purposes are becoming more common. A typical gender classifier requires many training samples to learn as many distinguishable features as possible. However, collecting facial images from individuals is usually a sensitive task, and it might violate either an individual's privacy or a specific data privacy law. To bridge the gap between privacy and the need for many facial images for deep learning training, an artificially generated dataset of facial images is proposed. We acquire a pre-trained Style-Generative Adversarial Networks (StyleGAN) generator and use it to create a dataset of facial images. We label the images according to the observed gender using a set of criteria that differentiate the facial features of males and females. We use this manually labelled dataset to train three facial gender classifiers: a custom-designed network and two pre-trained networks based on the Visual Geometry Group designs (VGG16 and VGG19). We cross-validate these three classifiers on two separate datasets containing labelled images of actual subjects. For testing, we use the UTKFace and the Kaggle gender datasets. Our experimental results suggest that training on a set of artificial images produces accuracies comparable to existing state-of-the-art methods, which use actual images of individuals. The average classification accuracy of each classifier is between 94% and 95%, which is similar to existing proposed methods.
Video-oriented facial expression recognition has always been an important issue in emotion perception. At present, the key challenge in most existing methods is how to effectively extract robust features to characterize the facial appearance and geometry changes caused by facial motions. On this basis, the video in this paper is divided into multiple segments, each of which is simultaneously described by optical flow and facial landmark trajectory. To mine the emotional information of these two representations, we propose a Deep Spatiotemporal Network with Dual-flow Fusion (DSN-DF), which highlights the region and strength of expressions through spatiotemporal appearance features and the speed of change through spatiotemporal geometry features. Finally, experiments on the CK+ and MMI datasets demonstrate the superiority of the proposed method.
Introduction: Maxillofacial ballistic trauma is a serious injury that is difficult to manage, with significant complications and after-effects. The authors report their experience in managing this type of trauma in the context of insecurity linked to terrorism. Patients and Methods: This was a descriptive cross-sectional study with retrospective data collection covering the period from January 1, 2018 to December 31, 2022 in the stomatology and maxillofacial surgery departments of the university hospitals of Ouagadougou. Results: Over 5 years, 52 patients were included, i.e. 10.4 cases per year. The mean age of the patients was 31.46 ± 15.41 years, and the sex ratio was 3. In 67.31% of patients, these injuries were the result of shootings during terrorist attacks. The jugal (36.54%) and chin (32.69%) regions were the most affected. The mandible (36.54%) and zygomatic bones (28.85%) were the most frequently injured bones. All patients underwent surgical treatment, and 25% suffered secondary complications. All patients retained at least one sequela. Conclusion: Maxillofacial injuries caused by ballistic trauma are true emergencies that can be life-threatening and functionally disabling. Their management is delicate and the outcome is uncertain; hence, prevention is important.
Facial landmarks can provide valuable information for expression-related tasks. However, most approaches only use landmarks for segmentation preprocessing or feed them directly into fully connected layers of the neural network. Such a simple combination not only fails to pass the spatial information to the network but also increases the amount of computation. The method proposed in this paper aims to integrate a facial landmark-driven representation into the triplet network. The spatial information provided by landmarks is introduced into the feature extraction process so that the model can better capture location relationships. In addition, coordinate information is integrated into the triplet loss calculation to further enhance similarity prediction. Specifically, for each image, the coordinates of 68 landmarks are detected, and a region attention map based on these landmarks is generated. The feature map output by the shallow convolutional layer is multiplied with the attention map to correct the feature activation, strengthening the key regions and weakening the unimportant ones. Finally, the optimized embedding output can be used for downstream tasks. The three embeddings that the network outputs for three images can be regarded as a triplet representation for similarity computation. The effectiveness of this optimized feature extraction is verified on the CK+ dataset. It is then applied to facial expression similarity tasks. The results on the facial expression comparison (FEC) dataset show that the accuracy improves significantly after the landmark information is introduced.
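The two mechanisms described above, multiplying a shallow feature map by a landmark-based attention map and comparing three embeddings as a triplet, can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not say how the attention map is built or how coordinates enter the loss, so the Gaussian bumps around landmarks, the `sigma` and `margin` values, and all function names are hypothetical, and the loss shown is the standard triplet loss rather than the authors' coordinate-augmented version.

```python
import math

def gaussian_attention(height, width, landmarks, sigma=1.0):
    """Attention map: at each cell, the strongest Gaussian bump over all
    (row, col) landmarks. Assumed construction, for illustration only."""
    return [
        [
            max(
                math.exp(-((r - lr) ** 2 + (c - lc) ** 2) / (2 * sigma ** 2))
                for lr, lc in landmarks
            )
            for c in range(width)
        ]
        for r in range(height)
    ]

def apply_attention(feature_map, attention_map):
    """Element-wise product: keeps activations near landmarks, damps the rest."""
    return [
        [f * a for f, a in zip(f_row, a_row)]
        for f_row, a_row in zip(feature_map, attention_map)
    ]

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss on squared Euclidean distances."""
    d_ap = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_an = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_ap - d_an + margin)

# A 4x4 feature map of ones, attended around two toy landmarks.
attention = gaussian_attention(4, 4, [(0, 0), (3, 3)])
weighted = apply_attention([[1.0] * 4 for _ in range(4)], attention)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, which is what pushes embeddings of similar expressions together and dissimilar ones apart.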
A deep fusion model is proposed for a facial expression-based human-computer interaction system. Initially, image preprocessing is applied to extract the facial region from the input image. Thereafter, more discriminative and distinctive deep learning features are extracted from the facial regions. To prevent overfitting, in-depth features of facial images are extracted and assigned to the proposed convolutional neural network (CNN) models. Various CNN models are then trained. Finally, the outputs of the CNN models are fused to obtain the final decision for the seven basic classes of facial expressions, i.e., fear, disgust, anger, surprise, sadness, happiness, and neutral. For experimental purposes, three benchmark datasets, i.e., SFEW, CK+, and KDEF, are utilized. The performance of the proposed system is compared with some state-of-the-art methods on each dataset. Extensive performance analysis reveals that the proposed system outperforms the competing methods in terms of various performance metrics. Finally, the proposed deep fusion model is used to control a music player based on the recognized emotions of the users.
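The abstract does not state which fusion rule combines the CNN models, so the following is a minimal sketch of one common choice, a weighted average of per-class probabilities followed by an argmax. The function name `fuse_predictions`, the optional weights, and the example probability vectors are all assumptions for illustration.

```python
CLASSES = ["fear", "disgust", "anger", "surprise", "sadness", "happiness", "neutral"]

def fuse_predictions(prob_vectors, weights=None):
    """Weighted average of per-model class probabilities.

    Returns (fused_probabilities, predicted_label). Hypothetical helper:
    the paper's actual fusion rule is not given in the abstract.
    """
    if weights is None:
        weights = [1.0] * len(prob_vectors)
    total = sum(weights)
    fused = [
        sum(w * probs[i] for w, probs in zip(weights, prob_vectors)) / total
        for i in range(len(CLASSES))
    ]
    best = max(range(len(CLASSES)), key=fused.__getitem__)
    return fused, CLASSES[best]

# Two made-up model outputs that disagree on the top class.
model_a = [0.10, 0.10, 0.50, 0.10, 0.10, 0.05, 0.05]
model_b = [0.05, 0.05, 0.30, 0.40, 0.10, 0.05, 0.05]
fused, label = fuse_predictions([model_a, model_b])
print(label)
```

Averaging lets a confident model outvote an uncertain one; here the second model alone would predict "surprise", but the fused decision is "anger".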
Funding: Supported by the Key Research Program of the Chinese Academy of Sciences (grant number ZDRW-ZS-2021-1-2).
Abstract: Pulse rate is one of the important characteristics of traditional Chinese medicine pulse diagnosis, and it is of great significance for determining the nature of cold and heat in diseases. The prediction of pulse rate based on facial video is an exciting research field for obtaining palpation information through observation diagnosis. However, most studies focus on optimizing the algorithm on a small sample of participants without systematically investigating multiple influencing factors. A total of 209 participants and 2,435 facial videos, drawn from our self-constructed Multi-Scene Sign Dataset and public datasets, were used to perform a multi-level and multi-factor comprehensive comparison. The effects of different datasets, blood volume pulse signal extraction algorithms, regions of interest, time windows, color spaces, pulse rate calculation methods, and video recording scenes were analyzed. Furthermore, we proposed a blood volume pulse signal quality optimization strategy based on the inverse Fourier transform and an improvement strategy for pulse rate estimation based on signal-to-noise ratio threshold sliding. We found that video-based pulse rate estimation performed better on the Multi-Scene Sign Dataset and the Pulse Rate Detection Dataset than on other datasets. Compared with the Fast independent component analysis and Single Channel algorithms, the chrominance-based method and the plane-orthogonal-to-skin algorithm have stronger anti-interference ability and higher robustness. The five-organs fusion area and the full-face area performed better than single sub-regions, and fewer motion artifacts and better lighting improve the precision of pulse rate estimation.
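As a rough illustration of the "pulse rate calculation" step mentioned in this abstract, the sketch below picks the dominant frequency of a blood volume pulse (BVP) trace inside a plausible heart-rate band and converts it to beats per minute. It is not the authors' method: the band limits (0.7 to 4.0 Hz), the simple peak-versus-rest SNR definition, the function name, and the synthetic signal are all assumptions made for the example, and the paper's SNR-threshold sliding strategy is not implemented here.

```python
import math


def estimate_pulse_rate(bvp, fs, f_lo=0.7, f_hi=4.0):
    """Return (pulse_rate_bpm, snr_db) for a BVP trace sampled at fs Hz.

    f_lo and f_hi bound the searched band in Hz (assumed values).
    """
    n = len(bvp)
    mean = sum(bvp) / n
    x = [v - mean for v in bvp]  # remove the DC offset

    # Naive DFT power, evaluated only at bins inside the heart-rate band.
    band = []
    for k in range(1, n // 2 + 1):
        f = k * fs / n
        if f_lo <= f <= f_hi:
            re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
            im = sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
            band.append((f, re * re + im * im))

    peak_f, peak_p = max(band, key=lambda fp: fp[1])
    # Assumed SNR definition: peak-bin power over the remaining in-band power.
    noise_p = sum(p for _, p in band) - peak_p
    snr_db = 10 * math.log10(peak_p / noise_p) if noise_p > 0 else float("inf")
    return 60.0 * peak_f, snr_db


# Synthetic 10 s trace at 30 fps: a 1.2 Hz pulse plus a weaker 2.5 Hz disturbance.
fs = 30
bvp = [
    math.sin(2 * math.pi * 1.2 * t / fs) + 0.3 * math.sin(2 * math.pi * 2.5 * t / fs)
    for t in range(10 * fs)
]
bpm, snr_db = estimate_pulse_rate(bvp, fs)
```

A longer time window gives finer frequency resolution (fs / n Hz per bin), which is one reason the choice of time window affects the estimate.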
Abstract: Facial emotion recognition (FER) has become a focal point of research due to its widespread applications, ranging from human-computer interaction to affective computing. While traditional FER techniques have relied on handcrafted features and classification models trained on image or video datasets, recent strides in artificial intelligence and deep learning (DL) have ushered in more sophisticated approaches. The research aims to develop a FER system using a Faster Region Convolutional Neural Network (FRCNN) and to design a specialized FRCNN architecture tailored for facial emotion recognition, leveraging its ability to capture spatial hierarchies within localized regions of facial features. The proposed work enhances the accuracy and efficiency of facial emotion recognition. It comprises two major components: Inception V3-based feature extraction and FRCNN-based emotion categorization. Extensive experimentation on Kaggle datasets validates the effectiveness of the proposed strategy, showcasing the FRCNN approach’s resilience and accuracy in identifying and categorizing facial expressions. The model’s overall performance metrics are compelling, with an accuracy of 98.4%, precision of 97.2%, and recall of 96.31%. This work introduces a perceptive deep learning-based FER method, contributing to the evolving landscape of emotion recognition technologies. The high accuracy and resilience demonstrated by the FRCNN approach underscore its potential for real-world applications. This research advances the field of FER and presents a compelling case for the practicality and efficacy of deep learning models in automating the understanding of facial emotions.
Abstract: Background: The ear and face are indispensable and distinctive features for hearing and identification. Objectives: This study was designed to generate anthropometric data on the ear and facial indices of female Efik and Ibibio children in Cross River and Akwa Ibom States, and to show morphological, aesthetic, and ethnic differences. Methods: A total of 600 female children (300 Efiks and 300 Ibibios) aged 2 to 10 years who met the inclusion criteria were chosen from selected primary schools in Calabar Municipality and Calabar South of Cross River State and from Uyo and Itu of Akwa Ibom State, Nigeria. Standardized measurements of face length, face width, ear length, and ear width were taken with a spreading caliper; the facial (prosopic) and ear (auricular) indices were determined. Results: Efik subjects presented a mean face length of 8.36 ± 0.06 cm, face width of 11.04 ± 0.04 cm, ear length of 4.92 ± 0.02 cm, and ear width of 3.06 ± 0.01 cm. Ibibio subjects had mean values for face length, face width, ear length, and ear width of 8.17 ± 0.05 cm, 10.75 ± 0.05 cm, 4.77 ± 0.03 cm, and 2.94 ± 0.02 cm respectively. The mean facial index and ear index for Efik subjects were 75.68 ± 0.31 and 62.16 ± 0.27 respectively, while the mean facial and ear indices for Ibibio subjects were 74.79 ± 0.36 and 61.80 ± 0.34 respectively. Statistical analysis demonstrated significant differences in face length, ear length, ear width, and facial index, with the Efik subjects having higher values than Ibibio subjects (p […]). Conclusion: The results showed the hypereuryprosopic face to be the prevalent face type among females of both ethnic groups; the data can therefore be of importance in sex, ethnic, and racial differentiation, and in clinical practice, aesthetics, and forensic medicine.
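The abstract reports the indices without stating their formulas. The sketch below assumes the usual anthropometric definitions (facial index = face length / face width × 100; ear index = ear width / ear length × 100) and commonly cited Banister-style cut-offs for face types; both are assumptions, not taken from the source. Applying the formulas to the reported group means gives values close to, but not identical with, the reported mean indices, which is expected because a mean of per-subject ratios differs from a ratio of means.

```python
def facial_index(face_length_cm, face_width_cm):
    """Facial (prosopic) index, assumed to be length / width * 100."""
    return 100.0 * face_length_cm / face_width_cm


def ear_index(ear_width_cm, ear_length_cm):
    """Ear (auricular) index, assumed to be width / length * 100."""
    return 100.0 * ear_width_cm / ear_length_cm


def face_type(index):
    """Classify a facial index using assumed Banister-style cut-offs."""
    if index < 80.0:
        return "hypereuryprosopic"
    if index < 85.0:
        return "euryprosopic"
    if index < 90.0:
        return "mesoprosopic"
    if index < 95.0:
        return "leptoprosopic"
    return "hyperleptoprosopic"


# Group means reported in the abstract for Efik subjects (cm).
efik_facial = facial_index(8.36, 11.04)
efik_ear = ear_index(3.06, 4.92)
efik_type = face_type(efik_facial)
```

Under these assumptions the Efik group means fall in the hypereuryprosopic range, in line with the abstract's conclusion.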
Funding: Supported by the National Natural Science Foundation of China under Grant No. 62276051, the Natural Science Foundation of Sichuan Province under Grant No. 2023NSFSC0640, and the Medical Industry Information Integration Collaborative Innovation Project of Yangtze Delta Region Institute under Grant No. U0723002.
Abstract: The estimation of pain intensity is critical for the medical diagnosis and treatment of patients. With the development of image monitoring technology and artificial intelligence, automatic pain assessment based on facial expression and behavioral analysis shows potential value in clinical applications. This paper reports a convolutional neural network framework with global and local attention mechanisms (GLA-CNN) for the effective detection of pain intensity at four-level thresholds using facial expression images. GLA-CNN includes two modules, namely the global attention network (GANet) and the local attention network (LANet). LANet is responsible for extracting representative local patch features of faces, while GANet extracts whole-face features to compensate for the correlative features between patches that would otherwise be ignored. In the end, the global correlational and local subtle features are fused for the final estimation of pain intensity. Experiments on the UNBC-McMaster Shoulder Pain database demonstrate that GLA-CNN outperforms other state-of-the-art methods. Additionally, a visualization analysis is conducted to present the feature map of GLA-CNN, intuitively showing that it can extract not only local pain features but also global correlative facial ones. Our study demonstrates that pain assessment based on facial expression is a non-invasive and feasible method, and can be employed as an auxiliary pain assessment tool in clinical practice.
Abstract: Background: Maxillofacial trauma mainly affects young adults. The injury assessment is difficult to establish in low-income countries because imaging, particularly CT scanning, is poorly available and less financially accessible. The aim of this study is to describe the epidemiological profile and the various computed tomography aspects of traumatic lesions of the face in patients received in the Radiology department of Kira Hospital. Patients and methods: This is a descriptive retrospective study involving 104 patients of all ages over a period of 2 years from December 2018 to November 2019 in the medical imaging department of Kira Hospital. We included any patient who had undergone a CT scan of the head and presented at least one lesion of the facial mass, whether or not associated with other cranioencephalic lesions. Results: Among the 384 patients received for head trauma, 104 patients (27.1% of cases) presented facial damage. The average age of our patients was 32.02 years, with extremes of 8 months and 79 years. In our study, 87 of the patients (83.6%) were male. Road accidents were the circumstance of facial trauma in 79 patients (76% of cases). These injuries were accompanied by at least one bone fracture in 97 patients (93.3%). Patients with fractures of more than 3 facial bones accounted for 40.2% of cases, and those with fractures of 2 to 3 bones accounted for 44.6% of cases. The midface was the site of the fracture in 85 patients (87.6% of cases). Orbital wall fractures were noted in 57 patients (58.8% of cases), and the jawbone was the site of a fracture in 50 patients (51.5% of cases). In the vault, the fractures involved the extra-facial frontal bone (36.1% of cases) and the temporal bone (18.6% of cases). Cerebral contusion was noted in 41.2% of patients and pneumoencephaly in 15.5% of patients. Extradural hematoma was present in 16 patients and subdural hematoma in 13 patients. Conclusion: Computed tomography is a diagnostic tool of choice in facial trauma patients. Most of these young patients present with multiple fractures localized to the midface, with concomitant involvement of the brain.
Abstract: Bipolar disorder is a serious mental condition that may be caused by any kind of stress or emotional upset experienced by the patient. It affects a large percentage of people globally, who fluctuate between depression and mania, or vice versa. A pleasant or unpleasant mood is more than a reflection of a state of mind. Normally, it is difficult to analyze through physical examination because of the large patient-to-psychiatrist ratio, so automated procedures are the best option to diagnose and verify the severity of bipolar disorder. In this research work, facial micro-expressions have been used for bipolar detection using the proposed Convolutional Neural Network (CNN)-based model. The Facial Action Coding System (FACS) is used to extract micro-expressions, called Action Units (AUs), connected with sad, happy, and angry emotions. Experiments have been conducted on a dataset collected from Bahawal Victoria Hospital, Bahawalpur, Pakistan, using the Patient Health Questionnaire-15 (PHQ-15) to infer a patient’s mental state. The experimental results showed a validation accuracy of 98.99% for the proposed CNN model, while classification of the extracted features using Support Vector Machines (SVM), K-Nearest Neighbour (KNN), and Decision Tree (DT) obtained 99.9%, 98.7%, and 98.9% accuracy, respectively. Overall, the outcomes demonstrated the stated method’s superiority over the current best practices.
Abstract: Autism Spectrum Disorder (ASD) is a neurodevelopmental condition characterized by significant challenges in social interaction, communication, and repetitive behaviors. Timely and precise ASD detection is crucial, particularly in regions with limited diagnostic resources like Pakistan. This study aims to conduct an extensive comparative analysis of various machine learning classifiers for ASD detection using facial images, to identify an accurate and cost-effective solution tailored to the local context. The research involves experimentation with VGG16 and MobileNet models, exploring different batch sizes, optimizers, and learning rate schedulers. In addition, the “Orange” machine learning tool is employed to evaluate classifier performance, and the tool's automated image processing capabilities are utilized. The findings establish VGG16 as the most effective classifier with a 5-fold cross-validation approach. Specifically, VGG16, with a batch size of 2 and the Adam optimizer, trained for 100 epochs, achieves a validation accuracy of 99% and a testing accuracy of 87%. Furthermore, the model achieves an F1 score of 88%, precision of 85%, and recall of 90% on test images. To validate the practical applicability of the VGG16 model with 5-fold cross-validation, the study conducts further testing on a dataset sourced from autism centers in Pakistan, resulting in an accuracy rate of 85%. This reaffirms the model’s suitability for real-world ASD detection. This research offers valuable insights into classifier performance, emphasizing the potential of machine learning to deliver precise and accessible ASD diagnoses via facial image analysis.
Abstract: BACKGROUND: Facial herpes is a common form of herpes simplex virus-1 infection and usually presents as vesicles near the mouth, nose, and periocular sites. In contrast, we observed a new facial symptom of herpes on the entire face without vesicles. CASE SUMMARY: A 33-year-old woman with a history of varicella infection and shingles since an early age presented with sarcoidosis of the entire face and neuralgia without oral lesions. The patient was prescribed antiviral treatment with valacyclovir and acyclovir cream. One day after drug administration, the facial skin lesions and neurological pain improved. Herpes simplex without oral blisters can easily be misdiagnosed as pimples upon visual examination in an outpatient clinic. CONCLUSION: As acute herpes simplex is accompanied by neuralgia, prompt diagnosis and prescription are necessary, considering the pathological history and health conditions.
Abstract: Introduction: Peripheral facial palsy (PFP) is a frequent reason for ENT consultations. It is a common complication of human immunodeficiency virus (HIV) infection. The aim of this study was to describe the diagnostic and therapeutic aspects and to establish the correlation between PFP and HIV in our context. Patients and Method: This was a retrospective descriptive study conducted in the ENT and CFS department of the HIAOBO, covering the medical records of patients hospitalized for management of PFP in the setting of HIV infection from January 1, 2016 to December 31, 2020. Results: The study involved 17 patients, 10 men (59%) and 7 women (41%), a sex ratio of 1.4. The average age was 39 years, with extremes of 11 and 69 years. Shopkeepers accounted for 9 cases (53%). The reason for consultation was facial asymmetry in 11 cases (100%). In 82.4% of cases, patients consulted within the first week. Clinical signs were unilateral facial asymmetry, opening of the palpebral fissure, and lacrimation. All patients received medical treatment for PFP and HIV. Evolution was favorable, with complete recovery and no sequelae in 82.4% of cases. Surgery was performed in one case. Conclusion: PFP is common in HIV infection. Diagnosis is clinical and management is multidisciplinary. The outcome depends on the delay before treatment.
Abstract: To understand the market opportunity for freeze-dried facial masks and gain insight into consumers’ usage behavior and needs, the sensory properties of 10 screened commercial freeze-dried facial mask products were evaluated. The products were grouped by differences in sensory attributes using Principal Component Analysis (PCA) and Agglomerative Hierarchical Clustering (AHC), and representative products were selected. Freeze-dried facial mask users then rated their satisfaction with the selected products and took part in a survey of usage behavior and cognition. The consumer data were analyzed by AHC to obtain consumer segments and their profiles. The results show that the sensory data and the consumer data, the latter obtained from consumer tests of the representative products selected by performing PCA and AHC on the sensory data, corroborate each other. Combining sensory data with consumer tests helps in understanding the needs of consumer segments and their reasons to buy.
Funding: Funded by Anhui Province Quality Engineering Project No. 2021jyxm0801, the Natural Science Foundation of Anhui University of Chinese Medicine under Grant Nos. 2020zrzd18 and 2019zrzd11, Humanity Social Science Foundation Grants 2021rwzd20 and 2020rwzd07, and Anhui University of Chinese Medicine Quality Engineering Project No. 2021zlgc046.
Abstract: To address the complex model structures and excessive training parameters of facial expression recognition algorithms, we propose a residual network structure with a multi-headed channel attention (MCA) module. Transfer learning is used to pre-train the convolutional layer parameters and mitigate the overfitting caused by an insufficient number of training samples. The designed MCA module is integrated into the ResNet18 backbone network. The attention mechanism highlights important information and suppresses irrelevant information by assigning different coefficients or weights, and the multi-head structure focuses more on the local features of the images, which improves the efficiency of facial expression recognition. Experimental results demonstrate that the proposed model achieves excellent recognition results on the FER2013, CK+, and JAFFE datasets, with accuracy rates of 72.7%, 98.8%, and 93.33%, respectively.
Funding: Supported by the National Natural Science Foundation of China (U20A2017), Guangdong Basic and Applied Basic Research Foundation (2022A1515010134, 2022A1515110598), Youth Innovation Promotion Association of Chinese Academy of Sciences (2017120), Shenzhen-Hong Kong Institute of Brain Science–Shenzhen Fundamental Research Institutions (NYKFKT2019009), and Shenzhen Technological Research Center for Primate Translational Medicine (F-2021-Z99-504979).
Abstract: Accurately recognizing facial expressions is essential for effective social interactions. Non-human primates (NHPs) are widely used in the study of the neural mechanisms underpinning facial expression processing, yet it remains unclear how well monkeys can recognize the facial expressions of other species such as humans. In this study, we systematically investigated how monkeys process the facial expressions of conspecifics and humans using eye-tracking technology and sophisticated behavioral tasks, namely the temporal discrimination task (TDT) and the face scan task (FST). We found that monkeys showed prolonged subjective time perception in response to Negative facial expressions in monkeys, while showing longer reaction times to Negative facial expressions in humans. Monkey faces also reliably induced divergent pupil contraction in response to different expressions, while human faces and scrambled monkey faces did not. Furthermore, viewing patterns in the FST indicated that monkeys only showed bias toward emotional expressions upon observing monkey faces. Finally, masking the eye region marginally decreased the viewing duration for monkey faces but not for human faces. By probing facial expression processing in monkeys, our study demonstrates that monkeys are more sensitive to the facial expressions of conspecifics than to those of humans, thus shedding new light on inter-species communication through facial expressions between NHPs and humans.
Abstract: Accurate localization of cranial nerves and responsible blood vessels is important for diagnosing trigeminal neuralgia (TN) and hemifacial spasm (HFS). Manual delineation of the nerves and vessels on medical images is time-consuming and labor-intensive. With the development of convolutional neural networks (CNNs), the performance of medical image segmentation has improved. In this work, we investigate training plans for automated segmentation of cranial nerves and responsible vessels for TN and HFS, which has not been comprehensively studied before. Different inputs are given to the CNN to find the best training configuration for segmenting the trigeminal nerves, facial nerves, responsible vessels, and brainstem, varying the image modality and the number of segmentation targets. Based on multiple experiments with seven training plans, we suggest training with the combination of three-dimensional fast imaging employing steady-state acquisition (3D-FIESTA) and three-dimensional time-of-flight magnetic resonance angiography (3DTOF-MRA), and segmenting cranial nerves and vessels separately.
Abstract: Objective: This study aims to construct and validate a predictive deep learning model that combines clinical data and multi-sequence magnetic resonance imaging (MRI) for short-term postoperative facial nerve function in patients with acoustic neuroma. Methods: A total of 110 patients with acoustic neuroma who underwent surgery through the retrosigmoid sinus approach were included. Clinical data and raw features from four MRI sequences (T1-weighted, T2-weighted, T1-weighted contrast enhancement, and T2-weighted-Flair images) were analyzed. Spearman correlation analysis along with least absolute shrinkage and selection operator regression were used to screen combined clinical and radiomic features. Nomogram, machine learning, and convolutional neural network (CNN) models were constructed to predict the prognosis of facial nerve function on the seventh day after surgery. Receiver operating characteristic (ROC) curve and decision curve analysis (DCA) were used to evaluate model performance. A total of 1050 radiomic parameters were extracted, from which 13 radiomic and 3 clinical features were selected. Results: The CNN model performed best among all prediction models in the test set, with an area under the curve (AUC) of 0.89 (95% CI, 0.84–0.91). Conclusion: CNN modeling that combines clinical and multi-sequence MRI radiomic features provides excellent performance for predicting short-term facial nerve function after surgery in patients with acoustic neuroma. As such, CNN modeling may serve as a potential decision-making tool for neurosurgery.
Abstract: In computer vision, emotion recognition using facial expression images is considered an important research issue. Deep learning advances in recent years have helped attain improved results on this issue. According to recent studies, multiple facial expressions may be included in facial photographs representing a particular type of emotion. It is feasible and useful to convert face photos into collections of visual words and carry out global expression recognition. The main contribution of this paper is a facial expression recognition model (FERM) based on an optimized Support Vector Machine (SVM). To test the performance of the proposed model (FERM), AffectNet is used. AffectNet uses 1250 emotion-related keywords in six different languages to query three major search engines and collect over 1,000,000 facial photos online. The FERM is composed of three main phases: (i) the data preparation phase, (ii) applying grid search for optimization, and (iii) the categorization phase. Linear discriminant analysis (LDA) is used to categorize the data into eight labels (neutral, happy, sad, surprised, fear, disgust, angry, and contempt). Owing to the use of LDA, the performance of categorization via SVM has been noticeably improved. Grid search is used to find the optimal values for the hyperparameters of the SVM (C and gamma). The proposed optimized SVM algorithm has achieved an accuracy of 99% and a 98% F1 score.
Abstract: Given the current expansion of the computer vision field, several applications that rely on extracting biometric information like facial gender for access control, security, or marketing purposes are becoming more common. A typical gender classifier requires many training samples to learn as many distinguishable features as possible. However, collecting facial images from individuals is usually a sensitive task, and it might violate either an individual's privacy or a specific data privacy law. To bridge the gap between privacy and the need for many facial images for deep learning training, an artificially generated dataset of facial images is proposed. We acquire a pre-trained Style-Generative Adversarial Networks (StyleGAN) generator and use it to create a dataset of facial images. We label the images according to the observed gender using a set of criteria that differentiate male and female facial features. We use this manually labelled dataset to train three facial gender classifiers: a custom-designed network and two pre-trained networks based on the Visual Geometry Group designs (VGG16 and VGG19). We cross-validate these three classifiers on two separate datasets containing labelled images of actual subjects. For testing, we use the UTKFace and the Kaggle gender datasets. Our experimental results suggest that using a set of artificial images for training produces accuracies comparable to those of existing state-of-the-art methods, which use actual images of individuals. The average classification accuracy of each classifier is between 94% and 95%, which is similar to existing proposed methods.
Funding: This work is supported by the Natural Science Foundation of China (Grant No. 61903056), the Major Project of Science and Technology Research Program of Chongqing Education Commission of China (Grant No. KJZDM201900601), the Chongqing Research Program of Basic Research and Frontier Technology (Grant Nos. cstc2019jcyj-msxmX0681, cstc2021jcyj-msxmX0530, and cstc2021jcyjmsxmX0761), the Chongqing Municipal Key Laboratory of Institutions of Higher Education (Grant No. cqupt-mct-201901), the Chongqing Key Laboratory of Mobile Communications Technology (Grant No. cqupt-mct-202002), and the Engineering Research Center of Mobile Communications, Ministry of Education (Grant No. cqupt-mct202006).
Abstract: Video-oriented facial expression recognition has always been an important issue in emotion perception. At present, the key challenge for most existing methods is how to effectively extract robust features that characterize the changes in facial appearance and geometry caused by facial motion. On this basis, the video in this paper is divided into multiple segments, each of which is simultaneously described by optical flow and facial landmark trajectories. To mine the emotional information in these two representations, we propose a Deep Spatiotemporal Network with Dual-flow Fusion (DSN-DF), which highlights the region and strength of expressions through spatiotemporal appearance features and the speed of change through spatiotemporal geometry features. Finally, experiments are conducted on the CK+ and MMI datasets to demonstrate the superiority of the proposed method.
Abstract: Introduction: Maxillofacial ballistic trauma is a serious injury that is difficult to manage, with significant complications and after-effects. The authors report their experience in managing this type of trauma in the context of insecurity linked to terrorism. Patients and Methods: This was a descriptive cross-sectional study with retrospective data collection covering the period from January 1, 2018 to December 31, 2022 in the stomatology and maxillofacial surgery departments of the university hospitals of Ouagadougou. Results: In 5 years, 52 patients were included, i.e. 10.4 cases per year. The mean age of the patients was 31.46 ± 15.41 years, and the sex ratio was 3. In 67.31% of patients, these injuries were the result of shootings during terrorist attacks. The jugal (36.54%) and chin (32.69%) regions were the most affected. The mandible (36.54%) and zygomatic bones (28.85%) were the most frequently injured bones. All patients underwent surgical treatment, and 25% suffered secondary complications. All patients retained at least one sequela. Conclusion: Maxillofacial injuries caused by ballistic trauma are true emergencies that can be life-threatening and functionally disabling. Their management is delicate and the outcome is uncertain, hence the importance of prevention.
Abstract: Facial landmarks can provide valuable information for expression-related tasks. However, most approaches only use landmarks for segmentation preprocessing or feed them directly into fully connected layers of the neural network. Such a simple combination not only fails to pass the spatial information to the network, but also increases the amount of computation. The method proposed in this paper aims to integrate a facial landmark-driven representation into the triplet network. The spatial information provided by landmarks is introduced into the feature extraction process, so that the model can better capture location relationships. In addition, coordinate information is also integrated into the triplet loss calculation to further enhance similarity prediction. Specifically, for each image, the coordinates of 68 landmarks are detected, and a region attention map based on these landmarks is generated. The feature map output by the shallow convolutional layer is multiplied by the attention map to correct the feature activation, so as to strengthen the key regions and weaken the unimportant ones. Finally, the optimized embedding output can be used for downstream tasks. The three embeddings that the network outputs for three images can be regarded as a triplet representation for similarity computation. The effectiveness of this optimized feature extraction is verified on the CK+ dataset. After that, it is applied to facial expression similarity tasks. The results on the facial expression comparison (FEC) dataset show that accuracy improves significantly after the landmark information is introduced.
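The attention step described in this abstract (build a region attention map from landmark coordinates, then multiply it element-wise with a shallow feature map) can be sketched as follows. The abstract does not say how the map is generated, so the Gaussian-bump construction, the `sigma` value, the function names, and the toy 8×8 grid with two landmarks are all assumptions for illustration; the paper itself uses 68 detected landmarks.

```python
import math


def landmark_attention_map(landmarks, height, width, sigma=1.5):
    """Build an H x W map with values in (0, 1] that peak at each landmark.

    landmarks: list of (row, col) positions on the feature-map grid.
    Each cell takes the maximum over Gaussian bumps centred on the landmarks
    (an assumed construction, not taken from the paper).
    """
    return [
        [
            max(
                math.exp(-((r - lr) ** 2 + (c - lc) ** 2) / (2 * sigma ** 2))
                for lr, lc in landmarks
            )
            for c in range(width)
        ]
        for r in range(height)
    ]


def apply_attention(feature_map, attention):
    """Multiply every channel of a C x H x W feature map by the H x W attention map."""
    return [
        [[v * a for v, a in zip(f_row, a_row)] for f_row, a_row in zip(channel, attention)]
        for channel in feature_map
    ]


# Toy example: two hypothetical landmarks on an 8 x 8 grid, two constant channels.
height, width = 8, 8
landmarks = [(2, 2), (5, 6)]
attention = landmark_attention_map(landmarks, height, width)
features = [[[1.0] * width for _ in range(height)] for _ in range(2)]
weighted = apply_attention(features, attention)
```

Activations at a landmark keep their full value, while activations far from every landmark are scaled toward zero, which matches the stated goal of strengthening key regions and weakening unimportant ones.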
Funding: Supported by the Researchers Supporting Project (No. RSP-2021/395), King Saud University, Riyadh, Saudi Arabia.
Abstract: A deep fusion model is proposed for a facial expression-based human-computer interaction system. Initially, image preprocessing, i.e., the extraction of the facial region from the input image, is performed. Thereafter, more discriminative and distinctive deep learning features are extracted from the facial regions. To prevent overfitting, in-depth features of facial images are extracted and assigned to the proposed convolutional neural network (CNN) models. Various CNN models are then trained. Finally, the outputs of the CNN models are fused to obtain the final decision for the seven basic classes of facial expressions, i.e., fear, disgust, anger, surprise, sadness, happiness, and neutral. For experimental purposes, three benchmark datasets, i.e., SFEW, CK+, and KDEF, are utilized. The performance of the proposed system is compared with some state-of-the-art methods on each dataset. Extensive performance analysis reveals that the proposed system outperforms the competing methods in terms of various performance metrics. Finally, the proposed deep fusion model is used to control a music player using the recognized emotions of the users.