Canonical correlation analysis (CCA) based methods for low-resolution (LR) face recognition involve face images of different resolutions (multi-resolution), i.e., LR and high-resolution (HR). For single-resolution face recognition, researchers have shown that utilizing spatial information improves recognition accuracy, mainly because the pixels of each face are not independent but spatially correlated. For the multi-resolution scenario, however, there is no related work. Therefore, a method named spatial regularization of canonical correlation analysis (SRCCA) is developed for LR face recognition; it improves the performance of CCA through regularization that exploits the spatial information of faces at different resolutions. Furthermore, the impact of the LR and HR spatial regularization terms on LR face recognition is analyzed through experiments.
Sparse representation is an effective data classification algorithm that relies on known training samples to categorise a test sample. It has been widely used in various image classification tasks. Sparseness in sparse representation means that only a few instances selected from all training samples can effectively convey the essential class-specific information of the test sample, which is very important for classification. For deformable images such as human faces, pixels at the same location in different images of the same subject usually have different intensities. Therefore, extracting features from such deformable objects and classifying them correctly is very hard, and variations in lighting, pose and occlusion add further difficulty. To address these problems, a novel image representation and classification algorithm is proposed. First, the algorithm generates virtual samples by a non-linear variation method. This method effectively extracts the low-frequency information of the space-domain features of the original image, which is very useful for representing deformable objects. Combining the original and virtual samples improves the classification performance and robustness of the algorithm. Next, the algorithm calculates the expression coefficients of the original and virtual samples separately using the sparse representation principle and obtains the final score through an efficient score fusion scheme, whose weighting coefficients are set entirely automatically. Finally, the algorithm classifies the samples based on the final scores. The experimental results show that the proposed method classifies better than conventional sparse representation algorithms.
Identifying faces in non-frontal poses presents a significant challenge for face recognition (FR) systems. In this study, we examined the impact of yaw pose variations on these systems and devised a robust method for detecting faces across a wide range of angles, from 0° to ±90°. We first selected the most suitable feature vector size by integrating the Dlib, FaceNet (Inception-v2), and Support Vector Machine (SVM) + K-nearest neighbors (KNN) algorithms. To train and evaluate this feature vector, we used two datasets: the Labeled Faces in the Wild (LFW) benchmark data and the Robust Shape-Based FR System (RSBFRS) real-time data, which contained face images with varying yaw poses. After selecting the best feature vector, we developed a real-time FR system to handle yaw poses. The proposed FaceNet architecture achieved recognition accuracies of 99.7% and 99.8% on the LFW and RSBFRS datasets, respectively, with 128 feature vector dimensions and minimum Euclidean distance thresholds of 0.06 and 0.12. The FaceNet+SVM and FaceNet+KNN classifiers achieved classification accuracies of 99.26% and 99.44%, respectively. The 128-dimensional embedding vector showed the highest recognition rate among all dimensions. These results demonstrate the effectiveness of the proposed approach in enhancing FR accuracy, particularly in real-world scenarios with varying yaw poses.
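The abstract above reports 128-dimensional embeddings compared under a minimum Euclidean distance threshold. As a rough illustration only, the sketch below shows one common way such a threshold can be applied: match a probe to its nearest enrolled embedding and reject it when the distance is too large. The function name `identify`, the toy embeddings and the labels are hypothetical, and the paper's actual matching rule may differ.

```python
import numpy as np

def identify(probe, gallery, labels, threshold):
    """Return the label of the closest gallery embedding, or None if it is too far."""
    # Euclidean distance from the probe to every enrolled embedding.
    dists = np.linalg.norm(gallery - probe, axis=1)
    best = int(np.argmin(dists))
    # Reject the match when even the nearest embedding exceeds the threshold.
    return labels[best] if dists[best] <= threshold else None

# Toy 128-dimensional unit-length embeddings (random values, purely illustrative).
rng = np.random.default_rng(0)
gallery = rng.normal(size=(3, 128))
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)
labels = ["alice", "bob", "carol"]

# A probe that is a slightly perturbed copy of the second enrolled embedding.
probe = gallery[1] + 1e-3 * rng.normal(size=128)
match = identify(probe, gallery, labels, threshold=0.12)
```

A smaller threshold makes the system stricter (fewer false accepts, more rejections), which is presumably why the abstract reports a different value for each dataset.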
Facial emotion recognition (FER) has become a focal point of research due to its widespread applications, ranging from human-computer interaction to affective computing. While traditional FER techniques have relied on handcrafted features and classification models trained on image or video datasets, recent strides in artificial intelligence and deep learning (DL) have ushered in more sophisticated approaches. This research aims to develop a FER system using a Faster Region Convolutional Neural Network (FRCNN) and to design a specialized FRCNN architecture tailored for facial emotion recognition, leveraging its ability to capture spatial hierarchies within localized regions of facial features. The proposed work, which targets higher accuracy and efficiency, comprises two key components: Inception V3-based feature extraction and FRCNN-based emotion categorization. Extensive experimentation on Kaggle datasets validates the effectiveness of the proposed strategy, showing the FRCNN approach's resilience and accuracy in identifying and categorizing facial expressions. The model achieves an accuracy of 98.4%, a precision of 97.2%, and a recall of 96.31%. These results underscore the potential of the FRCNN approach for real-world applications and support the practicality of deep learning models for automating the understanding of facial emotions.
Face recognition (FR) technology has numerous applications in artificial intelligence, including biometrics, security, authentication, law enforcement, and surveillance. Deep learning (DL) models, notably convolutional neural networks (CNNs), have shown promising results in the field of FR. However, CNNs are easily fooled because they do not encode position and orientation correlations between features. Hinton et al. envisioned Capsule Networks as a more robust design capable of retaining pose information and spatial correlations, so as to recognize objects more like the brain does. Lower-level capsules hold 8-dimensional vectors of attributes such as position, hue, and texture, which are routed to higher-level capsules via a routing-by-agreement algorithm. This provides capsule networks with viewpoint invariance, which has previously evaded CNNs. This research presents an FR model based on capsule networks that was tested on the LFW dataset, the COMSATS face dataset, and self-acquired photos at resolutions of 128 × 128, 40 × 40, and 30 × 30 pixels. The trained model outperforms state-of-the-art algorithms, achieving 95.82% test accuracy and performing well on unseen faces that have been blurred or rotated. Additionally, the proposed model outperformed recently published approaches on the COMSATS face dataset, achieving an accuracy of 92.47%. Based on the results of this research as well as previous results, capsule networks perform better than deeper CNNs on unseen altered data because of their special equivariance properties.
The coronavirus disease (COVID-19) is a once-in-a-lifetime calamity that has resulted in thousands of deaths and raised security concerns. People wear face masks regularly to protect themselves and to help reduce coronavirus transmission. During the ongoing outbreak, one of the major priorities for researchers is to find effective solutions. Because important parts of the face are obscured, face identification and verification become exceedingly difficult. The proposed method applies transfer learning with MobileNet V2, combining deep feature extraction with a deep learning model, to address masked face identification. In the first stage, a face mask detector is applied to detect the mask. The proposed approach is then applied to the Canadian Institute for Advanced Research 10 (CIFAR10), Modified National Institute of Standards and Technology Database (MNIST), Real World Masked Face Recognition Database (RMFRD), and Simulated Masked Face Recognition Database (SMFRD) datasets. The proposed model achieves a recognition accuracy of 99.82% on the proposed dataset. The article also employs four pre-trained models, VGG16, VGG19, ResNet50 and ResNet101, to extract deep face features: on the authors' own dataset, VGG16 achieves 99.30% accuracy, VGG19 achieves 99.54%, ResNet50 achieves 78.70% and ResNet101 achieves 98.64%. The comparative analysis shows that the proposed model outperforms all four existing models. The main contribution of this study is to distinguish people with and without face masks, in order to help slow the spread of the coronavirus, and to identify persons wearing face masks.
Convolutional neural networks continually evolve to enhance accuracy on various problems, leading to an increase in computational cost and model size. This paper introduces a novel approach for pruning face recognition models based on convolutional neural networks. The proposed method identifies and removes inefficient filters based on the information volume in feature maps. In each layer, some feature maps lack useful information, and some feature maps are correlated with one another. Filters associated with these two types of feature maps impose additional computational cost on the model; eliminating them reduces both computational cost and model size. The approach combines correlation analysis with the summation of the matrix elements of each feature map to detect and eliminate inefficient filters. The number of filters removed varies from layer to layer, and the removal process in one layer is independent of the adjacent layers. The method was applied to two face recognition models with VGG16 and ResNet50V2 backbone architectures. The convolutional layers of both backbones were initialized with weights pre-trained on ImageNet. The CASIA-WebFace dataset was used for training, and the Labeled Faces in the Wild (LFW) dataset was used for benchmarking. In the VGG16-based face recognition model, a 0.74% accuracy improvement was achieved while reducing the number of convolution parameters by 26.85% and the floating-point operations (FLOPs) by 47.96%. For the face recognition model based on the ResNet50V2 architecture, the ArcFace method was implemented. Removing inactive filters in this model led to a slight decrease in accuracy of 0.11%, but it improved training speed, reduced convolution parameters by 59.38%, and decreased FLOPs by 57.29%.
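To make the two pruning criteria in the abstract above concrete, here is a minimal NumPy sketch for a single layer: it drops feature maps whose summed activation is close to zero, then drops maps that are highly correlated with a map already kept. The function name `select_filters`, the tolerances `energy_tol` and `corr_tol`, and the use of the absolute sum are assumptions made for illustration; the paper's exact criteria and thresholds are not given in the abstract.

```python
import numpy as np

def select_filters(feature_maps, energy_tol=1e-6, corr_tol=0.95):
    """Return the indices of the filters to keep for one layer.

    feature_maps: array of shape (C, H, W), one map per filter.
    """
    c = feature_maps.shape[0]
    flat = feature_maps.reshape(c, -1)

    # Criterion 1: drop maps whose summed absolute activation is (near) zero.
    energy = np.abs(flat).sum(axis=1)
    candidates = [i for i in range(c) if energy[i] > energy_tol]

    # Criterion 2: drop a map that is highly correlated with one already kept.
    kept = []
    for i in candidates:
        redundant = False
        for j in kept:
            r = np.corrcoef(flat[i], flat[j])[0, 1]
            if abs(r) >= corr_tol:
                redundant = True
                break
        if not redundant:
            kept.append(i)
    return kept
```

In practice the feature maps would be averaged over a batch of inputs before applying such a rule, and the kept indices would be used to slice the layer's weight tensor; both steps are omitted here.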
A framework for real-time face tracking and recognition is presented, which integrates skin-color-based tracking with hybrid PCA/BPNN (principal component analysis/back-propagation neural network) recognition techniques. The algorithm is able to track a human face against a complex background and also works well when temporary occlusion occurs. A very high recognition rate is obtained by averaging a number of samples over a long image sequence. The proposed approach has been tested successfully in many experiments and can operate at 20 frames/s on an 800 MHz PC.
In principal component analysis (PCA) algorithms for face recognition, a modified PCA (MPCA) algorithm is proposed to reduce the influence on the extracted features of the eigenvectors that relate to illumination changes. The method is based on the idea of reducing the influence of the eigenvectors associated with the large eigenvalues by normalizing each feature vector element by its corresponding standard deviation. The Yale face database and Yale face database B are used to verify the method. The simulation results show that, for frontal faces and even under limited variation in facial pose, the proposed method performs better than the conventional PCA and linear discriminant analysis (LDA) approaches, while its computational cost remains the same as that of PCA and much lower than that of LDA.
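The normalization step described above can be sketched as follows: project onto the leading principal axes, then divide each projected coefficient by the standard deviation of the training data along that axis. This is one reading of the abstract, not the authors' code; the function names `fit_mpca` and `transform_mpca` are hypothetical.

```python
import numpy as np

def fit_mpca(X, k):
    """X: (n_samples, n_features). Returns the mean, top-k principal axes and their std devs."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: the rows of vt are the principal axes.
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    # Standard deviation of the training data along each of the top-k axes.
    std = s[:k] / np.sqrt(X.shape[0] - 1)
    return mean, vt[:k], std

def transform_mpca(X, mean, components, std):
    # Project onto the principal axes, then divide each element by its std dev.
    return (X - mean) @ components.T / std
```

After this scaling every retained component has unit variance on the training set, so components with large eigenvalues (which the abstract associates with illumination changes) no longer dominate distance computations.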
To extract the features of face images for face recognition, a new method that fuses global and local features is presented. The global features are extracted using principal component analysis (PCA). An active appearance model (AAM) locates 58 facial fiducial points, 17 of which are characterized as local features using the Gabor wavelet transform (GWT). A normalized global match degree and a normalized local match degree are obtained from the global and local features, respectively, of the probe image and each gallery image. After the two normalized match degrees are fused, the recognition result is the class of the gallery image with the largest fused match degree. The method is evaluated by its recognition rates on two face image databases (AR and SJTU-IPPR). The experimental results show that the method outperforms PCA and elastic bunch graph matching (EBGM). Moreover, it is effective and, to some degree, robust to expression, illumination and pose variation.
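The fusion step above can be illustrated with a short sketch. The abstract does not state how the match degrees are normalized or combined, so min-max normalization and a weighted sum with weight `w` are assumptions here, and the function names are hypothetical.

```python
import numpy as np

def min_max(scores):
    """Rescale scores to [0, 1]; returns zeros when all scores are equal."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    return (scores - scores.min()) / span if span > 0 else np.zeros_like(scores)

def fuse_and_classify(global_scores, local_scores, gallery_labels, w=0.5):
    """Fuse normalized global and local match degrees and return the best gallery label."""
    fused = w * min_max(global_scores) + (1 - w) * min_max(local_scores)
    # The recognized class is that of the gallery image with the largest fused score.
    return gallery_labels[int(np.argmax(fused))]
```

Normalizing before fusing matters because PCA-based and Gabor-based similarities generally live on different scales; without it, one of the two would dominate the sum.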
Matrix principal component analysis (MatPCA), an effective feature extraction method, can deal with both matrix patterns and vector patterns. However, like PCA, MatPCA does not use the class information of samples. As a result, the extracted features cannot provide enough useful information for distinguishing patterns from one another, which degrades classification performance. To make full use of the class information of samples, a novel method called fuzzy within-class MatPCA (F-WMatPCA) is proposed. F-WMatPCA uses the fuzzy K-nearest neighbor method (FKNN) to fuzzify the class membership degrees of a training sample and then performs fuzzy MatPCA within the patterns that share the same class label. Because more class information is used in feature extraction, F-WMatPCA can improve classification performance. Experimental results on face databases and some benchmark datasets show that F-WMatPCA is more effective than MatPCA. The experimental analysis on face image databases indicates that F-WMatPCA improves recognition accuracy and is more stable and robust in classification than the existing fuzzy-based method F-Fisherfaces.
Bagging is not well suited to stable classifiers such as nearest neighbor classifiers because of the lack of diversity, and it is also difficult to apply directly to face recognition because of the small sample size (SSS) property of the problem. To solve both problems, local Bagging (L-Bagging) is proposed, which makes Bagging applicable to nearest neighbor classifiers and to face recognition simultaneously. The major difference between L-Bagging and Bagging is that L-Bagging performs bootstrap sampling on each local region partitioned from the original face image rather than on the whole face image. Since the dimensionality of a local region is usually far smaller than the number of samples and the component classifiers are constructed in different local regions, L-Bagging deals with the SSS problem and generates more diverse component classifiers. Experimental results on four standard face image databases (AR, Yale, ORL and Yale B) indicate that the proposed L-Bagging method is effective and robust to illumination, occlusion and slight pose variation.
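The idea of bootstrapping inside local regions can be sketched as below, assuming a regular grid partition, 1-nearest-neighbor component classifiers and majority voting. The grid size, the number of bootstrap rounds `n_bags` and the function name `l_bagging_predict` are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter

import numpy as np

def l_bagging_predict(train, labels, probe, grid=(2, 2), n_bags=5, seed=0):
    """Classify `probe` (h, w) from `train` (n, h, w) by voting over local regions."""
    rng = np.random.default_rng(seed)
    n, h, w = train.shape
    rh, rw = h // grid[0], w // grid[1]
    votes = []
    for r in range(grid[0]):
        for c in range(grid[1]):
            rows = slice(r * rh, (r + 1) * rh)
            cols = slice(c * rw, (c + 1) * rw)
            # Low-dimensional view of this local region for all training faces.
            region = train[:, rows, cols].reshape(n, -1)
            query = probe[rows, cols].ravel()
            for _ in range(n_bags):
                # Bootstrap: draw n training samples with replacement.
                idx = rng.integers(0, n, size=n)
                dists = np.linalg.norm(region[idx] - query, axis=1)
                # 1-nearest-neighbor vote of this component classifier.
                votes.append(labels[idx[int(np.argmin(dists))]])
    # Majority vote over all regions and bootstrap rounds.
    return Counter(votes).most_common(1)[0][0]
```

Each component classifier sees only a small patch, so its input dimensionality is low relative to the number of samples, and classifiers built on different patches disagree more often than bootstrap replicas of a whole-image nearest neighbor classifier would.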
To improve the classification performance of the kernel minimum squared error (KMSE) method, an enhanced KMSE algorithm (EKMSE) is proposed. It redefines the regular objective function by introducing a novel class label definition, so that the relative class label matrix can be adaptively adjusted to the kernel matrix. Compared with common methods, the new objective function enlarges the distance between different classes and therefore yields better recognition rates. In addition, an iterative parameter searching technique is adopted to improve computational efficiency. Extensive experiments on the FERET and GT face databases illustrate the feasibility and efficiency of the proposed EKMSE. It outperforms the original MSE, KMSE, some improved KMSE methods, and even sparse representation-based techniques for face recognition such as collaborative representation classification (CRC).
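Face recognition technology is widely used in smart-infrastructure applications such as attendance management and mobile payment. With face masks now routinely interfering with recognition, traditional face recognition algorithms can no longer meet practical requirements. This paper therefore designs a face recognition system based on the SSD deep learning model and the FaceNet model. First, to address the problem that the small proportion of Asian faces in existing datasets leads to insufficiently distinct variation in intra-class distance, Asian face data are added to the public CASIA-WebFace dataset. Second, to counter the interference of different mask styles with feature extraction, the SSD face detection model and the DLIB facial landmark detection model are used to extract facial landmarks, and the spatial relationship between the landmarks and the mask is used to randomly generate additional masked faces of different styles, forming a mixed dataset. Finally, the model is trained on the mixed dataset, and the trained model is ported to the face recognition system to verify detection speed and recognition accuracy. Experimental results show that the system recognizes faces in real time at more than 20 fps, and that the face recognition model reaches an accuracy of 97.1% on the constructed mixed dataset and 99.7% on a randomly sampled subset of the LFW dataset. The system therefore meets practical requirements and improves, to some extent, the robustness and accuracy of face recognition.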
Face recognition provides a natural visual interface for human-computer interaction (HCI) applications. The process of face recognition, however, is hindered by variations in the appearance of face images caused by changes in lighting, expression, viewpoint and aging, and by occlusion. Although various algorithms have been presented, face recognition is still a very challenging topic. A novel approach to real-time face recognition for HCI is proposed in this paper. In view of the limits of popular approaches to foreground segmentation, background subtraction based on a wavelet multi-scale transform is developed to extract foreground objects. The optimal threshold is determined automatically, without any complex supervised training or manual experimental calibration. A robust real-time face recognition algorithm is presented, which combines non-iterative projection matrices with kernel Fisher discriminant analysis (KFDA) to overcome some of the difficulties of real-world face recognition. Experimental comparison with other algorithms demonstrates the superior performance of the proposed algorithm, which can also be applied to video image sequences in natural HCI.
With the progress of the times and the development of technology, the rise of online social media has brought explosive growth in image data. As one of the main means of everyday communication, images are widely used as an information carrier because of their rich content and intuitiveness. Image recognition is among the earliest applications of convolutional neural networks: a series of operations such as image feature extraction, convolution and recognition is used to identify and analyze different images. The rapid development of artificial intelligence has made machine learning increasingly important in this research field; using algorithms to learn from data and predict outcomes has become a key to opening the door of artificial intelligence. In machine vision, image recognition is the foundation, but how to associate the low-level information in an image with high-level image semantics is the key problem of image recognition. Earlier researchers have provided many models and algorithms, which have laid a solid foundation for the development of artificial intelligence and image recognition. The multi-level information fusion model based on VGG16 is an improvement on the fully connected neural network. Unlike a fully connected network, a convolutional neural network does not connect every neuron in each layer, but connects only some nodes. Although this reduces computation time, the convolutional neural network loses some useful feature information during propagation and computation. This paper therefore improves the model into a multi-level information fusion convolution scheme that recovers the discarded feature information and so improves the image recognition rate. VGG divides the network into five groups (mimicking the five layers of AlexNet), yet it uses 3×3 filters and combines them into convolution sequences. The deeper the DCNN, the larger the number of channels. The recognition rate of the model was verified on the ORL Face Database, the BioID Face Database and the CASIA Face Image Database.
Support vector machine (SVM), as a novel approach in pattern recognition, has been applied successfully to face detection and face recognition. In this paper, a face recognition approach based on the SVM classifier combined with the nearest neighbor classifier (NNC) is proposed. Principal component analysis (PCA) is used to reduce the dimension and extract features. Then a one-against-all strategy is used to train the SVM classifiers. At the testing stage, we propose an al-
Dimensionality reduction methods play an important role in face recognition. Principal component analysis (PCA) and two-dimensional principal component analysis (2DPCA) are two important methods in this field. Recent research suggests that 2DPCA is superior to PCA. To test whether this conclusion always holds, a comprehensive comparison of PCA and 2DPCA was carried out. A novel concept, called column-image difference (CID), was proposed to analyze the difference between PCA and 2DPCA in theory. It is found that 2DPCA outperforms PCA only under some restrictive conditions. After the theoretical analysis, experiments were conducted on four well-known face image databases. The experimental results confirm the validity of the theoretical claim.
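For readers unfamiliar with the structural difference between the two methods compared above, the following NumPy sketch contrasts them using their standard textbook formulations: PCA flattens each image into a vector, whereas 2DPCA builds an image covariance matrix directly from the 2D images and projects each image by a right-multiplication. The function names are hypothetical, and the column-image difference (CID) analysis from the paper is not reproduced here.

```python
import numpy as np

def fit_pca(images, d):
    """images: (n, h, w). Returns an (h*w, d) projection matrix for flattened images."""
    flat = images.reshape(images.shape[0], -1)
    flat = flat - flat.mean(axis=0)
    # Right singular vectors of the centered data are the principal axes.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return vt[:d].T

def fit_2dpca(images, d):
    """images: (n, h, w). Returns a (w, d) projection matrix applied as A @ X."""
    centered = images - images.mean(axis=0)
    # Image covariance matrix G = mean_i (A_i - mean)^T (A_i - mean), shape (w, w).
    G = np.einsum("nhi,nhj->ij", centered, centered) / images.shape[0]
    # eigh returns eigenvalues in ascending order, so reverse to get the largest first.
    _, vecs = np.linalg.eigh(G)
    return vecs[:, ::-1][:, :d]
```

With `X = fit_2dpca(images, d)`, each image `A` of shape (h, w) is reduced to the (h, d) feature matrix `A @ X`, while PCA reduces the flattened image to a length-d vector. The 2DPCA eigenproblem is only w × w, which is one reason it is cheap to compute.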
Although real-world experience shows that preparing one image per person is more convenient, most appearance-based face recognition methods degrade or fail to work when there is only a single sample per person (SSPP). In this work, we introduce a novel supervised learning method called supervised locality preserving multimanifold (SLPMM) for face recognition with SSPP. In SLPMM, two graphs, a within-manifold graph and a between-manifold graph, are constructed to represent the information inside every manifold and the information among different manifolds, respectively. By adopting the locality preserving projection (LPP) concept, SLPMM simultaneously maximizes the between-manifold scatter and minimizes the within-manifold scatter, which leads to a discriminant space. Experimental results on two widely used face databases, FERET and AR, are presented to demonstrate the efficacy of the proposed approach.
Funding: Supported by the National Natural Science Foundation of China (60802069, 61273270), the Fundamental Research Funds for the Central Universities of China, the Natural Science Foundation of Guangdong Province (2014A030313173), and the Science and Technology Program of Guangzhou (2014Y2-00165, 2014J4100114, 2014J4100095).
Funding: Supported by the National Natural Science Foundation of China (61170151, 61070133, 60903130), the Natural Science Research Project of Higher Education of Jiangsu Province (12KJB520018), and the Research Foundation of Nanjing University of Aeronautics and Astronautics (NP2011030).
Funding: Funding for the project, excluding research publication, from the Board of Research in Nuclear Sciences (BRNS) under Grant Number 59/14/05/2019/BRNS.
Funding: Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia, Researchers Supporting Project Number PNURSP2024R234.
Abstract: Face recognition (FR) technology has numerous applications in artificial intelligence, including biometrics, security, authentication, law enforcement, and surveillance. Deep learning (DL) models, notably convolutional neural networks (CNNs), have shown promising results in the field of FR. However, CNNs are easily fooled since they do not encode position and orientation correlations between features. Hinton et al. envisioned Capsule Networks as a more robust design capable of retaining pose information and spatial correlations to recognize objects more like the brain does. Lower-level capsules hold 8-dimensional vectors of attributes like position, hue, texture, and so on, which are routed to higher-level capsules via a new routing-by-agreement algorithm. This provides capsule networks with viewpoint invariance, which has previously evaded CNNs. This research presents a FR model based on capsule networks that was tested using the LFW dataset, the COMSATS face dataset, and photos the authors acquired with cameras at 128 × 128 pixels, 40 × 40 pixels, and 30 × 30 pixels. The trained model outperforms state-of-the-art algorithms, achieving 95.82% test accuracy and performing well on unseen faces that have been blurred or rotated. Additionally, the suggested model outperformed recently released approaches on the COMSATS face dataset, achieving a high accuracy of 92.47%. Based on the results of this research as well as previous results, capsule networks perform better than deeper CNNs on unobserved altered data because of their special equivariance properties.
Abstract: Coronavirus disease (COVID-19) is a once-in-a-lifetime calamity that has resulted in thousands of deaths and security concerns. People use face masks on a regular basis to protect themselves and to help reduce coronavirus transmission. During the ongoing coronavirus outbreak, one of the major priorities for researchers is to discover effective solutions. Because important parts of the face are obscured, face identification and verification become exceedingly difficult. The suggested method applies transfer learning with MobileNet V2, using deep feature extraction and a deep learning model, to address masked-face identification. In the first stage, a face mask detector is applied to identify the face mask. Then, the proposed approach is applied to datasets from the Canadian Institute for Advanced Research 10 (CIFAR10), the Modified National Institute of Standards and Technology Database (MNIST), the Real World Masked Face Recognition Database (RMFRD), and the Simulated Masked Face Recognition Database (SMFRD). The proposed model achieves a recognition accuracy of 99.82% on the proposed dataset. This article also employs four pre-trained models: VGG16, VGG19, ResNet50, and ResNet101. For deep facial feature extraction on the authors' own dataset, VGG16 achieves 99.30% accuracy, VGG19 achieves 99.54%, ResNet50 achieves 78.70%, and ResNet101 achieves 98.64%. The comparative analysis shows that the proposed model outperforms all four existing models. The fundamental contribution of this study is to monitor whether people are wearing face masks, in order to slow the spread of coronavirus, and to recognize persons wearing face masks.
Abstract: Convolutional neural networks continually evolve to enhance accuracy in addressing various problems, leading to an increase in computational cost and model size. This paper introduces a novel approach for pruning face recognition models based on convolutional neural networks. The proposed method identifies and removes inefficient filters based on the information volume in feature maps. In each layer, some feature maps lack useful information, and there exists a correlation between certain feature maps. Filters associated with these two types of feature maps impose additional computational costs on the model. By eliminating filters related to these categories of feature maps, both computational cost and model size can be reduced. The approach employs a combination of correlation analysis and the summation of matrix elements within each feature map to detect and eliminate inefficient filters. The method was applied to two face recognition models utilizing the VGG16 and ResNet50V2 backbone architectures. In the proposed approach, the number of filters removed in each layer varies, and the removal process is independent of the adjacent layers. The convolutional layers of both backbone models were initialized with pre-trained weights from ImageNet. For training, the CASIA-WebFace dataset was utilized, and the Labeled Faces in the Wild (LFW) dataset was employed for benchmarking purposes. In the VGG16-based face recognition model, a 0.74% accuracy improvement was achieved while reducing the number of convolution parameters by 26.85% and decreasing floating-point operations (FLOPs) by 47.96%. For the face recognition model based on the ResNet50V2 architecture, the ArcFace method was implemented. The removal of inactive filters in this model led to a slight decrease in accuracy of 0.11%. However, it resulted in enhanced training speed, a reduction of 59.38% in convolution parameters, and a 57.29% decrease in FLOPs.
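The two pruning criteria named in the abstract above (element sums of feature maps and correlation between feature maps) can be sketched as follows. This is a rough illustration only: the function `select_filters`, the tolerance `sum_tol`, the threshold `corr_thresh`, and the greedy keep-first rule are all assumptions, since the abstract does not state how the paper sets or combines them.

```python
import numpy as np

def select_filters(feature_maps, sum_tol=1e-6, corr_thresh=0.95):
    """Pick filters to keep for one layer.

    feature_maps: array of shape (C, H, W), one activation map per filter.
    Returns the indices of the filters that survive both criteria.
    """
    c = feature_maps.shape[0]
    flat = feature_maps.reshape(c, -1)

    # Criterion 1: drop maps whose summed absolute activation is near zero
    active = [i for i in range(c) if np.abs(flat[i]).sum() > sum_tol]

    # Criterion 2: drop maps highly correlated with one already kept
    keep = []
    for i in active:
        redundant = False
        for j in keep:
            r = np.corrcoef(flat[i], flat[j])[0, 1]
            if abs(r) > corr_thresh:
                redundant = True
                break
        if not redundant:
            keep.append(i)
    return keep

rng = np.random.default_rng(0)
a = rng.normal(size=(8, 8))
b = rng.normal(size=(8, 8))
# Filter 1 is inactive; filter 2 is an affine copy of filter 0
maps = np.stack([a, np.zeros((8, 8)), 2.0 * a + 1.0, b])
kept = select_filters(maps)
```

In practice the maps would be averaged over a batch of inputs before the criteria are applied; a single map per filter is used here to keep the sketch short.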
Abstract: A framework for real-time face tracking and recognition is presented, which integrates skin-color-based tracking and PCA/BPNN (principal component analysis/back propagation neural network) hybrid recognition techniques. The algorithm is able to track the human face against a complex background and also works well when temporary occlusion occurs. We also obtain a very high recognition rate by averaging a number of samples over a long image sequence. The proposed approach has been successfully tested in many experiments, and can operate at 20 frames/s on an 800 MHz PC.
Abstract: In principal component analysis (PCA) algorithms for face recognition, a modified PCA (MPCA) algorithm is proposed to reduce the influence on the extracted features of eigenvectors that relate to changes in illumination. The method is based on the idea of reducing the influence of the eigenvectors associated with the large eigenvalues by normalizing each feature vector element by its corresponding standard deviation. The Yale face database and Yale face database B are used to verify the method. The simulation results show that, for frontal faces and even under limited variation in facial pose, the proposed method performs better than the conventional PCA and linear discriminant analysis (LDA) approaches, while its computational cost remains the same as that of PCA and much lower than that of LDA.
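The normalization step described above can be illustrated with a short sketch. It assumes that "corresponding standard deviation" means the standard deviation of each projected training feature (the square root of the matching eigenvalue); the helper names `fit_mpca` and `transform_mpca` are hypothetical and the paper's exact formulation may differ.

```python
import numpy as np

def fit_mpca(X, k):
    """Fit PCA on rows of X and record the std of each projected feature."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: rows of Vt are the principal directions
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]
    # Sample standard deviation of the training data along each direction
    std = s[:k] / np.sqrt(X.shape[0] - 1)
    return mean, components, std

def transform_mpca(X, mean, components, std):
    """Project onto the principal directions, then divide by the std."""
    return (X - mean) @ components.T / std

rng = np.random.default_rng(0)
# Synthetic data whose columns have very different variances
X = rng.normal(size=(50, 20)) * np.linspace(1.0, 10.0, 20)
mean, components, std = fit_mpca(X, k=5)
Z = transform_mpca(X, mean, components, std)
```

After this scaling every retained feature has unit variance on the training set, so directions with large eigenvalues no longer dominate distance computations.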
Abstract: To extract the features of face images for face recognition, a new method that fuses global features and local features is presented. The global features are extracted using principal component analysis (PCA). An active appearance model (AAM) locates 58 facial fiducial points, of which 17 points are characterized as local features using the Gabor wavelet transform (GWT). A normalized global match degree (local match degree) is obtained from the global features (local features) of the probe image and each gallery image. After the normalized global match degree and the normalized local match degree are fused, the recognition result is the class containing the gallery image with the largest fused match degree. The method is evaluated by its recognition rates on two face image databases (AR and SJTU-IPPR). The experimental results show that the method outperforms PCA and elastic bunch graph matching (EBGM). Moreover, it is effective and, to some degree, robust to expression, illumination, and pose variation.
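The fusion step described above can be sketched as follows. The abstract does not say how the match degrees are normalized or combined, so min-max normalization and an equally weighted sum are assumptions made purely for illustration, as are the names `min_max` and `fuse_and_classify`.

```python
import numpy as np

def min_max(scores):
    """Rescale scores to [0, 1] (assumed normalization scheme)."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span

def fuse_and_classify(global_scores, local_scores, gallery_labels, weight=0.5):
    """Fuse two match-degree vectors and return the best gallery label."""
    fused = weight * min_max(global_scores) + (1 - weight) * min_max(local_scores)
    best = int(np.argmax(fused))
    return gallery_labels[best], fused

# One probe compared against three gallery images (made-up match degrees)
global_scores = [0.2, 0.9, 0.5]   # e.g. from PCA features
local_scores = [10.0, 30.0, 40.0]  # e.g. from Gabor features at fiducial points
label, fused = fuse_and_classify(global_scores, local_scores, ["A", "B", "C"])
```

Normalizing before fusion matters because the two match degrees can live on very different scales, as the made-up numbers here do.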
Abstract: Matrix principal component analysis (MatPCA), as an effective feature extraction method, can deal with both the matrix pattern and the vector pattern. However, like PCA, MatPCA does not use the class information of samples. As a result, the extracted features cannot provide enough useful information for distinguishing patterns from one another, which degrades classification performance. To fully use the class information of samples, a novel method called the fuzzy within-class MatPCA (F-WMatPCA) is proposed. F-WMatPCA utilizes the fuzzy K-nearest neighbor method (FKNN) to fuzzify the class membership degrees of a training sample and then performs fuzzy MatPCA within the patterns having the same class label. Because more class information is used in feature extraction, F-WMatPCA can improve classification performance. Experimental results on face databases and some benchmark datasets show that F-WMatPCA is effective and competitive with MatPCA. The experimental analysis on face image databases indicates that F-WMatPCA improves recognition accuracy and is more stable and robust in classification than the existing fuzzy-based method, F-Fisherfaces.
Abstract: Bagging is not well suited to stable classifiers such as nearest neighbor classifiers because of the lack of diversity, and it is also difficult to apply directly to face recognition because of the small sample size (SSS) property of face recognition. To solve these two problems, local Bagging (L-Bagging) is proposed so that Bagging can be applied to both nearest neighbor classifiers and face recognition. The major difference between L-Bagging and Bagging is that L-Bagging performs bootstrap sampling on each local region partitioned from the original face image rather than on the whole face image. Since the dimensionality of a local region is usually far less than the number of samples and the component classifiers are constructed in different local regions, L-Bagging deals with the SSS problem and generates more diverse component classifiers. Experimental results on four standard face image databases (AR, Yale, ORL, and Yale B) indicate that the proposed L-Bagging method is effective and robust to illumination, occlusion, and slight pose variation.
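The idea of bootstrapping within each local region can be sketched as below. The grid partition, the number of bootstrap rounds, and the majority vote used to combine component classifiers are assumptions for illustration; the abstract does not specify them, and `split_regions` and `l_bagging_predict` are hypothetical names.

```python
import numpy as np
from collections import Counter

def split_regions(images, rows, cols):
    """Split (N, H, W) images into rows*cols flattened local regions."""
    n, h, w = images.shape
    rh, cw = h // rows, w // cols
    return [images[:, r * rh:(r + 1) * rh, c * cw:(c + 1) * cw].reshape(n, -1)
            for r in range(rows) for c in range(cols)]

def l_bagging_predict(train, labels, test, rows=2, cols=2, n_boot=5, seed=0):
    """Nearest-neighbor classifiers on bootstrap samples of each region."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_regions = split_regions(train, rows, cols)
    test_regions = split_regions(test, rows, cols)
    votes = [[] for _ in range(len(test))]
    for tr, te in zip(train_regions, test_regions):
        for _ in range(n_boot):
            # Bootstrap: sample training indices with replacement
            idx = rng.integers(0, len(tr), size=len(tr))
            # Distances between every test region and every sampled region
            d = np.linalg.norm(te[:, None, :] - tr[idx][None, :, :], axis=2)
            nearest = labels[idx][np.argmin(d, axis=1)]
            for i, lab in enumerate(nearest):
                votes[i].append(lab)
    # Majority vote over all component classifiers (assumed combination rule)
    return [Counter(v).most_common(1)[0][0] for v in votes]

rng = np.random.default_rng(1)
# Two synthetic "subjects": dark images (class 0) and bright images (class 1)
train = np.concatenate([rng.normal(0.0, 0.1, size=(10, 8, 8)),
                        rng.normal(1.0, 0.1, size=(10, 8, 8))])
labels = [0] * 10 + [1] * 10
test = np.stack([np.zeros((8, 8)), np.ones((8, 8))])
preds = l_bagging_predict(train, labels, test)
```

Each component classifier sees only a 16-dimensional region here, which is the sense in which working locally eases the small-sample-size problem.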
Funding: The Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD); the National Natural Science Foundation of China (No. 61572258, 61103141, 51405241); the Natural Science Foundation of Jiangsu Province (No. BK20151530); Overseas Training Programs for Outstanding Young Scholars of Universities in Jiangsu Province.
Abstract: To improve the classification performance of the kernel minimum squared error (KMSE), an enhanced KMSE algorithm (EKMSE) is proposed. It redefines the regular objective function by introducing a novel class label definition, and the relative class label matrix can be adaptively adjusted to the kernel matrix. Compared with the common methods, the new objective function can enlarge the distance between different classes, which therefore yields better recognition rates. In addition, an iterative parameter searching technique is adopted to improve the computational efficiency. Extensive experiments on the FERET and GT face databases illustrate the feasibility and efficiency of the proposed EKMSE. It outperforms the original MSE, KMSE, some KMSE improvement methods, and even sparse representation-based techniques in face recognition, such as collaborative representation classification (CRC).
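Abstract: Face recognition technology is widely used in smart-infrastructure applications such as attendance management and mobile payment. With mask wearing now routine, traditional face recognition algorithms can no longer meet practical requirements. This paper therefore designs a face recognition system based on the SSD deep learning model and the FaceNet model. First, to address the insufficiently distinct variation in intra-class distance caused by the small proportion of Asian faces in existing datasets, Asian face data are added to the public CASIA-WebFace dataset. Second, to reduce the interference of different mask styles with feature extraction, the SSD face detection model and the DLIB facial landmark detection model are used to extract facial landmarks, and the spatial relationship between the landmarks and the mask is used to randomly generate additional faces wearing different masks, forming a mixed dataset. Finally, the model is trained on the mixed dataset and the trained model is ported to the face recognition system, where detection speed and recognition accuracy are verified. Experimental results show that the system recognizes faces in real time at more than 20 fps, and that the face recognition model reaches an accuracy of 97.1% on the constructed mixed dataset and 99.7% on a randomly sampled subset of the LFW dataset. The system can therefore meet practical requirements and improves the robustness and accuracy of face recognition to some extent.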
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60872117) and the Leading Academic Discipline Project of Shanghai Municipal Education Commission (Grant No. J50104).
Abstract: Face recognition provides a natural visual interface for human-computer interaction (HCI) applications. The process of face recognition, however, is hindered by variations in the appearance of face images caused by changes in lighting, expression, viewpoint, and aging, and by the introduction of occlusion. Although various algorithms have been presented for face recognition, it is still a very challenging topic. A novel approach to real-time face recognition for HCI is proposed in this paper. In view of the limits of the popular approaches to foreground segmentation, wavelet multi-scale transform based background subtraction is developed to extract foreground objects. The optimal threshold is determined automatically, without any complex supervised training or manual experimental calibration. A robust real-time face recognition algorithm is presented, which combines non-iterative projection matrices with kernel Fisher discriminant analysis (KFDA) to overcome some difficulties in real-world face recognition. The superior performance of the proposed algorithm is demonstrated by experimental comparison with other algorithms. The proposed algorithm can also be applied to the video image sequences of natural HCI.
Abstract: With the continuous progress of the times and the development of technology, the rise of social media has brought "explosive" growth in image data. As one of the main means of people's daily communication, the image is widely used as a carrier of communication because of its rich content, intuitiveness, and other advantages. Image recognition based on convolutional neural networks was an early application in this field. A series of operations such as image feature extraction, recognition, and convolution are used to identify and analyze different images. The rapid development of artificial intelligence makes machine learning more and more important in this research field. Using algorithms to learn from each piece of data and predict the outcome has become an important key to opening the door of artificial intelligence. In machine vision, image recognition is the foundation, but how to associate the low-level information in an image with high-level image semantics is the key problem of image recognition. Predecessors have provided many model algorithms, which have laid a solid foundation for the development of artificial intelligence and image recognition. The multi-level information fusion model based on the VGG16 model is an improvement on the fully connected neural network. Unlike a fully connected network, a convolutional neural network does not fully connect the neurons in each layer, but connects only some nodes. Although this reduces computation time, the convolutional neural network model loses some useful feature information during propagation and calculation. This paper therefore improves the model into a multi-level information fusion convolution method that recovers the discarded feature information, so as to improve the image recognition rate. VGG divides the network into five groups (mimicking the five layers of AlexNet), yet it uses 3*3 filters and combines them as a convolution sequence. The deeper the DCNN, the larger the number of channels. The recognition rate of the model was verified on the ORL Face Database, the BioID Face Database, and the CASIA Face Image Database.
Funding: This project was supported by the Shanghai Shu Guang Project.
Abstract: The support vector machine (SVM), as a novel approach in pattern recognition, has demonstrated success in face detection and face recognition. In this paper, a face recognition approach based on the SVM classifier combined with the nearest neighbor classifier (NNC) is proposed. Principal component analysis (PCA) is used to reduce the dimension and extract features. Then a one-against-all strategy is used to train the SVM classifiers. At the testing stage, we propose an al-
Funding: Projects (50275150, 61173052) supported by the National Natural Science Foundation of China.
Abstract: Dimensionality reduction methods play an important role in face recognition. Principal component analysis (PCA) and two-dimensional principal component analysis (2DPCA) are two important methods in this field. Recent research suggests that 2DPCA is superior to PCA. To test whether this conclusion always holds, a comprehensive comparison between the PCA and 2DPCA methods was carried out. A novel concept, called column-image difference (CID), was proposed to analyze the difference between PCA and 2DPCA in theory. It is found that 2DPCA outperforms PCA only under some restrictive conditions. After the theoretical analysis, experiments were conducted on four well-known face image databases. The experimental results confirm the validity of the theoretical claim.
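For readers unfamiliar with the structural difference between the two methods compared above, the following sketch contrasts them on synthetic data. It shows the standard formulations only (PCA on vectorized images, 2DPCA on the image covariance matrix) and does not implement the column-image difference analysis; the function names and the random data are illustrative assumptions.

```python
import numpy as np

def fit_2dpca(images, d):
    """2DPCA: top-d eigenvectors of the image covariance matrix."""
    centered = images - images.mean(axis=0)
    # G = (1/N) * sum_n (A_n - mean)^T (A_n - mean), shape (W, W)
    G = np.einsum('nhw,nhv->wv', centered, centered) / len(images)
    _, vecs = np.linalg.eigh(G)    # eigenvalues returned in ascending order
    return vecs[:, ::-1][:, :d]    # keep the d largest

def fit_pca(images, d):
    """PCA: top-d principal directions of the vectorized images."""
    X = images.reshape(len(images), -1)
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T

rng = np.random.default_rng(0)
images = rng.normal(size=(20, 6, 5))   # 20 synthetic images of size 6 x 5

X2 = fit_2dpca(images, d=2)            # projection matrix, shape (5, 2)
feats_2d = images @ X2                 # each image becomes a 6 x 2 matrix

P = fit_pca(images, d=2)               # projection matrix, shape (30, 2)
flat = images.reshape(len(images), -1)
feats_pca = (flat - flat.mean(axis=0)) @ P   # each image becomes a 2-vector
```

The shapes make the trade-off visible: 2DPCA works with a small 5 x 5 covariance matrix but yields a 6 x 2 feature matrix per image, whereas PCA works in the 30-dimensional vectorized space and yields only 2 coefficients per image.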
Abstract: Although real-world experience shows that preparing one image per person is more convenient, most appearance-based face recognition methods degrade or fail to work if there is only a single sample per person (SSPP). In this work, we introduce a novel supervised learning method called supervised locality preserving multimanifold (SLPMM) for face recognition with SSPP. In SLPMM, two graphs, a within-manifold graph and a between-manifold graph, are constructed to represent the information inside every manifold and the information among different manifolds, respectively. SLPMM simultaneously maximizes the between-manifold scatter and minimizes the within-manifold scatter, which leads to a discriminant space, by adopting the locality preserving projection (LPP) concept. Experimental results on two widely used face databases, FERET and AR, are presented to demonstrate the efficacy of the proposed approach.