Facial emotion recognition(FER)has become a focal point of research due to its widespread applications,ranging from human-computer interaction to affective computing.While traditional FER techniques have relied on han...Facial emotion recognition(FER)has become a focal point of research due to its widespread applications,ranging from human-computer interaction to affective computing.While traditional FER techniques have relied on handcrafted features and classification models trained on image or video datasets,recent strides in artificial intelligence and deep learning(DL)have ushered in more sophisticated approaches.The research aims to develop a FER system using a Faster Region Convolutional Neural Network(FRCNN)and design a specialized FRCNN architecture tailored for facial emotion recognition,leveraging its ability to capture spatial hierarchies within localized regions of facial features.The proposed work enhances the accuracy and efficiency of facial emotion recognition.The proposed work comprises twomajor key components:Inception V3-based feature extraction and FRCNN-based emotion categorization.Extensive experimentation on Kaggle datasets validates the effectiveness of the proposed strategy,showcasing the FRCNN approach’s resilience and accuracy in identifying and categorizing facial expressions.The model’s overall performance metrics are compelling,with an accuracy of 98.4%,precision of 97.2%,and recall of 96.31%.This work introduces a perceptive deep learning-based FER method,contributing to the evolving landscape of emotion recognition technologies.The high accuracy and resilience demonstrated by the FRCNN approach underscore its potential for real-world applications.This research advances the field of FER and presents a compelling case for the practicality and efficacy of deep learning models in automating the understanding of facial emotions.展开更多
Face recognition (FR) technology has numerous applications in artificial intelligence including biometrics, security,authentication, law enforcement, and surveillance. Deep learning (DL) models, notably convolutional ...Face recognition (FR) technology has numerous applications in artificial intelligence including biometrics, security,authentication, law enforcement, and surveillance. Deep learning (DL) models, notably convolutional neuralnetworks (CNNs), have shown promising results in the field of FR. However CNNs are easily fooled since theydo not encode position and orientation correlations between features. Hinton et al. envisioned Capsule Networksas a more robust design capable of retaining pose information and spatial correlations to recognize objects morelike the brain does. Lower-level capsules hold 8-dimensional vectors of attributes like position, hue, texture, andso on, which are routed to higher-level capsules via a new routing by agreement algorithm. This provides capsulenetworks with viewpoint invariance, which has previously evaded CNNs. This research presents a FR model basedon capsule networks that was tested using the LFW dataset, COMSATS face dataset, and own acquired photos usingcameras measuring 128 × 128 pixels, 40 × 40 pixels, and 30 × 30 pixels. The trained model outperforms state-ofthe-art algorithms, achieving 95.82% test accuracy and performing well on unseen faces that have been blurred orrotated. Additionally, the suggested model outperformed the recently released approaches on the COMSATS facedataset, achieving a high accuracy of 92.47%. Based on the results of this research as well as previous results, capsulenetworks perform better than deeper CNNs on unobserved altered data because of their special equivarianceproperties.展开更多
Sparse representation is an effective data classification algorithm that depends on the known training samples to categorise the test sample.It has been widely used in various image classification tasks.Sparseness in ...Sparse representation is an effective data classification algorithm that depends on the known training samples to categorise the test sample.It has been widely used in various image classification tasks.Sparseness in sparse representation means that only a few of instances selected from all training samples can effectively convey the essential class-specific information of the test sample,which is very important for classification.For deformable images such as human faces,pixels at the same location of different images of the same subject usually have different intensities.Therefore,extracting features and correctly classifying such deformable objects is very hard.Moreover,the lighting,attitude and occlusion cause more difficulty.Considering the problems and challenges listed above,a novel image representation and classification algorithm is proposed.First,the authors’algorithm generates virtual samples by a non-linear variation method.This method can effectively extract the low-frequency information of space-domain features of the original image,which is very useful for representing deformable objects.The combination of the original and virtual samples is more beneficial to improve the clas-sification performance and robustness of the algorithm.Thereby,the authors’algorithm calculates the expression coefficients of the original and virtual samples separately using the sparse representation principle and obtains the final score by a designed efficient score fusion scheme.The weighting coefficients in the score fusion scheme are set entirely automatically.Finally,the algorithm classifies the samples based on the final scores.The experimental results show that our method performs better classification than conventional sparse representation algorithms.展开更多
Corona virus(COVID-19)is once in a life time calamity that has resulted in thousands of deaths and security concerns.People are using face masks on a regular basis to protect themselves and to help reduce corona virus...Corona virus(COVID-19)is once in a life time calamity that has resulted in thousands of deaths and security concerns.People are using face masks on a regular basis to protect themselves and to help reduce corona virus transmission.During the on-going coronavirus outbreak,one of the major priorities for researchers is to discover effective solution.As important parts of the face are obscured,face identification and verification becomes exceedingly difficult.The suggested method is a transfer learning using MobileNet V2 based technology that uses deep feature such as feature extraction and deep learning model,to identify the problem of face masked identification.In the first stage,we are applying face mask detector to identify the face mask.Then,the proposed approach is applying to the datasets from Canadian Institute for Advanced Research10(CIFAR10),Modified National Institute of Standards and Technology Database(MNIST),Real World Masked Face Recognition Database(RMFRD),and Stimulated Masked Face Recognition Database(SMFRD).The proposed model is achieving recognition accuracy 99.82%with proposed dataset.This article employs the four pre-programmed models VGG16,VGG19,ResNet50 and ResNet101.To extract the deep features of faces with VGG16 is achieving 99.30%accuracy,VGG19 is achieving 99.54%accuracy,ResNet50 is achieving 78.70%accuracy and ResNet101 is achieving 98.64%accuracy with own dataset.The comparative analysis shows,that our proposed model performs better result in all four previous existing models.The fundamental contribution of this study is to monitor with face mask and without face mask to decreases the pace of corona virus and to detect persons using wearing face masks.展开更多
人脸识别技术广泛应用于考勤管理、移动支付等智慧建设中。伴随着常态化的口罩干扰,传统人脸识别算法已无法满足实际应用需求,为此,本文利用深度学习模型SSD以及FaceNet模型对人脸识别系统展开设计。首先,为消除现有数据集中亚洲人脸占...人脸识别技术广泛应用于考勤管理、移动支付等智慧建设中。伴随着常态化的口罩干扰,传统人脸识别算法已无法满足实际应用需求,为此,本文利用深度学习模型SSD以及FaceNet模型对人脸识别系统展开设计。首先,为消除现有数据集中亚洲人脸占比小造成的类内间距变化差距不明显的问题,在CAS-IA Web Face公开数据集的基础上对亚洲人脸数据进行扩充;其次,为解决不同口罩样式对特征提取的干扰,使用SSD人脸检测模型与DLIB人脸关键点检测模型提取人脸关键点,并利用人脸关键点与口罩的空间位置关系,额外随机生成不同的口罩人脸,组成混合数据集;最后,在混合数据集上进行模型训练并将训练好的模型移植到人脸识别系统中,进行检测速度与识别精度验证。实验结果表明,系统的实时识别速度达20 fps以上,人脸识别模型准确率在构建的混合数据集中达到97.1%,在随机抽取的部分LFW数据集验证的准确率达99.7%,故而该系统可满足实际应用需求,在一定程度上提高人脸识别的鲁棒性与准确性。展开更多
Convolutional neural networks continually evolve to enhance accuracy in addressing various problems,leading to an increase in computational cost and model size.This paper introduces a novel approach for pruning face r...Convolutional neural networks continually evolve to enhance accuracy in addressing various problems,leading to an increase in computational cost and model size.This paper introduces a novel approach for pruning face recognition models based on convolutional neural networks.The proposed method identifies and removes inefficient filters based on the information volume in feature maps.In each layer,some feature maps lack useful information,and there exists a correlation between certain feature maps.Filters associated with these two types of feature maps impose additional computational costs on the model.By eliminating filters related to these categories of feature maps,the reduction of both computational cost and model size can be achieved.The approach employs a combination of correlation analysis and the summation of matrix elements within each feature map to detect and eliminate inefficient filters.The method was applied to two face recognition models utilizing the VGG16 and ResNet50V2 backbone architectures.In the proposed approach,the number of filters removed in each layer varies,and the removal process is independent of the adjacent layers.The convolutional layers of both backbone models were initialized with pre-trained weights from ImageNet.For training,the CASIA-WebFace dataset was utilized,and the Labeled Faces in the Wild(LFW)dataset was employed for benchmarking purposes.In the VGG16-based face recognition model,a 0.74%accuracy improvement was achieved while reducing the number of convolution parameters by 26.85%and decreasing Floating-point operations per second(FLOPs)by 47.96%.For the face recognition model based on the ResNet50V2 architecture,the ArcFace method was implemented.The removal of inactive filters in this model led to a slight decrease in accuracy by 0.11%.However,it resulted in enhanced training speed,a reduction of 59.38%in convolution parameters,and a 57.29%decrease in FLOPs.展开更多
In recent years, deep networks has achieved outstanding performance in computer vision, especially in the field of face recognition. In terms of the performance for a face recognition model based on deep network, ther...In recent years, deep networks has achieved outstanding performance in computer vision, especially in the field of face recognition. In terms of the performance for a face recognition model based on deep network, there are two main closely related factors: 1) the structure of the deep neural network, and 2) the number and quality of training data. In real applications, illumination change is one of the most important factors that significantly affect the performance of face recognition algorithms. As for deep network models, only if there is sufficient training data that has various illumination intensity could they achieve expected performance. However, such kind of training data is hard to collect in the real world. In this paper, focusing on the illumination change challenge, we propose a deep network model which takes both visible light image and near-infrared image into account to perform face recognition. Near- infrared image, as we know, is much less sensitive to illuminations. Visible light face image contains abundant texture information which is very useful for face recognition. Thus, we design an adaptive score fusion strategy which hardly has information loss and the nearest neighbor algorithm to conduct the final classification. The experimental results demonstrate that the model is very effective in realworld scenarios and perform much better in terms of illumination change than other state-of-the-art models.展开更多
Face recognition from video requires dealing with uncertainty both in tracking and recognition. This paper proposed an effective method for face recognition from video. In order to realize simultaneous tracking and re...Face recognition from video requires dealing with uncertainty both in tracking and recognition. This paper proposed an effective method for face recognition from video. In order to realize simultaneous tracking and recognition, fisherface-based recognition is combined with tracking into one model. This model is then embedded into particle filter to perform face recognition from video. In order to improve the robustness of tracking, an expectation maximization (EM) algorithm was adopted to update the appearance model. The experimental results show that the proposed method can perform well in tracking and recognition even in poor conditions such as occlusion and remarkable change in lighting.展开更多
In this paper,a novel face recognition method,named as wavelet-curvelet-fractal technique,is proposed. Based on the similarities embedded in the images,we propose to utilize the wave-let-curvelet-fractal technique to ...In this paper,a novel face recognition method,named as wavelet-curvelet-fractal technique,is proposed. Based on the similarities embedded in the images,we propose to utilize the wave-let-curvelet-fractal technique to extract facial features. Thus we have the wavelet’s details in diagonal,vertical,and horizontal directions,and the eight curvelet details at different angles. Then we adopt the Euclidean minimum distance classifier to recognize different faces. Extensive comparison tests on dif-ferent data sets are carried out,and higher recognition rate is obtained by the proposed technique.展开更多
In this paper human face machine identification is experienced using optical correlation techniques in spatial frequency domain. This approach is tested on ORL dataset of faces which includes face images of 40 subject...In this paper human face machine identification is experienced using optical correlation techniques in spatial frequency domain. This approach is tested on ORL dataset of faces which includes face images of 40 subjects, each in 10 different positions. The examined optical setup relies on optical correlation based on developing optical Vanderlugt filters and its basics are described in this article. With the limitation of face database of 40 persons, the recognition is examined successfully with nearly 100% of accuracy in matching the input images with their respective Vanderlugt synthesized filters. Software simulation is implemented by using MATLAB for face identification.展开更多
Aimed at the problems of infrared image recognition under varying illumination,face disguise,etc.,we bring out an infrared human face recognition algorithm based on 2DPCA.The proposed algorithm can work out the covari...Aimed at the problems of infrared image recognition under varying illumination,face disguise,etc.,we bring out an infrared human face recognition algorithm based on 2DPCA.The proposed algorithm can work out the covariance matrix of the training sample easily and directly;at the same time,it costs less time to work out the eigenvector.Relevant experiments are carried out,and the result indicates that compared with the traditional recognition algorithm,the proposed recognition method is swift and has a good adaptability to the changes of human face posture.展开更多
Near infrared-visible(NIR-VIS)face recognition is to match an NIR face image to a VIS image.The main challenges of NIR-VIS face recognition are the gap caused by cross-modality and the lack of sufficient paired NIR-VI...Near infrared-visible(NIR-VIS)face recognition is to match an NIR face image to a VIS image.The main challenges of NIR-VIS face recognition are the gap caused by cross-modality and the lack of sufficient paired NIR-VIS face images to train models.This paper focuses on the generation of paired NIR-VIS face images and proposes a dual variational generator based on ResNeSt(RS-DVG).RS-DVG can generate a large number of paired NIR-VIS face images from noise,and these generated NIR-VIS face images can be used as the training set together with the real NIR-VIS face images.In addition,a triplet loss function is introduced and a novel triplet selection method is proposed specifically for the training of the current face recognition model,which maximizes the inter-class distance and minimizes the intra-class distance in the input face images.The method proposed in this paper was evaluated on the datasets CASIA NIR-VIS 2.0 and BUAA-VisNir,and relatively good results were obtained.展开更多
In this work, we consider a homotopic principle for solving large-scale and dense l1underdetermined problems and its applications in image processing and classification. We solve the face recognition problem where the...In this work, we consider a homotopic principle for solving large-scale and dense l1underdetermined problems and its applications in image processing and classification. We solve the face recognition problem where the input image contains corrupted and/or lost pixels. The approach involves two steps: first, the incomplete or corrupted image is subject to an inpainting process, and secondly, the restored image is used to carry out the classification or recognition task. Addressing these two steps involves solving large scale l1minimization problems. To that end, we propose to solve a sequence of linear equality constrained multiquadric problems that depends on a regularization parameter that converges to zero. The procedure generates a central path that converges to a point on the solution set of the l1underdetermined problem. In order to solve each subproblem, a conjugate gradient algorithm is formulated. When noise is present in the model, inexact directions are taken so that an approximate solution is computed faster. This prevents the ill conditioning produced when the conjugate gradient is required to iterate until a zero residual is attained.展开更多
A framework of real time face tracking and recognition is presented, which integrates skin color based tracking and PCA/BPNN (principle component analysis/back propagation neural network) hybrid recognition techni...A framework of real time face tracking and recognition is presented, which integrates skin color based tracking and PCA/BPNN (principle component analysis/back propagation neural network) hybrid recognition techniques. The algorithm is able to track the human face against a complex background and also works well when temporary occlusion occurs. We also obtain a very high recognition rate by averaging a number of samples over a long image sequence. The proposed approach has been successfully tested by many experiments, and can operate at 20 frames/s on an 800 MHz PC.展开更多
In principal component analysis (PCA) algorithms for face recognition, to reduce the influence of the eigenvectors which relate to the changes of the illumination on abstract features, a modified PCA (MPCA) algori...In principal component analysis (PCA) algorithms for face recognition, to reduce the influence of the eigenvectors which relate to the changes of the illumination on abstract features, a modified PCA (MPCA) algorithm is proposed. The method is based on the idea of reducing the influence of the eigenvectors associated with the large eigenvalues by normalizing the feature vector element by its corresponding standard deviation. The Yale face database and Yale face database B are used to verify the method. The simulation results show that, for front face and even under the condition of limited variation in the facial poses, the proposed method results in better performance than the conventional PCA and linear discriminant analysis (LDA) approaches, and the computational cost remains the same as that of the PCA, and much less than that of the LDA.展开更多
With the aim of extracting the features of face images in face recognition, a new method of face recognition by fusing global features and local features is presented. The global features are extracted using principal...With the aim of extracting the features of face images in face recognition, a new method of face recognition by fusing global features and local features is presented. The global features are extracted using principal component analysis (PCA). Active appearance model (AAM) locates 58 facial fiducial points, from which 17 points are characterized as local features using the Gabor wavelet transform (GWT). Normalized global match degree (local match degree) can be obtained by global features (local features) of the probe image and each gallery image. After the fusion of normalized global match degree and normalized local match degree, the recognition result is the class that included the gallery image corresponding to the largest fused match degree. The method is evaluated by the recognition rates over two face image databases (AR and SJTU-IPPR). The experimental results show that the method outperforms PCA and elastic bunch graph matching (EBGM). Moreover, it is effective and robust to expression, illumination and pose variation in some degree.展开更多
Matrix principal component analysis (MatPCA), as an effective feature extraction method, can deal with the matrix pattern and the vector pattern. However, like PCA, MatPCA does not use the class information of sampl...Matrix principal component analysis (MatPCA), as an effective feature extraction method, can deal with the matrix pattern and the vector pattern. However, like PCA, MatPCA does not use the class information of samples. As a result, the extracted features cannot provide enough useful information for distinguishing pat- tern from one another, and further resulting in degradation of classification performance. To fullly use class in- formation of samples, a novel method, called the fuzzy within-class MatPCA (F-WMatPCA)is proposed. F-WMatPCA utilizes the fuzzy K-nearest neighbor method(FKNN) to fuzzify the class membership degrees of a training sample and then performs fuzzy MatPCA within these patterns having the same class label. Due to more class information is used in feature extraction, F-WMatPCA can intuitively improve the classification perfor- mance. Experimental results in face databases and some benchmark datasets show that F-WMatPCA is effective and competitive than MatPCA. The experimental analysis on face image databases indicates that F-WMatPCA im- proves the recognition accuracy and is more stable and robust in performing classification than the existing method of fuzzy-based F-Fisherfaces.展开更多
Bagging is not quite suitable for stable classifiers such as nearest neighbor classifiers due to the lack of diversity and it is difficult to be directly applied to face recognition as well due to the small sample si...Bagging is not quite suitable for stable classifiers such as nearest neighbor classifiers due to the lack of diversity and it is difficult to be directly applied to face recognition as well due to the small sample size (SSS) property of face recognition. To solve the two problems,local Bagging (L-Bagging) is proposed to simultaneously make Bagging apply to both nearest neighbor classifiers and face recognition. The major difference between L-Bagging and Bagging is that L-Bagging performs the bootstrap sampling on each local region partitioned from the original face image rather than the whole face image. Since the dimensionality of local region is usually far less than the number of samples and the component classifiers are constructed just in different local regions,L-Bagging deals with SSS problem and generates more diverse component classifiers. Experimental results on four standard face image databases (AR,Yale,ORL and Yale B) indicate that the proposed L-Bagging method is effective and robust to illumination,occlusion and slight pose variation.展开更多
To improve the classification performance of the kernel minimum squared error( KMSE), an enhanced KMSE algorithm( EKMSE) is proposed. It redefines the regular objective function by introducing a novel class label ...To improve the classification performance of the kernel minimum squared error( KMSE), an enhanced KMSE algorithm( EKMSE) is proposed. It redefines the regular objective function by introducing a novel class label definition, and the relative class label matrix can be adaptively adjusted to the kernel matrix.Compared with the common methods, the newobjective function can enlarge the distance between different classes, which therefore yields better recognition rates. In addition, an iteration parameter searching technique is adopted to improve the computational efficiency. The extensive experiments on FERET and GT face databases illustrate the feasibility and efficiency of the proposed EKMSE. It outperforms the original MSE, KMSE,some KMSE improvement methods, and even the sparse representation-based techniques in face recognition, such as collaborate representation classification( CRC).展开更多
文摘Facial emotion recognition(FER)has become a focal point of research due to its widespread applications,ranging from human-computer interaction to affective computing.While traditional FER techniques have relied on handcrafted features and classification models trained on image or video datasets,recent strides in artificial intelligence and deep learning(DL)have ushered in more sophisticated approaches.The research aims to develop a FER system using a Faster Region Convolutional Neural Network(FRCNN)and design a specialized FRCNN architecture tailored for facial emotion recognition,leveraging its ability to capture spatial hierarchies within localized regions of facial features.The proposed work enhances the accuracy and efficiency of facial emotion recognition.The proposed work comprises twomajor key components:Inception V3-based feature extraction and FRCNN-based emotion categorization.Extensive experimentation on Kaggle datasets validates the effectiveness of the proposed strategy,showcasing the FRCNN approach’s resilience and accuracy in identifying and categorizing facial expressions.The model’s overall performance metrics are compelling,with an accuracy of 98.4%,precision of 97.2%,and recall of 96.31%.This work introduces a perceptive deep learning-based FER method,contributing to the evolving landscape of emotion recognition technologies.The high accuracy and resilience demonstrated by the FRCNN approach underscore its potential for real-world applications.This research advances the field of FER and presents a compelling case for the practicality and efficacy of deep learning models in automating the understanding of facial emotions.
基金Princess Nourah bint Abdulrahman University Riyadh,Saudi Arabia with Researchers Supporting Project Number:PNURSP2024R234.
文摘Face recognition (FR) technology has numerous applications in artificial intelligence including biometrics, security,authentication, law enforcement, and surveillance. Deep learning (DL) models, notably convolutional neuralnetworks (CNNs), have shown promising results in the field of FR. However CNNs are easily fooled since theydo not encode position and orientation correlations between features. Hinton et al. envisioned Capsule Networksas a more robust design capable of retaining pose information and spatial correlations to recognize objects morelike the brain does. Lower-level capsules hold 8-dimensional vectors of attributes like position, hue, texture, andso on, which are routed to higher-level capsules via a new routing by agreement algorithm. This provides capsulenetworks with viewpoint invariance, which has previously evaded CNNs. This research presents a FR model basedon capsule networks that was tested using the LFW dataset, COMSATS face dataset, and own acquired photos usingcameras measuring 128 × 128 pixels, 40 × 40 pixels, and 30 × 30 pixels. The trained model outperforms state-ofthe-art algorithms, achieving 95.82% test accuracy and performing well on unseen faces that have been blurred orrotated. Additionally, the suggested model outperformed the recently released approaches on the COMSATS facedataset, achieving a high accuracy of 92.47%. Based on the results of this research as well as previous results, capsulenetworks perform better than deeper CNNs on unobserved altered data because of their special equivarianceproperties.
文摘Sparse representation is an effective data classification algorithm that depends on the known training samples to categorise the test sample.It has been widely used in various image classification tasks.Sparseness in sparse representation means that only a few of instances selected from all training samples can effectively convey the essential class-specific information of the test sample,which is very important for classification.For deformable images such as human faces,pixels at the same location of different images of the same subject usually have different intensities.Therefore,extracting features and correctly classifying such deformable objects is very hard.Moreover,the lighting,attitude and occlusion cause more difficulty.Considering the problems and challenges listed above,a novel image representation and classification algorithm is proposed.First,the authors’algorithm generates virtual samples by a non-linear variation method.This method can effectively extract the low-frequency information of space-domain features of the original image,which is very useful for representing deformable objects.The combination of the original and virtual samples is more beneficial to improve the clas-sification performance and robustness of the algorithm.Thereby,the authors’algorithm calculates the expression coefficients of the original and virtual samples separately using the sparse representation principle and obtains the final score by a designed efficient score fusion scheme.The weighting coefficients in the score fusion scheme are set entirely automatically.Finally,the algorithm classifies the samples based on the final scores.The experimental results show that our method performs better classification than conventional sparse representation algorithms.
文摘Corona virus(COVID-19)is once in a life time calamity that has resulted in thousands of deaths and security concerns.People are using face masks on a regular basis to protect themselves and to help reduce corona virus transmission.During the on-going coronavirus outbreak,one of the major priorities for researchers is to discover effective solution.As important parts of the face are obscured,face identification and verification becomes exceedingly difficult.The suggested method is a transfer learning using MobileNet V2 based technology that uses deep feature such as feature extraction and deep learning model,to identify the problem of face masked identification.In the first stage,we are applying face mask detector to identify the face mask.Then,the proposed approach is applying to the datasets from Canadian Institute for Advanced Research10(CIFAR10),Modified National Institute of Standards and Technology Database(MNIST),Real World Masked Face Recognition Database(RMFRD),and Stimulated Masked Face Recognition Database(SMFRD).The proposed model is achieving recognition accuracy 99.82%with proposed dataset.This article employs the four pre-programmed models VGG16,VGG19,ResNet50 and ResNet101.To extract the deep features of faces with VGG16 is achieving 99.30%accuracy,VGG19 is achieving 99.54%accuracy,ResNet50 is achieving 78.70%accuracy and ResNet101 is achieving 98.64%accuracy with own dataset.The comparative analysis shows,that our proposed model performs better result in all four previous existing models.The fundamental contribution of this study is to monitor with face mask and without face mask to decreases the pace of corona virus and to detect persons using wearing face masks.
文摘人脸识别技术广泛应用于考勤管理、移动支付等智慧建设中。伴随着常态化的口罩干扰,传统人脸识别算法已无法满足实际应用需求,为此,本文利用深度学习模型SSD以及FaceNet模型对人脸识别系统展开设计。首先,为消除现有数据集中亚洲人脸占比小造成的类内间距变化差距不明显的问题,在CAS-IA Web Face公开数据集的基础上对亚洲人脸数据进行扩充;其次,为解决不同口罩样式对特征提取的干扰,使用SSD人脸检测模型与DLIB人脸关键点检测模型提取人脸关键点,并利用人脸关键点与口罩的空间位置关系,额外随机生成不同的口罩人脸,组成混合数据集;最后,在混合数据集上进行模型训练并将训练好的模型移植到人脸识别系统中,进行检测速度与识别精度验证。实验结果表明,系统的实时识别速度达20 fps以上,人脸识别模型准确率在构建的混合数据集中达到97.1%,在随机抽取的部分LFW数据集验证的准确率达99.7%,故而该系统可满足实际应用需求,在一定程度上提高人脸识别的鲁棒性与准确性。
文摘Convolutional neural networks continually evolve to enhance accuracy in addressing various problems,leading to an increase in computational cost and model size.This paper introduces a novel approach for pruning face recognition models based on convolutional neural networks.The proposed method identifies and removes inefficient filters based on the information volume in feature maps.In each layer,some feature maps lack useful information,and there exists a correlation between certain feature maps.Filters associated with these two types of feature maps impose additional computational costs on the model.By eliminating filters related to these categories of feature maps,the reduction of both computational cost and model size can be achieved.The approach employs a combination of correlation analysis and the summation of matrix elements within each feature map to detect and eliminate inefficient filters.The method was applied to two face recognition models utilizing the VGG16 and ResNet50V2 backbone architectures.In the proposed approach,the number of filters removed in each layer varies,and the removal process is independent of the adjacent layers.The convolutional layers of both backbone models were initialized with pre-trained weights from ImageNet.For training,the CASIA-WebFace dataset was utilized,and the Labeled Faces in the Wild(LFW)dataset was employed for benchmarking purposes.In the VGG16-based face recognition model,a 0.74%accuracy improvement was achieved while reducing the number of convolution parameters by 26.85%and decreasing Floating-point operations per second(FLOPs)by 47.96%.For the face recognition model based on the ResNet50V2 architecture,the ArcFace method was implemented.The removal of inactive filters in this model led to a slight decrease in accuracy by 0.11%.However,it resulted in enhanced training speed,a reduction of 59.38%in convolution parameters,and a 57.29%decrease in FLOPs.
文摘In recent years, deep networks has achieved outstanding performance in computer vision, especially in the field of face recognition. In terms of the performance for a face recognition model based on deep network, there are two main closely related factors: 1) the structure of the deep neural network, and 2) the number and quality of training data. In real applications, illumination change is one of the most important factors that significantly affect the performance of face recognition algorithms. As for deep network models, only if there is sufficient training data that has various illumination intensity could they achieve expected performance. However, such kind of training data is hard to collect in the real world. In this paper, focusing on the illumination change challenge, we propose a deep network model which takes both visible light image and near-infrared image into account to perform face recognition. Near- infrared image, as we know, is much less sensitive to illuminations. Visible light face image contains abundant texture information which is very useful for face recognition. Thus, we design an adaptive score fusion strategy which hardly has information loss and the nearest neighbor algorithm to conduct the final classification. The experimental results demonstrate that the model is very effective in realworld scenarios and perform much better in terms of illumination change than other state-of-the-art models.
基金National Natural Science Foundation of Chi-na(No.60375008)Shanghai Key Technolo-gies Pre-research Project(No.035115009)
文摘Face recognition from video requires dealing with uncertainty both in tracking and recognition. This paper proposed an effective method for face recognition from video. In order to realize simultaneous tracking and recognition, fisherface-based recognition is combined with tracking into one model. This model is then embedded into particle filter to perform face recognition from video. In order to improve the robustness of tracking, an expectation maximization (EM) algorithm was adopted to update the appearance model. The experimental results show that the proposed method can perform well in tracking and recognition even in poor conditions such as occlusion and remarkable change in lighting.
基金Supported by the College of Heilongjiang Province, Electronic Engineering Key Lab Project dzzd200602Heilongjiang Province Educational Bureau Scientific Technology Important Project 11531z18
文摘In this paper,a novel face recognition method,named as wavelet-curvelet-fractal technique,is proposed. Based on the similarities embedded in the images,we propose to utilize the wave-let-curvelet-fractal technique to extract facial features. Thus we have the wavelet’s details in diagonal,vertical,and horizontal directions,and the eight curvelet details at different angles. Then we adopt the Euclidean minimum distance classifier to recognize different faces. Extensive comparison tests on dif-ferent data sets are carried out,and higher recognition rate is obtained by the proposed technique.
文摘In this paper human face machine identification is experienced using optical correlation techniques in spatial frequency domain. This approach is tested on ORL dataset of faces which includes face images of 40 subjects, each in 10 different positions. The examined optical setup relies on optical correlation based on developing optical Vanderlugt filters and its basics are described in this article. With the limitation of face database of 40 persons, the recognition is examined successfully with nearly 100% of accuracy in matching the input images with their respective Vanderlugt synthesized filters. Software simulation is implemented by using MATLAB for face identification.
基金Sponsored by the Natural Science Fund of Heilongjiang province(Grant No. F2007-13)Science and Technology Research Projects in Office of Education of Heilongjiang province(Grant No.11531034)the Heilongjiang Postdoctoral Science Foundation(Grant No.LBH-Z06054)
文摘Aimed at the problems of infrared image recognition under varying illumination,face disguise,etc.,we bring out an infrared human face recognition algorithm based on 2DPCA.The proposed algorithm can work out the covariance matrix of the training sample easily and directly;at the same time,it costs less time to work out the eigenvector.Relevant experiments are carried out,and the result indicates that compared with the traditional recognition algorithm,the proposed recognition method is swift and has a good adaptability to the changes of human face posture.
基金National Natural Science Foundation of China(No.62006039)National Key Research and Development Program of China(No.2019YFE0190500)。
文摘Near infrared-visible(NIR-VIS)face recognition is to match an NIR face image to a VIS image.The main challenges of NIR-VIS face recognition are the gap caused by cross-modality and the lack of sufficient paired NIR-VIS face images to train models.This paper focuses on the generation of paired NIR-VIS face images and proposes a dual variational generator based on ResNeSt(RS-DVG).RS-DVG can generate a large number of paired NIR-VIS face images from noise,and these generated NIR-VIS face images can be used as the training set together with the real NIR-VIS face images.In addition,a triplet loss function is introduced and a novel triplet selection method is proposed specifically for the training of the current face recognition model,which maximizes the inter-class distance and minimizes the intra-class distance in the input face images.The method proposed in this paper was evaluated on the datasets CASIA NIR-VIS 2.0 and BUAA-VisNir,and relatively good results were obtained.
文摘In this work, we consider a homotopic principle for solving large-scale and dense l1underdetermined problems and its applications in image processing and classification. We solve the face recognition problem where the input image contains corrupted and/or lost pixels. The approach involves two steps: first, the incomplete or corrupted image is subject to an inpainting process, and secondly, the restored image is used to carry out the classification or recognition task. Addressing these two steps involves solving large scale l1minimization problems. To that end, we propose to solve a sequence of linear equality constrained multiquadric problems that depends on a regularization parameter that converges to zero. The procedure generates a central path that converges to a point on the solution set of the l1underdetermined problem. In order to solve each subproblem, a conjugate gradient algorithm is formulated. When noise is present in the model, inexact directions are taken so that an approximate solution is computed faster. This prevents the ill conditioning produced when the conjugate gradient is required to iterate until a zero residual is attained.
文摘A framework of real time face tracking and recognition is presented, which integrates skin color based tracking and PCA/BPNN (principle component analysis/back propagation neural network) hybrid recognition techniques. The algorithm is able to track the human face against a complex background and also works well when temporary occlusion occurs. We also obtain a very high recognition rate by averaging a number of samples over a long image sequence. The proposed approach has been successfully tested by many experiments, and can operate at 20 frames/s on an 800 MHz PC.
文摘In principal component analysis (PCA) algorithms for face recognition, to reduce the influence of the eigenvectors which relate to the changes of the illumination on abstract features, a modified PCA (MPCA) algorithm is proposed. The method is based on the idea of reducing the influence of the eigenvectors associated with the large eigenvalues by normalizing the feature vector element by its corresponding standard deviation. The Yale face database and Yale face database B are used to verify the method. The simulation results show that, for front face and even under the condition of limited variation in the facial poses, the proposed method results in better performance than the conventional PCA and linear discriminant analysis (LDA) approaches, and the computational cost remains the same as that of the PCA, and much less than that of the LDA.
文摘With the aim of extracting the features of face images in face recognition, a new method of face recognition by fusing global features and local features is presented. The global features are extracted using principal component analysis (PCA). Active appearance model (AAM) locates 58 facial fiducial points, from which 17 points are characterized as local features using the Gabor wavelet transform (GWT). Normalized global match degree (local match degree) can be obtained by global features (local features) of the probe image and each gallery image. After the fusion of normalized global match degree and normalized local match degree, the recognition result is the class that included the gallery image corresponding to the largest fused match degree. The method is evaluated by the recognition rates over two face image databases (AR and SJTU-IPPR). The experimental results show that the method outperforms PCA and elastic bunch graph matching (EBGM). Moreover, it is effective and robust to expression, illumination and pose variation in some degree.
文摘Matrix principal component analysis (MatPCA), as an effective feature extraction method, can deal with the matrix pattern and the vector pattern. However, like PCA, MatPCA does not use the class information of samples. As a result, the extracted features cannot provide enough useful information for distinguishing pat- tern from one another, and further resulting in degradation of classification performance. To fullly use class in- formation of samples, a novel method, called the fuzzy within-class MatPCA (F-WMatPCA)is proposed. F-WMatPCA utilizes the fuzzy K-nearest neighbor method(FKNN) to fuzzify the class membership degrees of a training sample and then performs fuzzy MatPCA within these patterns having the same class label. Due to more class information is used in feature extraction, F-WMatPCA can intuitively improve the classification perfor- mance. Experimental results in face databases and some benchmark datasets show that F-WMatPCA is effective and competitive than MatPCA. The experimental analysis on face image databases indicates that F-WMatPCA im- proves the recognition accuracy and is more stable and robust in performing classification than the existing method of fuzzy-based F-Fisherfaces.
文摘Bagging is not quite suitable for stable classifiers such as nearest neighbor classifiers due to the lack of diversity and it is difficult to be directly applied to face recognition as well due to the small sample size (SSS) property of face recognition. To solve the two problems,local Bagging (L-Bagging) is proposed to simultaneously make Bagging apply to both nearest neighbor classifiers and face recognition. The major difference between L-Bagging and Bagging is that L-Bagging performs the bootstrap sampling on each local region partitioned from the original face image rather than the whole face image. Since the dimensionality of local region is usually far less than the number of samples and the component classifiers are constructed just in different local regions,L-Bagging deals with SSS problem and generates more diverse component classifiers. Experimental results on four standard face image databases (AR,Yale,ORL and Yale B) indicate that the proposed L-Bagging method is effective and robust to illumination,occlusion and slight pose variation.
基金The Priority Academic Program Development of Jiangsu Higher Education Institutions(PAPD)the National Natural Science Foundation of China(No.61572258,61103141,51405241)+1 种基金the Natural Science Foundation of Jiangsu Province(No.BK20151530)Overseas Training Programs for Outstanding Young Scholars of Universities in Jiangsu Province
文摘To improve the classification performance of the kernel minimum squared error( KMSE), an enhanced KMSE algorithm( EKMSE) is proposed. It redefines the regular objective function by introducing a novel class label definition, and the relative class label matrix can be adaptively adjusted to the kernel matrix.Compared with the common methods, the newobjective function can enlarge the distance between different classes, which therefore yields better recognition rates. In addition, an iteration parameter searching technique is adopted to improve the computational efficiency. The extensive experiments on FERET and GT face databases illustrate the feasibility and efficiency of the proposed EKMSE. It outperforms the original MSE, KMSE,some KMSE improvement methods, and even the sparse representation-based techniques in face recognition, such as collaborate representation classification( CRC).