Fusing hand-based features in multi-modal biometric recognition enhances anti-spoofing capabilities, and judiciously leveraging the correlation among multimodal features can further improve the robustness and recognition performance of the system. Nevertheless, two issues persist in multi-modal feature fusion recognition. First, efforts to improve recognition performance have not comprehensively considered the correlations among distinct modalities. Second, improper weight selection during modal fusion diminishes the salience of crucial modal features, thereby degrading overall recognition performance. To address these two issues, we introduce an enhanced DenseNet multimodal recognition network founded on feature-level fusion. The information from the three modalities is fused in an RGB-like manner, and the input network strengthens inter-modal correlation through channel correlation. Within the enhanced DenseNet network, the Efficient Channel Attention Network (ECA-Net) dynamically adjusts the weight of each channel to amplify the salience of crucial information in each modal feature, while depthwise separable convolution markedly reduces the training parameters and further enhances the feature correlation; both components are sketched below. Experimental evaluations were conducted on four multimodal databases, built from six unimodal databases including the multispectral palmprint and palm vein databases from the Chinese Academy of Sciences. The Equal Error Rate (EER) values were 0.0149%, 0.0150%, 0.0099%, and 0.0050%, respectively. Compared with other network methods for palmprint, palm vein, and finger vein fusion recognition, this approach substantially enhances recognition performance, making it suitable for high-security environments with practical applicability. The experiments in this article used a modest sample database comprising 200 individuals; the next phase involves extending the method to larger databases.
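A minimal PyTorch sketch of the two components named above, an ECA-style channel-attention module and a depthwise separable convolution, may make the design concrete; the kernel sizes are conventional defaults, not the paper's settings.

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: re-weights feature-map channels with a
    1D convolution over the pooled channel descriptor (kernel size assumed)."""
    def __init__(self, k_size=3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, k_size, padding=k_size // 2, bias=False)

    def forward(self, x):                                # x: (B, C, H, W)
        y = self.pool(x)                                 # (B, C, 1, 1)
        y = self.conv(y.squeeze(-1).transpose(-1, -2))   # 1D conv across channels
        y = torch.sigmoid(y.transpose(-1, -2).unsqueeze(-1))
        return x * y                                     # channel-wise gating

class DepthwiseSeparableConv(nn.Module):
    """Depthwise + pointwise convolution: far fewer parameters than a
    standard convolution with the same receptive field."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, padding=k // 2,
                                   groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))
```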
The demand for a non-contact biometric approach for candidate identification has grown over the past ten years. As one of the most important biometric applications, human gait analysis is a significant research topic in computer vision. Researchers have paid a lot of attention to gait recognition, specifically the identification of people based on their walking patterns, due to its potential to correctly identify people from far away. Gait recognition systems have been used in a variety of applications, including security, medical examinations, identity management, and access control. These systems require a complex combination of technical, operational, and definitional considerations. The employment of gait recognition techniques and technologies has produced a number of beneficial and popular applications. This work proposes a novel deep learning-based framework for human gait classification in video sequences. The framework's main challenge is improving the accuracy of gait classification under varying conditions, such as carrying a bag and changing clothes. The proposed method's first step is selecting two pre-trained deep learning models and training them using deep transfer learning. The deep models are trained with static hyperparameters; the learning rate, however, is calculated using the particle swarm optimization (PSO) algorithm, as sketched below. Then, the best features are selected from both trained models using the Harris Hawks controlled Sine-Cosine optimization algorithm, and the chosen features are combined in a novel correlation-based fusion technique. Finally, the fused best features are classified using medium, bi-layer, and tri-layered neural networks. The experimental process was carried out on the publicly accessible CASIA-B dataset, and an improved accuracy of 94.14% was achieved. The achieved accuracy improves on recent state-of-the-art techniques, which shows the significance of this work.
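To illustrate the PSO-based learning-rate search, here is a minimal sketch; `val_loss` is a hypothetical callback that returns the validation loss of a short training run at a given learning rate, and the swarm coefficients are conventional defaults rather than the paper's values.

```python
import numpy as np

def pso_learning_rate(val_loss, n_particles=10, iters=20,
                      lr_bounds=(1e-5, 1e-1), seed=0):
    """Search for a learning rate that minimises val_loss(lr) with a
    basic particle swarm; the search runs in log10 space so the swarm
    covers several orders of magnitude evenly."""
    rng = np.random.default_rng(seed)
    lo, hi = np.log10(lr_bounds[0]), np.log10(lr_bounds[1])
    pos = rng.uniform(lo, hi, n_particles)        # log10(lr) per particle
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([val_loss(10.0 ** p) for p in pos])
    gbest = pbest[pbest_val.argmin()]
    w, c1, c2 = 0.7, 1.5, 1.5                     # inertia, cognitive, social
    for _ in range(iters):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([val_loss(10.0 ** p) for p in pos])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = pos[better], vals[better]
        gbest = pbest[pbest_val.argmin()]
    return 10.0 ** gbest                          # best learning rate found
```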
Human recognition technology based on biometrics has become a fundamental requirement in all aspects of life due to increased concerns about security and privacy issues. Therefore, biometric systems have emerged as a technology with the capability to identify or authenticate individuals based on their physiological and behavioral characteristics. Among different viable biometric modalities, the human ear structure can offer unique and valuable discriminative characteristics for human recognition systems. In recent years, most existing traditional ear recognition systems have been designed based on computer vision models and have achieved successful results. Nevertheless, such traditional models can be sensitive to several unconstrained environmental factors. As such, some traits may be difficult to extract automatically but can still be semantically perceived as soft biometrics. This research proposes a new group of semantic features to be used as soft ear biometrics, mainly inspired by the conventional descriptive traits humans naturally use when identifying or describing each other. Hence, the study focuses on fusing the soft ear biometric traits with traditional (hard) ear biometric features to investigate their validity and efficacy in augmenting human identification performance. The proposed framework has two subsystems: first, a computer vision-based subsystem that extracts traditional (hard) ear biometric traits using principal component analysis (PCA) and local binary patterns (LBP), and second, a crowdsourcing-based subsystem that derives semantic (soft) ear biometric traits. Several feature-level fusion experiments were conducted using the AMI database to evaluate the proposed algorithm's performance; a sketch of this kind of fusion follows below. The obtained results for both identification and verification showed that the proposed soft ear biometric information significantly improved the recognition performance of traditional ear biometrics, reaching up to 12% for LBP and 5% for PCA descriptors when fusing all three feature types (PCA, LBP, and soft traits) using a k-nearest neighbors (KNN) classifier.
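A compact sketch of this kind of feature-level fusion, using scikit-learn and scikit-image, is shown below; the LBP parameters, PCA dimensionality, and plain concatenation after min-max normalisation are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import minmax_scale

def lbp_hist(img, P=8, R=1):
    """Uniform LBP histogram of a grayscale ear image (parameters assumed)."""
    codes = local_binary_pattern(img, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
    return hist

def fuse_features(images, soft_traits, n_pca=20):
    """Feature-level fusion by normalised concatenation of PCA, LBP, and
    soft-biometric traits; images must share one shape, and n_pca must
    not exceed the number of samples."""
    X = np.stack([im.ravel() for im in images])
    pca_feats = PCA(n_components=n_pca).fit_transform(X)
    lbp_feats = np.stack([lbp_hist(im) for im in images])
    return np.hstack([minmax_scale(pca_feats),
                      minmax_scale(lbp_feats),
                      minmax_scale(np.asarray(soft_traits, float))])

# Classification with KNN, as in the abstract:
# clf = KNeighborsClassifier(n_neighbors=1).fit(fuse_features(imgs, soft), y)
```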
The rapid growth of smart technologies and services has intensified the challenges surrounding identity authentication techniques. Biometric credentials are increasingly being used for verification due to their advantages over traditional methods, making it crucial to safeguard the privacy of people's biometric data in various scenarios. This paper offers an in-depth exploration of privacy-preserving techniques and potential threats to biometric systems. It proposes a novel and thorough taxonomy of privacy-preserving techniques, as well as a systematic framework for categorizing the field's existing literature. We review the state-of-the-art methods and address their advantages and limitations in the context of various biometric modalities, such as face, fingerprint, and eye detection. The survey encompasses various categories of privacy-preserving mechanisms and examines the trade-offs between security, privacy, and recognition performance, as well as open issues and future research directions. It aims to provide researchers, professionals, and decision-makers with a thorough understanding of the existing privacy-preserving solutions in biometric recognition systems and serves as a foundation for the development of more secure and privacy-preserving biometric technologies.
Fine-grained recognition of ships based on remote sensing images is crucial to safeguarding maritime rights and interests and maintaining national security. Currently, with the emergence of massive high-resolution multi-modality images, the use of multi-modality images for fine-grained recognition has become a promising technology. Fine-grained recognition of multi-modality images imposes higher requirements on dataset samples. The key to the problem is how to extract and fuse the complementary features of multi-modality images to obtain more discriminative fusion features. The attention mechanism helps the model pinpoint the key information in the image, resulting in a significant improvement in the model's performance. In this paper, a dataset for fine-grained recognition of ships based on visible and near-infrared multi-modality remote sensing images is first proposed, named the Dataset for Multimodal Fine-grained Recognition of Ships (DMFGRS). It includes 1,635 pairs of visible and near-infrared remote sensing images divided into 20 categories, collated from digital orthophoto models provided by commercial remote sensing satellites. DMFGRS provides two types of annotation format files, as well as segmentation mask images corresponding to the ship targets. Then, a Multimodal Information Cross-Enhancement Network (MICE-Net), which fuses features of visible and near-infrared remote sensing images, is proposed. In the network, a dual-branch feature extraction and fusion module is designed to obtain more expressive features. The Feature Cross Enhancement Module (FCEM) achieves fusion enhancement of the two modal features by making channel attention and spatial attention work cross-functionally on the feature maps; one reading of this idea is sketched below. A benchmark is established by evaluating state-of-the-art object recognition algorithms on DMFGRS. In experiments on DMFGRS, MICE-Net reached a precision of 87%, a recall of 77.1%, an mAP0.5 of 83.8%, and an mAP0.5:0.95 of 63.9%. Extensive experiments demonstrate that the proposed MICE-Net performs excellently on DMFGRS. Built on the lightweight YOLO network, the model generalizes well and thus has good potential for application in real-life scenarios.
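One plausible reading of such cross-functional attention is sketched below in PyTorch: channel attention computed on one modality gates the other, and spatial attention does the same in the opposite direction. This is a CBAM-style interpretation offered for illustration; the published FCEM design may differ in detail.

```python
import torch
import torch.nn as nn

def channel_gate(c, r=8):
    """Squeeze-and-excitation style channel attention (reduction assumed)."""
    return nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                         nn.Linear(c, c // r), nn.ReLU(),
                         nn.Linear(c // r, c), nn.Sigmoid())

class CrossEnhance(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ca_vis = channel_gate(channels)   # computed on the visible branch
        self.ca_nir = channel_gate(channels)   # computed on the NIR branch
        self.sa = nn.Sequential(nn.Conv2d(2, 1, 7, padding=3), nn.Sigmoid())

    def spatial(self, x):
        # spatial attention map from channel-wise mean and max projections
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], 1)
        return self.sa(s)

    def forward(self, vis, nir):
        # each branch is re-weighted by attention derived from the *other*
        vis_e = vis * self.ca_nir(nir)[..., None, None] * self.spatial(nir)
        nir_e = nir * self.ca_vis(vis)[..., None, None] * self.spatial(vis)
        return vis_e + nir_e                   # simple additive fusion
```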
Dynamic signature is a biometric modality that recognizes an individual's anatomic and behavioural characteristics when signing their name. The rampant incidence of signature falsification (identity theft) was the key motivating factor for embarking on this study, which was necessitated by the damages and dangers posed by signature forgery coupled with the intractable nature of the problem. The aim and objectives of this study were to design a proactive and responsive system that could compare two signature samples and distinguish the correct signature from the forged one. Dynamic signature verification is an important biometric technique that aims to detect whether a given signature is genuine or forged. In this research work, Convolutional Neural Networks (CNNs or ConvNets), a class of deep, feed-forward artificial neural networks that has been successfully applied to analysing visual imagery, were used to train the model. The signature images are stored in a file directory structure which the Keras Python library can work with, and the CNN was implemented in Python using Keras with the TensorFlow backend to learn the patterns associated with the signatures, as sketched below. The results showed that, for the same CNN-based network, the larger the training dataset, the higher the test accuracy, although acceptable results can still be obtained when the training dataset is small. The paper concluded that training on these datasets with the CNN yielded 98% accuracy; in the experimental part, the model achieved a high degree of accuracy in classifying the biometric parameters used.
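A minimal sketch of such a Keras/TensorFlow pipeline follows; the directory name, image size, and layer sizes are illustrative assumptions (one sub-folder per class, e.g. genuine/ and forged/), not the paper's configuration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Read signature images from a directory tree with one sub-folder per class.
train_ds = keras.utils.image_dataset_from_directory(
    "signatures/train", image_size=(128, 128), color_mode="grayscale",
    batch_size=32)

# Small binary CNN: genuine vs. forged signatures.
model = keras.Sequential([
    layers.Rescaling(1.0 / 255, input_shape=(128, 128, 1)),
    layers.Conv2D(32, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"), layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, epochs=20)
```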
Gesture recognition plays an increasingly important role as the requirements of intelligent systems for human-computer interaction methods increase. To improve the accuracy of the millimeter-wave radar gesture detection algorithm with limited computational resources, this study improves the detection performance in terms of optimized features and interference filtering. The accuracy of the algorithm is improved by refining the combination of gesture features using a self-constructed dataset, and biometric filtering is introduced to reduce the interference of inanimate object motion. Finally, experiments demonstrate the effectiveness of the proposed algorithm in both mitigating interference from inanimate objects and accurately recognizing gestures. Results show a notable 93.29% average reduction in false detections achieved through the integration of biometric filtering into the algorithm's interpretation of target movements. Additionally, the algorithm adeptly identifies the six gestures with an average accuracy of 96.84% on embedded systems.
In recent years, wearable device-based Human Activity Recognition (HAR) models have received significant attention. Previously developed HAR models use hand-crafted features to recognize human activities, leading to the extraction of only basic features. The images captured by wearable sensors contain advanced features, allowing them to be analyzed by deep learning algorithms to enhance the detection and recognition of human actions. Poor lighting and limited sensor capabilities can impact data quality, making the recognition of human actions a challenging task. Unimodal HAR approaches are not suitable in a real-time environment. Therefore, an updated HAR model is developed using multiple types of data and an advanced deep-learning approach. First, the required signals and sensor data are accumulated from standard databases. From these signals, wave features are retrieved. Then the extracted wave features and sensor data are given as input to recognize the human activity. An Adaptive Hybrid Deep Attentive Network (AHDAN) is developed by incorporating a 1D Convolutional Neural Network (1DCNN) with a Gated Recurrent Unit (GRU) for the human activity recognition process; a sketch of this hybrid appears below. Additionally, the Enhanced Archerfish Hunting Optimizer (EAHO) is suggested to fine-tune the network parameters to enhance the recognition process. An experimental evaluation is performed against various deep learning networks and heuristic algorithms to confirm the effectiveness of the proposed HAR model. The EAHO-based HAR model outperforms traditional deep learning networks with an accuracy of 95.36, a recall of 95.25, a specificity of 95.48, and a precision of 95.47. The results prove that the developed model recognizes human actions effectively while taking less time. Additionally, it reduces computational complexity and overfitting by using an optimization approach.
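The following PyTorch sketch shows one way a 1DCNN + GRU hybrid with a simple attention head can be wired for windowed sensor data; the layer widths and the additive attention are assumptions, not the published AHDAN architecture.

```python
import torch
import torch.nn as nn

class CNNGRU(nn.Module):
    """1D-CNN front end for local patterns, GRU for temporal context,
    and attention pooling over time before classification."""
    def __init__(self, in_channels, n_classes, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(in_channels, 32, 5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, 5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2))
        self.gru = nn.GRU(64, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)      # simple additive attention
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                     # x: (B, channels, T)
        h = self.cnn(x).transpose(1, 2)       # (B, T', 64)
        out, _ = self.gru(h)                  # (B, T', hidden)
        w = torch.softmax(self.attn(out), dim=1)
        ctx = (w * out).sum(dim=1)            # attention-pooled context
        return self.head(ctx)
```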
In recent years, the demand for biometric-based human recognition methods has drastically increased to meet privacy and security requirements. Palm prints, palm veins, finger veins, fingerprints, hand veins and other anatomic and behavioral features are utilized in the development of different biometric recognition techniques. Amongst the available biometric recognition techniques, Finger Vein Recognition (FVR) is a general technique that analyzes the patterns of finger veins to authenticate individuals. Deep Learning (DL)-based techniques have gained immense attention in recent years, since they accomplish excellent outcomes in various challenging domains such as computer vision, speech detection and Natural Language Processing (NLP). This makes them a natural fit to overcome the ever-increasing biometric detection problems and cell phone authentication issues in airport security techniques. The current study presents an Automated Biometric Finger Vein Recognition using Evolutionary Algorithm with Deep Learning (ABFVR-EADL) model. The presented ABFVR-EADL model aims to accomplish biometric recognition using the patterns of the finger veins. Initially, the presented ABFVR-EADL model employs the histogram equalization technique to preprocess the input images. For feature extraction, the Salp Swarm Algorithm (SSA) with a Densely-connected Network (DenseNet-201) model is exploited, showing the proposed method's novelty. Finally, a Deep-Stacked Denoising Autoencoder (DSAE) is utilized for biometric recognition; a single denoising layer is sketched below. The proposed ABFVR-EADL method was experimentally validated using benchmark databases, and the outcomes confirmed the productive performance of the proposed ABFVR-EADL model over other DL models.
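As an illustration of the denoising-autoencoder building block, here is a single Keras layer pair; the layer sizes and Gaussian corruption are assumptions, and in a stacked DSAE such layers are typically trained greedily and their encoders composed.

```python
from tensorflow.keras import layers, models

def denoising_autoencoder(dim_in, dim_hidden, noise_std=0.1):
    """One denoising-autoencoder layer: corrupt the input, encode it,
    and train the decoder to reconstruct the clean input (MSE loss)."""
    inp = layers.Input(shape=(dim_in,))
    noisy = layers.GaussianNoise(noise_std)(inp)        # corruption step
    code = layers.Dense(dim_hidden, activation="relu")(noisy)
    recon = layers.Dense(dim_in, activation="linear")(code)
    ae = models.Model(inp, recon)
    ae.compile(optimizer="adam", loss="mse")
    encoder = models.Model(inp, code)                   # reusable encoder
    return ae, encoder

# Greedy stacking sketch: train layer 1 on raw features, then train
# layer 2 on encoder_1's codes, and so on.
```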
The use of voice to perform biometric authentication is an important technological development, because it is a non-invasive identification method and does not require special hardware, so it is less likely to arouse user aversion. This study applies voice recognition technology to a speech-driven interactive voice response questionnaire system, aiming to upgrade the traditional speech system to an intelligent voice response questionnaire network so that the new device may offer enterprises more precise data for customer relationship management (CRM). The intelligent voice response gadget is becoming a new mobile channel, with questionnaire functions built in for the convenience of collecting information on local preferences that can be used for localized promotion and publicity. The authors of this study propose a framework using voice recognition and intelligent analysis models to identify target customers through voice messages gathered in the voice response questionnaire system; that is, transforming the traditional speech system into an intelligent voice complex. The speaker recognition system discussed here employs volume as the acoustic feature in endpoint detection, as the computational load of this method is usually low. To correct two types of errors found in endpoint detection caused by ambient noise, this study suggests ways to improve the situation. First, to reach high accuracy, this study follows a dynamic time warping (DTW)-based method for speaker identification; both ingredients are sketched below. Second, it avoids errors in endpoint detection by filtering noise from voice signals before recognition and deleting any test utterances that might negatively affect the recognition results, in the hope of improving the recognition rate. According to the experimental results, the method proposed in this research has a high recognition rate, whether on personal-level or industrial-level computers, and reaches the standard for practical application. Therefore, the voice management system in this research can serve as a virtual customer-service agent.
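A numpy sketch of the two ingredients, volume-based endpoint detection and DTW matching, is given below; the frame length, hop, and threshold rule are illustrative assumptions.

```python
import numpy as np

def endpoint_detect(signal, frame_len=256, hop=128, margin_db=10.0):
    """Keep the span of frames whose energy exceeds the noise floor plus
    a margin; returns (start, end) sample indices of the utterance."""
    frames = np.lib.stride_tricks.sliding_window_view(signal, frame_len)[::hop]
    vol = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-10)
    active = np.where(vol > vol.min() + margin_db)[0]
    if len(active) == 0:
        return 0, 0
    return active[0] * hop, active[-1] * hop + frame_len

def dtw_distance(a, b):
    """Plain O(len(a)*len(b)) dynamic time warping over feature frames;
    a and b are (T, D) arrays of per-frame acoustic features."""
    D = np.full((len(a) + 1, len(b) + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[-1, -1]

# Identification: pick the enrolled speaker whose template minimises dtw_distance.
```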
Gesture recognition is used in many practical applications such as human-robot interaction, medical rehabilitation and sign language. With increasing motion sensor development, multiple data sources have become available, which leads to the rise of multi-modal gesture recognition. Since our previous approach to gesture recognition depends on a unimodal system, it is difficult to classify similar motion patterns. In order to solve this problem, a novel approach which integrates motion, audio and video models is proposed, using a dataset captured by Kinect. The proposed system recognizes observed gestures by using the three models: their recognition results are integrated by the proposed framework, and the output becomes the final result. The motion and audio models are learned using Hidden Markov Models, while Random Forest, the video classifier, is used to learn the video model; one simple integration rule is sketched below. In the experiments to test the performance of the proposed system, the motion and audio models most suitable for gesture recognition are chosen by varying feature vectors and learning methods. Additionally, the unimodal and multi-modal models are compared with respect to recognition accuracy. All the experiments are conducted on the dataset provided by the organizer of MMGRC, a workshop for the Multi-Modal Gesture Recognition Challenge. The comparison results show that the multi-modal model composed of the three models scores the highest recognition rate. This improvement means that the complementary relationship among the three models improves the accuracy of gesture recognition. The proposed system provides application technology for understanding human actions of daily life more precisely.
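For illustration, here is one simple way to integrate the three models' per-class scores; the paper's exact integration rule is not stated in this abstract, so equal-weight averaging of normalised scores is an assumption.

```python
import numpy as np

def softmax(x):
    """Convert log-likelihood scores into a probability distribution."""
    e = np.exp(x - x.max())
    return e / e.sum()

def integrate(motion_loglik, audio_loglik, video_proba):
    """Average per-class scores from the three models (HMM log-likelihoods
    for motion and audio, Random Forest class probabilities for video)
    and return the winning class index."""
    p = (softmax(motion_loglik) + softmax(audio_loglik) + video_proba) / 3.0
    return int(np.argmax(p))
```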
Background: Gesture recognition has attracted significant attention because of its wide range of potential applications. Although multi-modal gesture recognition has made significant progress in recent years, a popular method is still to simply fuse prediction scores at the end of each branch, which often ignores complementary features among different modalities in the early stage and does not fuse the complementary features into a more discriminative feature. Methods: This paper proposes an Adaptive Cross-modal Weighting (ACmW) scheme to exploit complementary features from RGB-D data. The scheme learns relations among different modalities by combining the features of different data streams. The proposed ACmW module contains two key functions: (1) fusing complementary features from multiple streams through an adaptive one-dimensional convolution (see the sketch below); and (2) modeling the correlation of multi-stream complementary features in the time dimension. Through the effective combination of these two functional modules, the proposed ACmW can automatically analyze the relationship between the complementary features from different streams and fuse them in the spatial and temporal dimensions. Results: Extensive experiments validate the effectiveness of the proposed method and show that it outperforms state-of-the-art methods on IsoGD and NVGesture.
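One plausible reading of function (1) is sketched below: per-stream feature vectors are stacked, a learned 1D convolution produces stream weights, and the streams are fused as a weighted sum. This is a hypothetical reduction offered for illustration only; the published ACmW module is more elaborate.

```python
import torch
import torch.nn as nn

class StreamWeighting(nn.Module):
    """Adaptive 1D-conv weighting over feature streams (sizes assumed)."""
    def __init__(self, n_streams):
        super().__init__()
        self.conv = nn.Conv1d(n_streams, n_streams, kernel_size=3, padding=1)

    def forward(self, feats):                           # feats: (B, S, D)
        scores = self.conv(feats).mean(dim=-1)          # (B, S) per-stream score
        w = torch.softmax(scores, dim=1).unsqueeze(-1)  # (B, S, 1) weights
        return (w * feats).sum(dim=1)                   # fused feature: (B, D)
```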
Iris recognition, as a biometric method, outperforms others because of its high accuracy. The iris is a visible internal organ, so it is stable and very difficult to alter. But if individuals must undergo eye surgery, they may afterwards be rejected as impostors by an iris recognition system, because the iris pattern was altered or damaged during surgery and no longer matches the iris template stored beforehand. In this paper, we are the first to discuss whether refractive surgery for vision correction (LASIK surgery) influences the performance of iris recognition. Experiments are designed and run on iris images captured especially for this research from patients before and after refractive surgery. The experiments showed that refractive surgery has little influence on iris recognition.
Human gait recognition (HGR) is the process of identifying a subject (human) based on their walking pattern. Each subject has a unique walking pattern that cannot be simulated by other subjects. However, gait recognition is not easy, and the task becomes harder when the subject carries an object such as a bag or coat. This article proposes an automated architecture based on deep feature optimization for HGR. To our knowledge, it is the first architecture in which features are fused using multiset canonical correlation analysis (MCCA). In the proposed method, original video frames are processed for all 11 selected angles of the CASIA B dataset and used to train two fine-tuned deep learning models, SqueezeNet and EfficientNet. Deep transfer learning was used to train both fine-tuned models on the selected angles, yielding two new targeted models that were later used for feature engineering. Features are extracted from a deep layer of both fine-tuned models and fused into one vector using MCCA; a two-view sketch of this idea appears below. An improved manta ray foraging optimization algorithm is also proposed to select the best features from the fused feature matrix, which are classified using a narrow neural network classifier. The experimental process was conducted on all 11 angles of the large multi-view gait dataset (CASIA B) and obtained improved accuracy over the state-of-the-art techniques. Moreover, a detailed confidence-interval-based analysis also shows the effectiveness of the proposed architecture for HGR.
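As a two-view stand-in for MCCA (scikit-learn ships only the two-view case), the following sketch projects both deep feature matrices into a shared correlated subspace and concatenates the projections; the component count is an assumption and must not exceed min(n_samples, n_features).

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def cca_fuse(f1, f2, n_components=50):
    """Project two deep feature matrices (n_samples x d1, n_samples x d2)
    into correlated subspaces and concatenate them as the fused vector."""
    cca = CCA(n_components=n_components)
    z1, z2 = cca.fit_transform(f1, f2)
    return np.hstack([z1, z2])    # or z1 + z2 for summation-style fusion
```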
Biometric recognition refers to the process of recognizing a person's identity using physiological or behavioral modalities, such as face, voice, fingerprint, gait, etc. Such biometric modalities are mostly used in recognition tasks separately, as in unimodal systems, or jointly with two or more, as in multimodal systems. Multimodal systems can usually enhance recognition performance over unimodal systems by integrating the biometric data of multiple modalities at different fusion levels. Despite this enhancement, in real-life applications some factors degrade multimodal systems' performance, such as occlusion, face poses, and noise in voice data. In this paper, we propose two algorithms that effectively apply dynamic fusion at the feature level based on the data quality of multimodal biometrics. The proposed algorithms attempt to minimize the negative influence of confusing and low-quality features by either exclusion or weight reduction to achieve better recognition performance. The proposed dynamic fusion was achieved using face and voice biometrics, where face features were extracted using principal component analysis (PCA) and Gabor filters separately, whilst voice features were extracted using Mel-Frequency Cepstral Coefficients (MFCCs). Here, the quality assessment of face images is mainly based on the existence of occlusion, whereas the assessment of voice data quality is substantially based on the calculation of the signal-to-noise ratio (SNR) in the presence of noise; a sketch of this quality-driven weighting appears below. To evaluate the performance of the proposed algorithms, several experiments were conducted using two combinations of three different databases: the AR database and the extended Yale Face Database B for face images, in addition to the VOiCES database for voice data. The obtained results show that both proposed dynamic fusion algorithms attain improved performance and offer more advantages in identification and verification over not only the standard unimodal algorithms but also multimodal algorithms using standard fusion methods.
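The sketch below shows one minimal form of such quality-driven feature weighting; the linear SNR-to-weight mapping and the occlusion-based face score are illustrative assumptions, not the paper's exact rules.

```python
import numpy as np

def snr_db(signal, noise):
    """SNR in dB from separated signal and noise estimates."""
    return 10 * np.log10(np.sum(signal ** 2) / (np.sum(noise ** 2) + 1e-12))

def dynamic_weight_fusion(face_feat, voice_feat, face_quality, voice_snr,
                          snr_floor=0.0, snr_ceil=30.0):
    """Scale each modality's feature vector by a [0, 1] quality score
    before concatenation; setting a weight to 0 excludes the modality."""
    w_voice = np.clip((voice_snr - snr_floor) / (snr_ceil - snr_floor), 0, 1)
    w_face = np.clip(face_quality, 0, 1)   # e.g. 1 - occluded fraction
    return np.hstack([w_face * face_feat, w_voice * voice_feat])
```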
Missing and swapped pets, false insurance claims, and reallocation of pet animals (dogs) are global problems, and research done to solve them is minimal. Traditional biometric and non-biometric methods have their own limitations, and they fail to provide a competent level of security for pet animals (dogs). Work on animal identification based on phenotype appearance (coat patterns) has been an active research area in recent years, but automatic face recognition for dogs has not been reported in the literature. Dog identification needs innovative research to protect the pet animal; it is therefore imperative to initiate research so that future face recognition algorithms will be able to solve this important identification problem for pet animals (such as dogs and cats). In this paper an attempt has been made to minimize the above-mentioned problems through biometric face recognition of dogs. The contributions of this research are: 1) implementation of an existing biometrics algorithm which mitigates the effects of covariates for dogs; 2) a proposed fusion-based method for recognition of pet animals with 94.86% accuracy. Thus in this paper we have tried to demonstrate that face recognition can be used to identify dogs efficiently.
Biometric verification has become essential to authenticate individuals in public and private places. Among several biometrics, the iris has peculiar features and its working mechanism is complex in nature. Recent developments in Machine Learning and Deep Learning approaches enable the development of effective iris recognition models. With this motivation, the current study introduces a novel Chaotic Krill Herd with Deep Transfer Learning Based Biometric Iris Recognition System (CKHDTL-BIRS). The presented CKHDTL-BIRS model intends to recognize and classify iris images as part of biometric verification. To achieve this, the CKHDTL-BIRS model initially performs Median Filtering (MF)-based preprocessing and segmentation for iris localization. In addition, the MobileNet model is utilized to generate a set of useful feature vectors; these first two stages are sketched below. Moreover, a Stacked Sparse Autoencoder (SSAE) approach is applied for classification. At last, the CKH algorithm is exploited to optimize the parameters involved in the SSAE technique. The proposed CKHDTL-BIRS model was experimentally validated using a benchmark dataset and the outcomes were examined under several aspects. The comparison study results established the enhanced performance of the CKHDTL-BIRS technique over recent approaches.
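A sketch of those first two stages, median filtering followed by MobileNet feature extraction, follows; using ImageNet weights with average pooling is an assumption standing in for the study's trained extractor, and the kernel size is illustrative.

```python
import cv2
import numpy as np
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.applications.mobilenet import preprocess_input

# ImageNet-pretrained MobileNet as a fixed 1024-d feature extractor.
backbone = MobileNet(weights="imagenet", include_top=False, pooling="avg")

def iris_features(path):
    """Median-filter an iris image, then extract a deep feature vector."""
    img = cv2.imread(path)
    img = cv2.medianBlur(img, 5)                 # Median Filtering (MF) step
    img = cv2.resize(img, (224, 224))
    x = preprocess_input(img.astype(np.float32)[None])
    return backbone.predict(x, verbose=0)[0]     # 1024-d feature vector
```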
This study describes the development of a simple biometric facial recognition system, BFMT, which is designed for use in identifying individuals within a given population. The system is based on digital signatures derived from facial images of human subjects. The results of the study demonstrate that a particular set of facial features from a simple two-dimensional image can yield a unique digital signature which can be used to identify a subject from a limited population within a controlled environment. The simplicity of the model upon which the system is based can result in commercial facial recognition systems that are more cost-effective to develop than those currently on the market.
Human motion recognition is a research hotspot in the field of computer vision, with a wide range of applications including biometrics, intelligent surveillance and human-computer interaction. In vision-based human motion recognition, the main input modes are RGB, depth images and skeleton data. Each mode captures some kind of information that is likely to be complementary to the others; for example, some modes capture global information while others capture local details of an action. Intuitively, fusing multiple modal data can improve recognition accuracy. In addition, how to correctly model and utilize spatiotemporal information is one of the challenges facing human motion recognition. Focusing on the feature extraction methods involved in human action recognition tasks in video, this paper summarizes traditional manual feature extraction methods from the aspects of global and local feature extraction, and introduces in detail the feature learning models commonly used in deep learning-based feature extraction. The paper summarizes the opportunities and challenges in the field of motion recognition and looks forward to possible future research directions.
Biometric security systems based on facial characteristics face a challenging task due to variability in the intrapersonal facial appearance of subjects traced to factors such as pose, illumination, expression and aging. This paper innovates as it proposes a deep learning and set-based approach to face recognition subject to aging. The images for each subject taken at various times are treated as a single set, which is then compared to sets of images belonging to other subjects. Facial features are extracted using a convolutional neural network characteristic of deep learning. Our experimental results show that set-based recognition performs better than the singleton-based approach for both face identification and face verification. We also find that by using set-based recognition, it is easier to recognize older subjects from younger ones rather than younger subjects from older ones.
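To make the set-based comparison described above concrete, here is a minimal sketch that scores two sets of face embeddings by their closest pair; the min-of-cosine-distances rule is an illustrative choice, as the exact set metric is not specified in this abstract.

```python
import numpy as np

def set_distance(set_a, set_b):
    """Minimum pairwise cosine distance between two (n, d) and (m, d)
    arrays of face embeddings: the closest cross-set pair decides."""
    A = set_a / np.linalg.norm(set_a, axis=1, keepdims=True)
    B = set_b / np.linalg.norm(set_b, axis=1, keepdims=True)
    return float((1.0 - A @ B.T).min())

def identify(probe_set, gallery):
    """gallery maps subject id -> array of embeddings collected over time;
    returns the subject whose set is closest to the probe set."""
    return min(gallery, key=lambda sid: set_distance(probe_set, gallery[sid]))
```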
基金funded by the National Natural Science Foundation of China(61991413)the China Postdoctoral Science Foundation(2019M651142)+1 种基金the Natural Science Foundation of Liaoning Province(2021-KF-12-07)the Natural Science Foundations of Liaoning Province(2023-MS-322).
文摘Fusing hand-based features in multi-modal biometric recognition enhances anti-spoofing capabilities.Additionally,it leverages inter-modal correlation to enhance recognition performance.Concurrently,the robustness and recognition performance of the system can be enhanced through judiciously leveraging the correlation among multimodal features.Nevertheless,two issues persist in multi-modal feature fusion recognition:Firstly,the enhancement of recognition performance in fusion recognition has not comprehensively considered the inter-modality correlations among distinct modalities.Secondly,during modal fusion,improper weight selection diminishes the salience of crucial modal features,thereby diminishing the overall recognition performance.To address these two issues,we introduce an enhanced DenseNet multimodal recognition network founded on feature-level fusion.The information from the three modalities is fused akin to RGB,and the input network augments the correlation between modes through channel correlation.Within the enhanced DenseNet network,the Efficient Channel Attention Network(ECA-Net)dynamically adjusts the weight of each channel to amplify the salience of crucial information in each modal feature.Depthwise separable convolution markedly reduces the training parameters and further enhances the feature correlation.Experimental evaluations were conducted on four multimodal databases,comprising six unimodal databases,including multispectral palmprint and palm vein databases from the Chinese Academy of Sciences.The Equal Error Rates(EER)values were 0.0149%,0.0150%,0.0099%,and 0.0050%,correspondingly.In comparison to other network methods for palmprint,palm vein,and finger vein fusion recognition,this approach substantially enhances recognition performance,rendering it suitable for high-security environments with practical applicability.The experiments in this article utilized amodest sample database comprising 200 individuals.The subsequent phase involves preparing for the extension of the method to larger databases.
基金supported by the“Human Resources Program in Energy Technol-ogy”of the Korea Institute of Energy Technology Evaluation and Planning(KETEP)and Granted Financial Resources from the Ministry of Trade,Industry,and Energy,Republic of Korea(No.20204010600090)The funding of this work was provided by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number(PNURSP2023R410),Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘The demand for a non-contact biometric approach for candidate identification has grown over the past ten years.Based on the most important biometric application,human gait analysis is a significant research topic in computer vision.Researchers have paid a lot of attention to gait recognition,specifically the identification of people based on their walking patterns,due to its potential to correctly identify people far away.Gait recognition systems have been used in a variety of applications,including security,medical examinations,identity management,and access control.These systems require a complex combination of technical,operational,and definitional considerations.The employment of gait recognition techniques and technologies has produced a number of beneficial and well-liked applications.Thiswork proposes a novel deep learning-based framework for human gait classification in video sequences.This framework’smain challenge is improving the accuracy of accuracy gait classification under varying conditions,such as carrying a bag and changing clothes.The proposed method’s first step is selecting two pre-trained deep learningmodels and training fromscratch using deep transfer learning.Next,deepmodels have been trained using static hyperparameters;however,the learning rate is calculated using the particle swarmoptimization(PSO)algorithm.Then,the best features are selected from both trained models using the Harris Hawks controlled Sine-Cosine optimization algorithm.This algorithm chooses the best features,combined in a novel correlation-based fusion technique.Finally,the fused best features are categorized using medium,bi-layer,and tri-layered neural networks.On the publicly accessible dataset known as the CASIA-B dataset,the experimental process of the suggested technique was carried out,and an improved accuracy of 94.14% was achieved.The achieved accuracy of the proposed method is improved by the recent state-of-the-art techniques that show the significance of this work.
基金supported and funded by KAU Scientific Endowment,King Abdulaziz University,Jeddah,Saudi Arabia.
文摘Human recognition technology based on biometrics has become a fundamental requirement in all aspects of life due to increased concerns about security and privacy issues.Therefore,biometric systems have emerged as a technology with the capability to identify or authenticate individuals based on their physiological and behavioral characteristics.Among different viable biometric modalities,the human ear structure can offer unique and valuable discriminative characteristics for human recognition systems.In recent years,most existing traditional ear recognition systems have been designed based on computer vision models and have achieved successful results.Nevertheless,such traditional models can be sensitive to several unconstrained environmental factors.As such,some traits may be difficult to extract automatically but can still be semantically perceived as soft biometrics.This research proposes a new group of semantic features to be used as soft ear biometrics,mainly inspired by conventional descriptive traits used naturally by humans when identifying or describing each other.Hence,the research study is focused on the fusion of the soft ear biometric traits with traditional(hard)ear biometric features to investigate their validity and efficacy in augmenting human identification performance.The proposed framework has two subsystems:first,a computer vision-based subsystem,extracting traditional(hard)ear biometric traits using principal component analysis(PCA)and local binary patterns(LBP),and second,a crowdsourcing-based subsystem,deriving semantic(soft)ear biometric traits.Several feature-level fusion experiments were conducted using the AMI database to evaluate the proposed algorithm’s performance.The obtained results for both identification and verification showed that the proposed soft ear biometric information significantly improved the recognition performance of traditional ear biometrics,reaching up to 12%for LBP and 5%for PCA descriptors;when fusing all three capacities PCA,LBP,and soft traits using k-nearest neighbors(KNN)classifier.
基金The research is supported by Nature Science Foundation of Zhejiang Province(LQ20F020008)“Pioneer”and“Leading Goose”R&D Program of Zhejiang(Grant Nos.2023C03203,2023C01150).
文摘The rapid growth of smart technologies and services has intensified the challenges surrounding identity authenti-cation techniques.Biometric credentials are increasingly being used for verification due to their advantages over traditional methods,making it crucial to safeguard the privacy of people’s biometric data in various scenarios.This paper offers an in-depth exploration for privacy-preserving techniques and potential threats to biometric systems.It proposes a noble and thorough taxonomy survey for privacy-preserving techniques,as well as a systematic framework for categorizing the field’s existing literature.We review the state-of-the-art methods and address their advantages and limitations in the context of various biometric modalities,such as face,fingerprint,and eye detection.The survey encompasses various categories of privacy-preserving mechanisms and examines the trade-offs between security,privacy,and recognition performance,as well as the issues and future research directions.It aims to provide researchers,professionals,and decision-makers with a thorough understanding of the existing privacy-preserving solutions in biometric recognition systems and serves as the foundation of the development of more secure and privacy-preserving biometric technologies.
文摘Fine-grained recognition of ships based on remote sensing images is crucial to safeguarding maritime rights and interests and maintaining national security.Currently,with the emergence of massive high-resolution multi-modality images,the use of multi-modality images for fine-grained recognition has become a promising technology.Fine-grained recognition of multi-modality images imposes higher requirements on the dataset samples.The key to the problem is how to extract and fuse the complementary features of multi-modality images to obtain more discriminative fusion features.The attention mechanism helps the model to pinpoint the key information in the image,resulting in a significant improvement in the model’s performance.In this paper,a dataset for fine-grained recognition of ships based on visible and near-infrared multi-modality remote sensing images has been proposed first,named Dataset for Multimodal Fine-grained Recognition of Ships(DMFGRS).It includes 1,635 pairs of visible and near-infrared remote sensing images divided into 20 categories,collated from digital orthophotos model provided by commercial remote sensing satellites.DMFGRS provides two types of annotation format files,as well as segmentation mask images corresponding to the ship targets.Then,a Multimodal Information Cross-Enhancement Network(MICE-Net)fusing features of visible and near-infrared remote sensing images,has been proposed.In the network,a dual-branch feature extraction and fusion module has been designed to obtain more expressive features.The Feature Cross Enhancement Module(FCEM)achieves the fusion enhancement of the two modal features by making the channel attention and spatial attention work cross-functionally on the feature map.A benchmark is established by evaluating state-of-the-art object recognition algorithms on DMFGRS.MICE-Net conducted experiments on DMFGRS,and the precision,recall,mAP0.5 and mAP0.5:0.95 reached 87%,77.1%,83.8%and 63.9%,respectively.Extensive experiments demonstrate that the proposed MICE-Net has more excellent performance on DMFGRS.Built on lightweight network YOLO,the model has excellent generalizability,and thus has good potential for application in real-life scenarios.
文摘Dynamic signature is a biometric modality that recognizes an individual’s anatomic and behavioural characteristics when signing their name. The rampant case of signature falsification (Identity Theft) was the key motivating factor for embarking on this study. This study was necessitated by the damages and dangers posed by signature forgery coupled with the intractable nature of the problem. The aim and objectives of this study is to design a proactive and responsive system that could compare two signature samples and detect the correct signature against the forged one. Dynamic Signature verification is an important biometric technique that aims to detect whether a given signature is genuine or forged. In this research work, Convolutional Neural Networks (CNNsor ConvNet) which is a class of deep, feed forward artificial neural networks that has successfully been applied to analysing visual imagery was used to train the model. The signature images are stored in a file directory structure which the Keras Python library can work with. Then the CNN was implemented in python using the Keras with the TensorFlow backend to learn the patterns associated with the signature. The result showed that for the same CNNs-based network experimental result of average accuracy, the larger the training dataset, the higher the test accuracy. However, when the training dataset are insufficient, better results can be obtained. The paper concluded that by training datasets using CNNs network, 98% accuracy in the result was recorded, in the experimental part, the model achieved a high degree of accuracy in the classification of the biometric parameters used.
基金supported by the National Natural Science Foundation of China(No.12172076)。
文摘Gesture recognition plays an increasingly important role as the requirements of intelligent systems for human-computer interaction methods increase.To improve the accuracy of the millimeter-wave radar gesture detection algorithm with limited computational resources,this study improves the detection performance in terms of optimized features and interference filtering.The accuracy of the algorithm is improved by refining the combination of gesture features using a self-constructed dataset,and biometric filtering is introduced to reduce the interference of inanimate object motion.Finally,experiments demonstrate the effectiveness of the proposed algorithm in both mitigating interference from inanimate objects and accurately recognizing gestures.Results show a notable 93.29%average reduction in false detections achieved through the integration of biometric filtering into the algorithm’s interpretation of target movements.Additionally,the algorithm adeptly identifies the six gestures with an average accuracy of 96.84%on embedded systems.
文摘In recent years,wearable devices-based Human Activity Recognition(HAR)models have received significant attention.Previously developed HAR models use hand-crafted features to recognize human activities,leading to the extraction of basic features.The images captured by wearable sensors contain advanced features,allowing them to be analyzed by deep learning algorithms to enhance the detection and recognition of human actions.Poor lighting and limited sensor capabilities can impact data quality,making the recognition of human actions a challenging task.The unimodal-based HAR approaches are not suitable in a real-time environment.Therefore,an updated HAR model is developed using multiple types of data and an advanced deep-learning approach.Firstly,the required signals and sensor data are accumulated from the standard databases.From these signals,the wave features are retrieved.Then the extracted wave features and sensor data are given as the input to recognize the human activity.An Adaptive Hybrid Deep Attentive Network(AHDAN)is developed by incorporating a“1D Convolutional Neural Network(1DCNN)”with a“Gated Recurrent Unit(GRU)”for the human activity recognition process.Additionally,the Enhanced Archerfish Hunting Optimizer(EAHO)is suggested to fine-tune the network parameters for enhancing the recognition process.An experimental evaluation is performed on various deep learning networks and heuristic algorithms to confirm the effectiveness of the proposed HAR model.The EAHO-based HAR model outperforms traditional deep learning networks with an accuracy of 95.36,95.25 for recall,95.48 for specificity,and 95.47 for precision,respectively.The result proved that the developed model is effective in recognizing human action by taking less time.Additionally,it reduces the computation complexity and overfitting issue through using an optimization approach.
基金The Deanship of Scientific Research(DSR)at King Abdulaziz University(KAU),Jeddah,Saudi Arabia has funded this project,under Grant No.KEP-3-120-42.
文摘In recent years,the demand for biometric-based human recog-nition methods has drastically increased to meet the privacy and security requirements.Palm prints,palm veins,finger veins,fingerprints,hand veins and other anatomic and behavioral features are utilized in the development of different biometric recognition techniques.Amongst the available biometric recognition techniques,Finger Vein Recognition(FVR)is a general technique that analyzes the patterns of finger veins to authenticate the individuals.Deep Learning(DL)-based techniques have gained immense attention in the recent years,since it accomplishes excellent outcomes in various challenging domains such as computer vision,speech detection and Natural Language Processing(NLP).This technique is a natural fit to overcome the ever-increasing biomet-ric detection problems and cell phone authentication issues in airport security techniques.The current study presents an Automated Biometric Finger Vein Recognition using Evolutionary Algorithm with Deep Learning(ABFVR-EADL)model.The presented ABFVR-EADL model aims to accomplish bio-metric recognition using the patterns of the finger veins.Initially,the presented ABFVR-EADL model employs the histogram equalization technique to pre-process the input images.For feature extraction,the Salp Swarm Algorithm(SSA)with Densely-connected Networks(DenseNet-201)model is exploited,showing the proposed method’s novelty.Finally,the Deep-Stacked Denoising Autoencoder(DSAE)is utilized for biometric recognition.The proposed ABFVR-EADL method was experimentally validated using the benchmark databases,and the outcomes confirmed the productive performance of the proposed ABFVR-EADL model over other DL models.
文摘The use of voice to perform biometric authentication is an importanttechnological development,because it is a non-invasive identification methodand does not require special hardware,so it is less likely to arouse user disgust.This study tries to apply the voice recognition technology to the speech-driveninteractive voice response questionnaire system aiming to upgrade the traditionalspeech system to an intelligent voice response questionnaire network so that thenew device may offer enterprises more precise data for customer relationshipmanagement(CRM).The intelligence-type voice response gadget is becominga new mobile channel at the current time,with functions of the questionnaireto be built in for the convenience of collecting information on local preferencesthat can be used for localized promotion and publicity.Authors of this study propose a framework using voice recognition and intelligent analysis models to identify target customers through voice messages gathered in the voice response questionnaire system;that is,transforming the traditional speech system to anintelligent voice complex.The speaker recognition system discussed hereemploys volume as the acoustic feature in endpoint detection as the computationload is usually low in this method.To correct two types of errors found in the endpoint detection practice because of ambient noise,this study suggests ways toimprove the situation.First,to reach high accuracy,this study follows a dynamictime warping(DTW)based method to gain speaker identification.Second,it isdevoted to avoiding any errors in endpoint detection by filtering noise from voicesignals before getting recognition and deleting any test utterances that might negatively affect the results of recognition.It is hoped that by so doing the recognitionrate is improved.According to the experimental results,the method proposed inthis research has a high recognition rate,whether it is on personal-level or industrial-level computers,and can reach the practical application standard.Therefore,the voice management system in this research can be regarded as Virtual customerservice staff to use.
基金Supported by Grant-in-Aid for Young Scientists(A)(Grant No.26700021)Japan Society for the Promotion of Science and Strategic Information and Communications R&D Promotion Programme(Grant No.142103011)Ministry of Internal Affairs and Communications
文摘Gesture recognition is used in many practical applications such as human-robot interaction, medical rehabilitation and sign language. With increasing motion sensor development, multiple data sources have become available, which leads to the rise of multi-modal gesture recognition. Since our previous approach to gesture recognition depends on a unimodal system, it is difficult to classify similar motion patterns. In order to solve this problem, a novel approach which integrates motion, audio and video models is proposed by using dataset captured by Kinect. The proposed system can recognize observed gestures by using three models. Recognition results of three models are integrated by using the proposed framework and the output becomes the final result. The motion and audio models are learned by using Hidden Markov Model. Random Forest which is the video classifier is used to learn the video model. In the experiments to test the performances of the proposed system, the motion and audio models most suitable for gesture recognition are chosen by varying feature vectors and learning methods. Additionally, the unimodal and multi-modal models are compared with respect to recognition accuracy. All the experiments are conducted on dataset provided by the competition organizer of MMGRC, which is a workshop for Multi-Modal Gesture Recognition Challenge. The comparison results show that the multi-modal model composed of three models scores the highest recognition rate. This improvement of recognition accuracy means that the complementary relationship among three models improves the accuracy of gesture recognition. The proposed system provides the application technology to understand human actions of daily life more precisely.
基金the Chinese National Natural Science Foundation Projects(61961160704,61876179)the Key Project of the General Logistics Department(ASW17C001)the Science and Technology Development Fund of Macao(0010/2019/AFJ,0025/2019/AKP).
文摘Background Gesture recognition has attracted significant attention because of its wide range of potential applications.Although multi-modal gesture recognition has made significant progress in recent years,a popular method still is simply fusing prediction scores at the end of each branch,which often ignores complementary features among different modalities in the early stage and does not fuse the complementary features into a more discriminative feature.Methods This paper proposes an Adaptive Cross-modal Weighting(ACmW)scheme to exploit complementarity features from RGB-D data in this study.The scheme learns relations among different modalities by combining the features of different data streams.The proposed ACmW module contains two key functions:(1)fusing complementary features from multiple streams through an adaptive one-dimensional convolution;and(2)modeling the correlation of multi-stream complementary features in the time dimension.Through the effective combination of these two functional modules,the proposed ACmW can automatically analyze the relationship between the complementary features from different streams,and can fuse them in the spatial and temporal dimensions.Results Extensive experiments validate the effectiveness of the proposed method,and show that our method outperforms state-of-the-art methods on IsoGD and NVGesture.
Funding: Project supported by the National Natural Science Foundation of China (No. 60427002) and the National Hi-Tech Research and Development Program (863) of China (No. 2006AA01Z119).
Abstract: Iris recognition, as a biometric method, outperforms others because of its high accuracy. The iris is a visible yet internal organ, so it is stable and very difficult to alter. However, if an individual must undergo eye surgery, the iris recognition system may reject them as an impostor afterwards, because the iris pattern was altered or damaged during surgery and no longer matches the iris template stored beforehand. In this paper, we are the first to discuss whether refractive surgery for vision correction (LASIK surgery) influences the performance of iris recognition. Experiments were designed and run on iris images captured especially for this research from patients before and after refractive surgery. The experiments showed that refractive surgery has little influence on iris recognition.
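For context on how such a pre/post-surgery comparison is typically scored, here is a sketch using the standard normalized Hamming distance between binary iris codes; this is the conventional matching metric, not necessarily the paper's exact pipeline, and the code length, mask, and bit-flip count are invented for illustration.

```python
import numpy as np

# Normalized Hamming distance: fraction of disagreeing bits over the
# jointly valid (unoccluded) bits of two binary iris codes.
def hamming_distance(code_a, code_b, mask_a, mask_b):
    valid = mask_a & mask_b
    return np.count_nonzero((code_a ^ code_b) & valid) / np.count_nonzero(valid)

rng = np.random.default_rng(0)
before = rng.integers(0, 2, 2048, dtype=np.uint8)
after = before.copy()
after[rng.choice(2048, 100, replace=False)] ^= 1   # simulate slight change
mask = np.ones(2048, dtype=np.uint8)
# ~0.05: far below the ~0.32 threshold commonly cited for a match.
print(hamming_distance(before, after, mask, mask))
```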
Funding: Supported by the MSIT (Ministry of Science and ICT), Korea, under the ICAN (ICT Challenge and Advanced Network of HRD) program (IITP-2022-2020-0-01832), supervised by the IITP (Institute of Information & Communications Technology Planning & Evaluation), and by the Soonchunhyang University Research Fund.
Abstract: Human gait recognition (HGR) is the process of identifying a subject (human) based on their walking pattern. Each subject has a unique walking pattern that cannot be simulated by other subjects. However, gait recognition is not easy, and the task becomes harder when the subject carries an object such as a bag or wears a coat. This article proposes an automated architecture based on deep feature optimization for HGR. To our knowledge, it is the first architecture in which features are fused using multiset canonical correlation analysis (MCCA). In the proposed method, original video frames are processed for all 11 selected angles of the CASIA B dataset and used to train two fine-tuned deep learning models, SqueezeNet and EfficientNet. Deep transfer learning was used to train both fine-tuned models on the selected angles, yielding two new targeted models that were later used for feature engineering. Features are extracted from a deep layer of both fine-tuned models and fused into one vector using MCCA, as sketched below. An improved manta ray foraging optimization algorithm is also proposed to select the best features from the fused feature matrix, which are classified using a narrow neural network classifier. The experimental process was conducted on all 11 angles of the large multi-view gait dataset (CASIA B), obtaining better accuracy than state-of-the-art techniques. Moreover, a detailed confidence-interval-based analysis shows the effectiveness of the proposed architecture for HGR.
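As a rough illustration of correlation-based feature fusion, the sketch below projects two feature matrices (stand-ins for the SqueezeNet and EfficientNet features) onto maximally correlated components with two-view CCA and concatenates the projections; MCCA generalizes this to more than two sets. All shapes and the random data are assumptions.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

# Two synthetic feature matrices standing in for deep features from two
# fine-tuned backbones: 200 samples each, different dimensionalities.
rng = np.random.default_rng(0)
feats_a = rng.standard_normal((200, 256))
feats_b = rng.standard_normal((200, 320))

# Project both views onto 32 maximally correlated components.
cca = CCA(n_components=32)
proj_a, proj_b = cca.fit_transform(feats_a, feats_b)

# Fuse by concatenating the correlated projections into one vector per sample.
fused = np.concatenate([proj_a, proj_b], axis=1)
print(fused.shape)  # (200, 64)
```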
Abstract: Biometric recognition refers to the process of recognizing a person's identity using physiological or behavioral modalities, such as face, voice, fingerprint, and gait. Such biometric modalities are mostly used separately in recognition tasks, as in unimodal systems, or jointly with two or more, as in multimodal systems. Multimodal systems can usually enhance recognition performance over unimodal systems by integrating the biometric data of multiple modalities at different fusion levels. Despite this enhancement, in real-life applications some factors degrade multimodal systems' performance, such as occlusion, face poses, and noise in voice data. In this paper, we propose two algorithms that effectively apply dynamic fusion at the feature level based on the data quality of multimodal biometrics. The proposed algorithms attempt to minimize the negative influence of confusing and low-quality features by either exclusion or weight reduction to achieve better recognition performance. The proposed dynamic fusion was achieved using face and voice biometrics, where face features were extracted using principal component analysis (PCA) and Gabor filters separately, while voice features were extracted using Mel-Frequency Cepstral Coefficients (MFCCs). Here, the quality assessment of face images is based mainly on the existence of occlusion, whereas the assessment of voice data quality rests substantially on the signal-to-noise ratio (SNR) as a measure of noise. To evaluate the performance of the proposed algorithms, several experiments were conducted using two combinations of three different databases: the AR database and the extended Yale Face Database B for face images, in addition to the VOiCES database for voice data. The results show that both proposed dynamic fusion algorithms attain improved performance and offer more advantages in identification and verification over not only standard unimodal algorithms but also multimodal algorithms using standard fusion methods.
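A hedged sketch of the quality-driven idea follows: voice features are down-weighted as the estimated SNR drops, and face features are excluded when occlusion is detected. The thresholds, the linear weighting curve, and the feature sizes are illustrative, not the paper's values.

```python
import numpy as np

# Dynamic feature-level fusion driven by data quality: occlusion gates the
# face features, SNR scales the voice features before concatenation.
def dynamic_fuse(face_feat, voice_feat, snr_db, face_occluded,
                 snr_floor=5.0, snr_ceiling=30.0):
    w_voice = np.clip((snr_db - snr_floor) / (snr_ceiling - snr_floor), 0, 1)
    w_face = 0.0 if face_occluded else 1.0   # exclusion on detected occlusion
    return np.concatenate([w_face * face_feat, w_voice * voice_feat])

face = np.ones(128)   # e.g., PCA/Gabor face features (size assumed)
voice = np.ones(39)   # e.g., MFCC-based voice features (size assumed)
print(dynamic_fuse(face, voice, snr_db=12.0, face_occluded=False)[:3])
```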
Abstract: Missing pets, swapping, false insurance claims, and reallocation of pet animals (dogs) are problems throughout the world, and research on solving them is minimal. Traditional biometric and non-biometric methods have their own limitations and fail to provide a competent level of security for pet animals (dogs). Animal identification based on phenotype appearance (coat patterns) has been an active research area in recent years, but automatic face recognition for dogs has not been reported in the literature. Dog identification needs innovative research to protect pet animals, so it is imperative to initiate research that will let future face recognition algorithms solve this important identification problem for pet animals (such as dogs and cats). In this paper, an attempt has been made to mitigate the above problems through biometric face recognition of dogs. The contributions of this research are: 1) implementation of an existing biometric algorithm that mitigates the effects of covariates for dogs; and 2) a proposed fusion-based method for recognizing pet animals with 94.86% accuracy. This paper thus demonstrates that face recognition can be used to identify dogs efficiently.
Abstract: Biometric verification has become essential for authenticating individuals in public and private places. Among biometrics, the iris has peculiar features and a complex working mechanism. Recent developments in machine learning and deep learning approaches enable the development of effective iris recognition models. With this motivation, the current study introduces a novel Chaotic Krill Herd with Deep Transfer Learning Based Biometric Iris Recognition System (CKHDTL-BIRS). The presented CKHDTL-BIRS model recognizes and classifies iris images as part of biometric verification. To achieve this, the CKHDTL-BIRS model initially performs Median Filtering (MF)-based preprocessing and segmentation for iris localization. A MobileNet model is then utilized to generate a set of useful feature vectors, and a Stacked Sparse Autoencoder (SSAE) is applied for classification. Finally, the CKH algorithm is exploited to optimize the parameters involved in the SSAE technique. The proposed CKHDTL-BIRS model was experimentally validated using a benchmark dataset, and the outcomes were examined from several aspects. The comparison results establish the enhanced performance of the CKHDTL-BIRS technique over recent approaches.
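A sketch of the front half of such a pipeline is shown below: median-filter preprocessing followed by MobileNet feature extraction. The SSAE classifier and CKH parameter tuning are omitted, the input is a random stand-in image, and the choice of MobileNetV2 with its classifier head removed is our assumption.

```python
import numpy as np
import torch
import torchvision.models as models
from scipy.ndimage import median_filter

# Median-filter preprocessing on a stand-in iris image.
image = np.random.rand(224, 224, 3).astype(np.float32)
denoised = median_filter(image, size=(3, 3, 1))

# MobileNetV2 backbone with the classifier head replaced, so the forward
# pass yields 1280-D feature vectors (pretrained weights would be used
# in practice; weights=None keeps this sketch self-contained).
backbone = models.mobilenet_v2(weights=None)
backbone.classifier = torch.nn.Identity()
backbone.eval()

x = torch.from_numpy(denoised).permute(2, 0, 1).unsqueeze(0)
with torch.no_grad():
    features = backbone(x)
print(features.shape)  # torch.Size([1, 1280])
```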
Abstract: This study describes the development of a simple biometric facial recognition system, BFMT, which is designed for use in identifying individuals within a given population. The system is based on digital signatures derived from facial images of human subjects. The results of the study demonstrate that a particular set of facial features from a simple two-dimensional image can yield a unique digital signature which can be used to identify a subject from a limited population within a controlled environment. The simplicity of the underlying model can result in commercial facial recognition systems that are more cost-effective to develop than those currently on the market.
Funding: 2021 scientific research funding project of the Liaoning Provincial Education Department (Research and Implementation of a University Scientific Research Information Platform Serving the Transformation of Achievements).
Abstract: Human motion recognition is a research hotspot in the field of computer vision with a wide range of applications, including biometrics, intelligent surveillance, and human-computer interaction. In vision-based human motion recognition, the main input modes are RGB, depth images, and skeleton data. Each mode captures some kind of information that is likely to be complementary to the others; for example, some modes capture global information while others capture local details of an action. Intuitively, fusing multiple modal data can improve recognition accuracy. In addition, how to correctly model and utilize spatiotemporal information is one of the challenges facing human motion recognition. Focusing on the feature extraction methods involved in human action recognition tasks in video, this paper summarizes traditional manual feature extraction from the aspects of global and local feature extraction, and introduces in detail the feature learning models commonly used in deep-learning-based feature extraction. The paper also summarizes the opportunities and challenges in the field of motion recognition and looks forward to possible future research directions.
Abstract: Biometric security systems based on facial characteristics face a challenging task due to variability in the intrapersonal facial appearance of subjects, traced to factors such as pose, illumination, expression, and aging. This paper innovates by proposing a deep learning, set-based approach to face recognition subject to aging. The images of each subject taken at various times are treated as a single set, which is then compared to the sets of images belonging to other subjects. Facial features are extracted using a convolutional neural network characteristic of deep learning. Our experimental results show that set-based recognition performs better than the singleton-based approach for both face identification and face verification. We also find that with set-based recognition it is easier to recognize older subjects from younger ones than younger subjects from older ones.
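A minimal sketch of set-based matching follows, under the assumption that set similarity is the best (minimum) pairwise distance between two sets of CNN embeddings; the embedding size, set size, and distance rule are illustrative, not the paper's exact method.

```python
import numpy as np

# Set-to-set distance: smallest pairwise Euclidean distance between the
# embedding sets of two subjects (each row is one image's embedding).
def set_distance(set_a, set_b):
    diffs = set_a[:, None, :] - set_b[None, :, :]
    return np.sqrt((diffs ** 2).sum(-1)).min()

rng = np.random.default_rng(1)
subject = rng.standard_normal((5, 128))               # 5 images, 128-D each
same = subject + 0.1 * rng.standard_normal((5, 128))  # slight aging drift
other = rng.standard_normal((5, 128))                 # a different subject
# The genuine set pair should be far closer than the impostor pair.
print(set_distance(subject, same) < set_distance(subject, other))  # True
```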