In the digital age, non-touch communication technologies are reshaping human-device interactions and raising security concerns. A major challenge in current technology is the misinterpretation of gestures by sensors and cameras, often caused by environmental factors. This issue has spurred the need for advanced data processing methods to achieve more accurate gesture recognition and predictions. Our study presents a novel virtual keyboard allowing character input via distinct hand gestures, focusing on two key aspects: hand gesture recognition and character input mechanisms. We developed a novel model with LSTM and fully connected layers for enhanced sequential data processing and hand gesture recognition. We also integrated CNN, max-pooling, and dropout layers for improved spatial feature extraction. This model architecture processes both temporal and spatial aspects of hand gestures, using LSTM to extract complex patterns from frame sequences for a comprehensive understanding of input data. Our unique dataset, essential for training the model, includes 1,662 landmarks from dynamic hand gestures, 33 postures, and 468 face landmarks, all captured in real-time using advanced pose estimation. The model demonstrated high accuracy, achieving 98.52% in hand gesture recognition and over 97% in character input across different scenarios. Its excellent performance in real-time testing underlines its practicality and effectiveness, marking a significant advancement in enhancing human-device interactions in the digital age.
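The figure of 1,662 values per frame quoted in the abstract above is consistent with a holistic pose-estimation output, and the short sketch below checks that arithmetic. The per-landmark layout used here (33 pose landmarks with x, y, z and visibility; 468 face landmarks and 21 landmarks per hand with x, y, z) is an assumption based on the common MediaPipe Holistic convention; the abstract does not name the pose estimator or the layout.

```python
# Assumed per-frame layout (not stated in the abstract):
# name -> (number of landmarks, values stored per landmark)
LAYOUT = {
    "pose": (33, 4),        # x, y, z, visibility
    "face": (468, 3),       # x, y, z
    "left_hand": (21, 3),   # x, y, z
    "right_hand": (21, 3),  # x, y, z
}


def frame_vector_length(layout=LAYOUT):
    """Length of the flattened feature vector for one video frame."""
    return sum(count * dims for count, dims in layout.values())


total = frame_vector_length()
print(total)
```

Under this assumed layout, a gesture clip of T frames would be a T x 1,662 array, which is the kind of sequence input an LSTM layer consumes.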
Machine learning is a technique for analyzing data that aids the construction of mathematical models. Because of the growth of the Internet of Things (IoT) and wearable sensor devices, gesture interfaces are becoming a more natural and expedient human-machine interaction method. This type of artificial intelligence, which requires minimal or no direct human intervention in decision-making, is predicated on the ability of intelligent systems to self-train and detect patterns. The rise of touch-free applications and the number of deaf people have increased the significance of hand gesture recognition. Potential applications of hand gesture recognition research span from online gaming to surgical robotics. The location of the hands, the alignment of the fingers, and the hand-to-body posture are the fundamental components of hierarchical emotions in gestures. Linguistic gestures may be difficult to distinguish from nonsensical motions in the field of gesture recognition. In this scenario, it may be difficult to overcome segmentation uncertainty caused by accidental hand motions or trembling. When the same dynamic gesture is performed, hand shapes and speeds vary between users, and often between repetitions by the same user. A machine-learning-based Gesture Recognition Framework (ML-GRF) for recognizing the beginning and end of a gesture sequence in a continuous stream of data is suggested to solve the problem of distinguishing between meaningful dynamic gestures and scattered generation. We recommend a similarity matching-based gesture classification approach to reduce the overall computing cost associated with identifying actions, and we show how an efficient feature extraction method can be used to reduce the thousands of data points in a single gesture to four binary digit gesture codes. The findings from the simulation support the accuracy, precision, gesture recognition, sensitivity, and efficiency rates. The ML-GRF had an accuracy rate of 98.97%, a precision rate of 97.65%, a gesture recognition rate of 98.04%, a sensitivity rate of 96.99%, and an efficiency rate of 95.12%.
In this article, to reduce the complexity and improve the generalization ability of current gesture recognition systems, we propose a novel SE-CNN attention architecture for sEMG-based hand gesture recognition. The proposed algorithm introduces a temporal squeeze-and-excite block into a simple CNN architecture and then utilizes it to recalibrate the weights of the feature outputs from the convolutional layer. By enhancing important features while suppressing useless ones, the model realizes gesture recognition efficiently. The last procedure of the proposed algorithm is utilizing a simple attention mechanism to enhance the learned representations of sEMG signals to perform multi-channel sEMG-based gesture recognition tasks. To evaluate the effectiveness and accuracy of the proposed algorithm, we conduct experiments involving multi-gesture datasets Ninapro DB4 and Ninapro DB5 for both inter-session validation and subject-wise cross-validation. After a series of comparisons with the previous models, the proposed algorithm effectively increases the robustness with improved gesture recognition performance and generalization ability.
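To make the recalibration step in the abstract above concrete, here is a minimal sketch of a generic squeeze-and-excite block applied to a channels-by-time feature map. It is an illustration under stated assumptions, not the authors' implementation: squeezing over the time axis is one reading of "temporal squeeze-and-excite", biases are omitted, and the weights `w1` and `w2` are hypothetical fixed values rather than learned parameters.

```python
import math


def squeeze_excite(x, w1, w2):
    """Generic squeeze-and-excite over a feature map.

    x:  channels x time (list of lists)
    w1: hidden x channels weights (hypothetical, biases omitted)
    w2: channels x hidden weights (hypothetical, biases omitted)
    Returns the recalibrated feature map and the per-channel gates.
    """
    # Squeeze: average each channel over the time axis.
    z = [sum(ch) / len(ch) for ch in x]
    # Excite: bottleneck layer with ReLU, then one sigmoid gate per channel.
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h)))) for row in w2]
    # Recalibrate: scale every time step of a channel by that channel's gate.
    y = [[g * v for v in ch] for g, ch in zip(s, x)]
    return y, s


x = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]]  # 3 channels, 2 time steps
w1 = [[0.5, 0.5, 0.5]]                    # 3 channels -> 1 hidden unit
w2 = [[1.0], [0.0], [-1.0]]               # 1 hidden unit -> 3 channel gates
y, gates = squeeze_excite(x, w1, w2)
print(gates)
```

Each gate lies strictly between 0 and 1, so channels with large gates are passed through nearly unchanged while channels with small gates are suppressed, which is the "enhance important features, suppress useless ones" behavior the abstract describes.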
Appearance-based dynamic Hand Gesture Recognition (HGR) remains a prominent area of research in Human-Computer Interaction (HCI). Numerous environmental and computational constraints limit its real-time deployment. In addition, the performance of a model decreases as the subject's distance from the camera increases. This study proposes a 3D separable Convolutional Neural Network (CNN), considering the model's computational complexity and recognition accuracy. The 20BN-Jester dataset was used to train the model for six gesture classes. After achieving the best offline recognition accuracy of 94.39%, the model was deployed in real-time while considering the subject's attention, the instant of performing a gesture, and the subject's distance from the camera. Despite being discussed in numerous research articles, the distance factor remains unresolved in real-time deployment, which leads to degraded recognition results. In the proposed approach, the distance calculation substantially improves the classification performance by reducing the impact of the subject's distance from the camera. Additionally, the capability of feature extraction, degree of relevance, and statistical significance of the proposed model against other state-of-the-art models were validated using t-distributed Stochastic Neighbor Embedding (t-SNE), the Matthews Correlation Coefficient (MCC), and the McNemar test, respectively. We observed that the proposed model exhibits state-of-the-art outcomes and a comparatively high significance level.
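Two of the validation measures named in the abstract above have compact closed forms, sketched below. This is a simplified illustration: the binary form of the Matthews Correlation Coefficient is shown for brevity although the study uses six classes, the McNemar statistic is given without continuity correction, and all counts in the usage lines are hypothetical.

```python
import math


def mcc(tp, tn, fp, fn):
    """Binary Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Conventionally reported as 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def mcnemar_statistic(b, c):
    """McNemar chi-square statistic without continuity correction.

    b: samples only model A classified correctly
    c: samples only model B classified correctly
    """
    return (b - c) ** 2 / (b + c) if (b + c) else 0.0


# Hypothetical counts, for illustration only.
print(mcc(tp=40, tn=45, fp=5, fn=10))
print(mcnemar_statistic(b=15, c=5))
```

The McNemar statistic is compared against a chi-square distribution with one degree of freedom; only the discordant pairs (samples on which the two models disagree) enter the calculation.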
Surface electromyography (sEMG) is one of the basic processing techniques for gesture recognition because of its inherent advantages of easy collection and non-invasiveness. However, limited by feature extraction and classifier selection, the adaptability and accuracy of conventional machine learning still need improvement as the input dimension and the number of output classes increase. Moreover, due to the different characteristics of sEMG data and image data, the conventional convolutional neural network (CNN) has yet to fit sEMG signals well. In this paper, a novel hybrid model combining CNN with the graph convolutional network (GCN) was constructed to improve the performance of gesture recognition. Based on the characteristics of the sEMG signal, GCN was introduced into the model through a joint voting network to extract the muscle synergy feature of the sEMG signal. This strategy optimizes the structure and convolution kernel parameters of the residual network (ResNet), with classification accuracy on NinaPro DB1 of up to 90.07%. The experimental results and comparisons confirm the superiority of the proposed hybrid model for gesture recognition from sEMG signals.
Gesture recognition technology enables machines to read human gestures and has significant application prospects in the fields of human-computer interaction and sign language translation. Existing research usually uses convolutional neural networks to extract features directly from raw gesture data for gesture recognition, but the networks are affected by much interference information in the input data and thus fit to some unimportant features. In this paper, we propose a novel method for encoding spatio-temporal information, which can enhance the key features required for gesture recognition, such as shape, structure, contour, position, and hand motion of gestures, thereby improving the accuracy of gesture recognition. This encoding method can encode an arbitrary number of frames of gesture data into a single frame of the spatio-temporal feature map and use the spatio-temporal feature map as the input to the neural network. This can guide the model to fit important features while avoiding the use of complex recurrent network structures to extract temporal features. In addition, we designed two sub-networks and trained the model using a sub-network pre-training strategy that trains the sub-networks first and then the entire network, so as to avoid the sub-networks focusing too much on the information of a single category feature and being overly influenced by each other's features. Experimental results on two public gesture datasets show that the proposed spatio-temporal information encoding method achieves advanced accuracy.
Sign language recognition can be treated as one of the efficient solutions for disabled people to communicate with others. It helps them to convey the required data by the use of sign language with no issues. The latest developments in computer vision and image processing techniques can be accurately utilized for the sign recognition process by disabled people. American Sign Language (ASL) detection is challenging because of increased intraclass similarity and higher complexity. This article develops a new Bayesian Optimization with Deep Learning-Driven Hand Gesture Recognition Based Sign Language Communication (BODL-HGRSLC) technique for disabled people. The BODL-HGRSLC technique aims to recognize hand gestures for disabled people's communication. The presented BODL-HGRSLC technique integrates the concepts of computer vision (CV) and DL models. In the presented BODL-HGRSLC technique, a deep convolutional neural network-based residual network (ResNet) model is applied for feature extraction. Besides, the presented BODL-HGRSLC model uses Bayesian optimization for the hyperparameter tuning process. At last, a bidirectional gated recurrent unit (BiGRU) model is exploited for the HGR procedure. A wide range of experiments was conducted to demonstrate the enhanced performance of the presented BODL-HGRSLC model. The comprehensive comparison study reported the improvements of the BODL-HGRSLC model over other DL models with maximum accuracy of 99.75%.
Hand Gesture Recognition (HGR) is a promising research area with an extensive range of applications, such as surgery, video game techniques, and sign language translation, where sign language is a complicated structured form of hand gestures. The fundamental building blocks of structured expressions in sign language are the arrangement of the fingers, the orientation of the hand, and the hand's position relative to the body. The importance of HGR has increased due to the increasing number of touchless applications and the rapid growth of the hearing-impaired population. Therefore, real-time HGR is one of the most effective interaction methods between computers and humans. Developing a user-free interface with good recognition performance should be the goal of real-time HGR systems. Nowadays, Convolutional Neural Networks (CNNs) show great recognition rates for different image-level classification tasks. It is challenging to train deep CNN networks like VGG-16, VGG-19, Inception-v3, and EfficientNet-B0 from scratch because only a few sizable labeled image datasets are available for static hand gesture images. However, an efficient and robust hand gesture recognition system for sign language employing fine-tuned Inception-v3 and EfficientNet-B0 networks is proposed to identify hand gestures using a comparatively small HGR dataset. Experiments show that Inception-v3 achieved 90% accuracy and 0.93% precision, 0.91% recall, and 0.90% F1-score, respectively, while EfficientNet-B0 achieved 99% accuracy and 0.98%, 0.97%, and 0.98% precision, recall, and F1-score, respectively.
Hand gesture recognition (HGR) is used in numerous applications, including medical health care, industrial purposes, and sports detection. We have developed a real-time hand gesture recognition system using inertial sensors for the smart home application. Such a model benefits the health care field, particularly elderly or disabled users. Home automation has also been proven to be a tremendous benefit for the elderly and disabled. Residents are admitted to smart homes for comfort, luxury, improved quality of life, and protection against intrusion and burglars. This paper proposes a novel system that uses principal component analysis, linear discriminant analysis feature extraction, and random forest as a classifier to improve HGR accuracy. We have achieved an accuracy of 94% over the publicly benchmarked HGR dataset. The proposed system can be used to detect hand gestures in the healthcare industry as well as in the industrial and educational sectors.
Hand gestures are a natural way for human-robot interaction. Vision-based dynamic hand gesture recognition has become a hot research topic due to its various applications. This paper presents a novel deep learning network for hand gesture recognition. The network integrates several well-proven modules to learn both short-term and long-term features from video inputs while avoiding intensive computation. To learn short-term features, each video input is segmented into a fixed number of frame groups. A frame is randomly selected from each group and represented as an RGB image as well as an optical flow snapshot. These two entities are fused and fed into a convolutional neural network (ConvNet) for feature extraction. The ConvNets for all groups share parameters. To learn long-term features, outputs from all ConvNets are fed into a long short-term memory (LSTM) network, by which a final classification result is predicted. The new model has been tested with two popular hand gesture datasets, namely the Jester dataset and the Nvidia dataset. Compared with other models, our model produced very competitive results. The robustness of the new model has also been demonstrated with an augmented dataset with enhanced diversity of hand gestures.
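The short-term sampling step described in the abstract above (split a video into a fixed number of frame groups, then draw one random frame per group) can be sketched as follows. The function name, the use of contiguous and near-equal groups, and the example sizes are assumptions for illustration; the abstract does not specify how group boundaries are chosen.

```python
import random


def sample_frame_indices(num_frames, num_groups, rng=random):
    """Split frame indices 0..num_frames-1 into num_groups contiguous groups
    of near-equal size and draw one random index from each group."""
    if not 0 < num_groups <= num_frames:
        raise ValueError("need 0 < num_groups <= num_frames")
    # Group i covers the half-open range [bounds[i], bounds[i + 1]).
    bounds = [i * num_frames // num_groups for i in range(num_groups + 1)]
    return [rng.randrange(lo, hi) for lo, hi in zip(bounds, bounds[1:])]


# Hypothetical clip of 37 frames sampled into 8 groups with a seeded generator.
indices = sample_frame_indices(num_frames=37, num_groups=8, rng=random.Random(0))
print(indices)
```

Because exactly one frame is drawn per group, the number of ConvNet forward passes stays fixed regardless of clip length, while the selected frames still span the whole gesture.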
Hand gesture recognition is a popular topic in computer vision and makes human-computer interaction more flexible and convenient. The representation of hand gestures is critical for recognition. In this paper, we propose a new method to measure the similarity between hand gestures and exploit it for hand gesture recognition. The depth maps of hand gestures captured via the Kinect sensors are used in our method, where the 3D hand shapes can be segmented from the cluttered backgrounds. To extract the pattern of salient 3D shape features, we propose a new descriptor, 3D Shape Context, for 3D hand gesture representation. The 3D Shape Context information of each 3D point is obtained in multiple scales because both local shape context and global shape distribution are necessary for recognition. The description of all the 3D points constructs the hand gesture representation, and hand gesture recognition is explored via the dynamic time warping algorithm. Extensive experiments are conducted on multiple benchmark datasets. The experimental results verify that the proposed method is robust to noise, articulated variations, and rigid transformations. Our method outperforms state-of-the-art methods in the comparisons of accuracy and efficiency.
In this study, we developed a system based on deep space–time neural networks for gesture recognition. When users change or the number of gesture categories increases, the accuracy of gesture recognition decreases considerably because most gesture recognition systems cannot accommodate both user differentiation and gesture diversity. To overcome the limitations of existing methods, we designed a one-dimensional parallel long short-term memory–fully convolutional network (LSTM–FCN) model to extract gesture features of different dimensions. LSTM can learn complex time dynamic information, whereas FCN can predict gestures efficiently by extracting the deep, abstract features of gestures in the spatial dimension. In the experiment, 50 types of gestures of five users were collected and evaluated. The experimental results demonstrate the effectiveness of this system and robustness to various gestures and individual changes. Statistical analysis of the recognition results indicated that an average accuracy of approximately 98.9% was achieved.
Device-free gesture recognition is an emerging wireless sensing technique that recognizes gestures by analyzing their influence on surrounding wireless signals; it may empower wireless networks with augmented sensing ability. Researchers have made great achievements in single-person device-free gesture recognition. However, when multiple persons conduct gestures simultaneously, the received signals are mixed together, and thus traditional methods no longer work well. Moreover, the anonymity of persons and changes in the surrounding environment cause feature shift and mismatch, and thus the recognition accuracy degrades remarkably. To address these problems, we explore and exploit the diversity of spatial information and propose a multidimensional analysis method to separate the gesture feature of each person using a focusing sensing strategy. Meanwhile, we also present a deep-learning-based robust device-free gesture recognition framework, which leverages an adversarial approach to extract robust gesture features that are insensitive to changes of persons and environment. Furthermore, we also develop a 77 GHz mmWave prototype system and evaluate the proposed methods extensively. Experimental results reveal that the proposed system can achieve average accuracies of 93% and 84% when 10 gestures are conducted in different environments by two and four persons simultaneously, respectively. (Received: Jun. 18, 2020; Revised: Aug. 06, 2020; Editor: Ning Ge.)
Aiming at the diversity of hand gesture traces by different people, the article presents a novel method called cluster dynamic time warping (CDTW), which is based on the main axis classification and sample clustering of individuals. This method performs well in reducing the complexity of recognition and is robust across individuals. Data acquisition is implemented on a triaxial accelerometer with a 100 Hz sampling frequency. A database of 2400 traces was created by ten subjects for system testing and evaluation. The overall accuracy was found to be 98.84% for user-independent gesture recognition and 96.7% for user-dependent gesture recognition, higher than the dynamic time warping (DTW), derivative DTW (DDTW), and piecewise DTW (PDTW) methods. The computation cost of CDTW in this project has been reduced 11,520 times compared with DTW.
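For reference, the baseline that the abstract above compares against, plain dynamic time warping, can be sketched in a few lines for sequences of three-axis accelerometer samples. This is the classic quadratic-time DTW, not the CDTW method (its main axis classification and clustering steps are not detailed in the abstract); the Euclidean local distance and the sample traces are assumptions for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic DTW between two sequences of 3-axis samples.

    Runs in O(len(a) * len(b)) time; the local cost is the Euclidean
    distance between samples (an assumed choice).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Hypothetical traces: the same motion performed at two different speeds.
fast = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
slow = [(0, 0, 0), (1, 0, 0), (1, 0, 0), (2, 0, 0)]
print(dtw_distance(fast, slow))
```

The quadratic table is what makes plain DTW expensive when every test trace is compared with every template, which is the cost that clustering-based variants aim to cut.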
This paper introduces a human gesture recognition algorithm using an impulse radio ultra-wideband (IR-UWB) radar sensor. Human gesture recognition has been one of the hottest research topics for quite a long time. Many gesture recognition algorithms and systems using other sensors, such as cameras and RFID tags, have been proposed. Among these, gesture recognition systems using cameras have been extensively studied in past years and widely used in practice. However, they show deficiencies in some cases. For example, users might not like to be filmed by cameras out of concern for their privacy, and cameras might not work well in very dark environments. RFID tags, in turn, can be inconvenient for many people and are likely to be lost. Our gesture recognition algorithm uses an IR-UWB radar sensor, which has high ranging resolution and an adjustable gesture recognition range, and does not suffer from privacy or darkness problems. In this paper, the gesture recognition algorithm is based on the moving direction and distance change of the human hand and the change of the frontal surface area of the hand towards the radar sensor. By combining these changes during a gesture, the algorithm can recognize six basic kinds of hand gestures. The experimental results show that these gestures are recognized with quite good performance. The performance analysis from experiments is also given.
A hand gesture recognition method is presented for human-computer interaction, which is based on fingertip localization. First, the hand gesture is segmented from the background based on skin color characteristics. Second, feature vectors are selected at equal intervals on the boundary of the gesture, and then the gestures' length normalization is accomplished. Third, the fingertip positions are determined by the feature vectors' parameters, and the angles of the feature vectors are normalized. Finally, the gestures are classified by a support vector machine. The experimental results demonstrate that the proposed method can recognize 9 gestures with an accuracy of 94.1%.
This paper presents a method for hand gesture recognition based on 3D point clouds. Digital image processing technology is used in this research. Based on the 3D points from a depth camera, the system first extracts raw data of the hand. After data segmentation and preprocessing, three kinds of appearance features are extracted: the number of stretched fingers, the angles between fingers, and the gesture region's area distribution feature. Based on these features, the system identifies the gestures using a decision tree method. The experimental results demonstrate that the proposed method recognizes common gestures efficiently and with high accuracy.
In recent years, gesture recognition has been widely used in the fields of intelligent driving, virtual reality, and human-computer interaction. With the development of artificial intelligence, deep learning has achieved remarkable success in computer vision. To help researchers better understand the development status of gesture recognition in video, this article provides a detailed survey of the latest developments in gesture recognition technology for videos based on deep learning. The reviewed methods are broadly categorized into three groups based on the type of neural networks used for recognition: two-stream convolutional neural networks, 3D convolutional neural networks, and Long Short-Term Memory (LSTM) networks. In this review, we discuss the advantages and limitations of existing technologies, focusing on the feature extraction method of the spatiotemporal structure information in a video sequence, and consider future research directions.
Recognition of dynamic hand gestures in real-time is a difficult task because the system can never know when or from where the gesture starts and ends in a video stream. Many researchers have been working on vision-based gesture recognition due to its various applications. This paper proposes a deep learning architecture based on the combination of a 3D Convolutional Neural Network (3D-CNN) and a Long Short-Term Memory (LSTM) network. The proposed architecture extracts spatial-temporal information from video sequence input while avoiding extensive computation. The 3D-CNN is used for the extraction of spectral and spatial features, which are then given to the LSTM network through which classification is carried out. The proposed model is a light-weight architecture with only 3.7 million training parameters. The model has been evaluated on 15 classes from the publicly available 20BN-jester dataset. The model was trained on 2000 video clips per class, which were separated into 80% training and 20% validation sets. An accuracy of 99% and 97% was achieved on training and testing data, respectively. We further show that the combination of 3D-CNN with LSTM gives superior results as compared to MobileNetv2+LSTM.
Due to the function of gestures to convey information, gesture recognition plays a more and more important part in human-computer interaction. Traditional methods to recognize gestures are mostly device-based, which means users need to contact the devices. To overcome the inconvenience of the device-based methods, studies on device-free gesture recognition have been conducted. However, computer vision methods bring privacy issues and light interference problems. Therefore, we turn to wireless technology. In this paper, we propose a device-free in-air gesture recognition method based on a radio frequency identification (RFID) tag array. By capturing the signals reflected by gestures, we can extract the gesture features. For dynamic gestures, both temporal and spatial features need to be considered. For static gestures, the spatial feature is the key, for which a neural network is adopted to recognize the gestures. Experiments show that the accuracy of dynamic gesture recognition on the test set is 92.17%, while the accuracy of static ones is 91.67%.
Funding: Funded by the National Key Research and Development Program of China (2017YFB1303200); NSFC (81871444, 62071241, 62075098, and 62001240); Leading-Edge Technology and Basic Research Program of Jiangsu (BK20192004D); Jiangsu Graduate Scientific Research Innovation Programme (KYCX20_1391, KYCX21_1557).
Abstract: In this article, to reduce the complexity and improve the generalization ability of current gesture recognition systems, we propose a novel SE-CNN attention architecture for sEMG-based hand gesture recognition. The proposed algorithm introduces a temporal squeeze-and-excite block into a simple CNN architecture and then utilizes it to recalibrate the weights of the feature outputs from the convolutional layer. By enhancing important features while suppressing useless ones, the model realizes gesture recognition efficiently. The last step of the proposed algorithm uses a simple attention mechanism to enhance the learned representations of sEMG signals to perform multi-channel sEMG-based gesture recognition tasks. To evaluate the effectiveness and accuracy of the proposed algorithm, we conduct experiments on the multi-gesture datasets Ninapro DB4 and Ninapro DB5 for both inter-session validation and subject-wise cross-validation. After a series of comparisons with previous models, the proposed algorithm effectively increases robustness, with improved gesture recognition performance and generalization ability.
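The recalibration step described in this abstract can be illustrated with a minimal sketch of a generic squeeze-and-excite block applied over the time axis. This is not the authors' SE-CNN: the function name `squeeze_excite`, the weight matrices `w1` and `w2`, and the plain-list data layout are all assumptions made for illustration.

```python
import math


def squeeze_excite(features, w1, w2):
    """Generic squeeze-and-excite sketch (hypothetical helper, not the paper's code).

    features: list of channels, each a list of samples over time.
    w1: reduction weights, shape (hidden, channels).
    w2: expansion weights, shape (channels, hidden).
    Returns the recalibrated features and the per-channel gates.
    """
    # Squeeze: summarize each channel by its mean over time.
    squeezed = [sum(channel) / len(channel) for channel in features]

    # Excite, step 1: fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]

    # Excite, step 2: fully connected layer followed by a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]

    # Recalibrate: scale every sample of a channel by that channel's gate.
    recalibrated = [
        [gate * x for x in channel]
        for gate, channel in zip(gates, features)
    ]
    return recalibrated, gates


if __name__ == "__main__":
    feats = [[1.0, 3.0], [0.0, 0.0]]   # two channels, two time steps
    out, gates = squeeze_excite(feats, w1=[[1.0, -1.0]], w2=[[2.0], [-2.0]])
    print(gates)  # first channel is emphasized, second is suppressed
```

In a trained network the two weight matrices would be learned, so that informative channels receive gates near 1 and uninformative ones gates near 0.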
Abstract: Appearance-based dynamic Hand Gesture Recognition (HGR) remains a prominent area of research in Human-Computer Interaction (HCI). Numerous environmental and computational constraints limit its real-time deployment. In addition, the performance of a model decreases as the subject’s distance from the camera increases. This study proposes a 3D separable Convolutional Neural Network (CNN), considering the model’s computational complexity and recognition accuracy. The 20BN-Jester dataset was used to train the model for six gesture classes. After achieving the best offline recognition accuracy of 94.39%, the model was deployed in real-time while considering the subject’s attention, the instant of performing a gesture, and the subject’s distance from the camera. Despite being discussed in numerous research articles, the distance factor remains unresolved in real-time deployment, which leads to degraded recognition results. In the proposed approach, the distance calculation substantially improves the classification performance by reducing the impact of the subject’s distance from the camera. Additionally, the capability of feature extraction, degree of relevance, and statistical significance of the proposed model against other state-of-the-art models were validated using t-distributed Stochastic Neighbor Embedding (t-SNE), the Matthews Correlation Coefficient (MCC), and the McNemar test, respectively. We observed that the proposed model exhibits state-of-the-art outcomes and a comparatively high significance level.
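As a rough illustration of two of the validation measures named in this abstract, the sketch below computes the Matthews Correlation Coefficient for the binary case and the McNemar chi-square statistic with continuity correction. The study itself covers six classes and does not state which variants it used, so the binary form, the continuity correction, and both function names are assumptions.

```python
import math


def matthews_corrcoef_binary(tp, tn, fp, fn):
    """Binary MCC from confusion-matrix counts (hypothetical helper).

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when the denominator is zero, a common convention.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def mcnemar_statistic(b, c):
    """McNemar chi-square statistic with continuity correction.

    b: samples model A classified correctly and model B did not.
    c: samples model B classified correctly and model A did not.
    """
    if b + c == 0:
        return 0.0
    return (abs(b - c) - 1) ** 2 / (b + c)


if __name__ == "__main__":
    print(matthews_corrcoef_binary(tp=45, tn=40, fp=5, fn=10))
    print(mcnemar_statistic(b=10, c=5))
```

The McNemar statistic is compared against a chi-square distribution with one degree of freedom to decide whether two classifiers differ significantly on the same test set.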
Funding: Supported by the Development of Sleep Disordered Breathing Detection and Auxiliary Regulation System Project (No. 2019I1009).
Abstract: Surface electromyography (sEMG) is one of the basic techniques for gesture recognition because of its inherent advantages of easy collection and non-invasiveness. However, limited by feature extraction and classifier selection, the adaptability and accuracy of conventional machine learning still need to improve as the input dimension and the number of output classes increase. Moreover, due to the different characteristics of sEMG data and image data, conventional convolutional neural networks (CNNs) have yet to fit sEMG signals well. In this paper, a novel hybrid model combining a CNN with a graph convolutional network (GCN) was constructed to improve the performance of gesture recognition. Based on the characteristics of the sEMG signal, the GCN was introduced into the model through a joint voting network to extract the muscle synergy feature of the sEMG signal. This strategy optimizes the structure and convolution kernel parameters of the residual network (ResNet), with the classification accuracy on NinaPro DB1 reaching 90.07%. The experimental results and comparisons confirm the superiority of the proposed hybrid model for gesture recognition from sEMG signals.
Funding: This work was supported, in part, by the National Natural Science Foundation of China under grant number 62272236; in part, by the Natural Science Foundation of Jiangsu Province under grant numbers BK20201136 and BK20191401; and in part, by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: Gesture recognition technology enables machines to read human gestures and has significant application prospects in the fields of human-computer interaction and sign language translation. Existing studies usually use convolutional neural networks to extract features directly from raw gesture data for gesture recognition, but the networks are affected by interference in the input data and thus fit some unimportant features. In this paper, we propose a novel method for encoding spatio-temporal information, which can enhance the key features required for gesture recognition, such as the shape, structure, contour, position, and hand motion of gestures, thereby improving the accuracy of gesture recognition. This encoding method can encode an arbitrary number of frames of gesture data into a single spatio-temporal feature map, which is then used as the input to the neural network. This guides the model to fit important features while avoiding the use of complex recurrent network structures to extract temporal features. In addition, we designed two sub-networks and trained the model using a sub-network pre-training strategy that trains the sub-networks first and then the entire network, so as to avoid the sub-networks focusing too much on the information of a single category of features and being overly influenced by each other’s features. Experimental results on two public gesture datasets show that the proposed spatio-temporal information encoding method achieves advanced accuracy.
Funding: The authors extend their appreciation to the King Salman centre for Disability Research for funding this work through Research Group no KSRG-2022-017.
Abstract: Sign language recognition can be treated as one of the efficient solutions for disabled people to communicate with others. It helps them convey the required information through sign language without difficulty. The latest developments in computer vision and image processing techniques can be accurately utilized for the sign recognition process by disabled people. American Sign Language (ASL) detection is challenging because of its high intraclass similarity and complexity. This article develops a new Bayesian Optimization with Deep Learning-Driven Hand Gesture Recognition Based Sign Language Communication (BODL-HGRSLC) technique for disabled people. The BODL-HGRSLC technique aims to recognize hand gestures for disabled people’s communication. The presented BODL-HGRSLC technique integrates the concepts of computer vision (CV) and DL models. In the presented BODL-HGRSLC technique, a deep convolutional neural network-based residual network (ResNet) model is applied for feature extraction. Besides, the presented BODL-HGRSLC model uses Bayesian optimization for the hyperparameter tuning process. Finally, a bidirectional gated recurrent unit (BiGRU) model is used for the HGR procedure. A wide range of experiments was conducted to demonstrate the enhanced performance of the presented BODL-HGRSLC model. The comprehensive comparison study reported the improvements of the BODL-HGRSLC model over other DL models, with a maximum accuracy of 99.75%.
Funding: This research work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (NRF-2022R1A2C1004657).
Abstract: Hand Gesture Recognition (HGR) is a promising research area with an extensive range of applications, such as surgery, video game techniques, and sign language translation, where sign language is a complicated structured form of hand gestures. The fundamental building blocks of structured expressions in sign language are the arrangement of the fingers, the orientation of the hand, and the hand’s position relative to the body. The importance of HGR has increased due to the increasing number of touchless applications and the rapid growth of the hearing-impaired population. Therefore, real-time HGR is one of the most effective interaction methods between computers and humans. Developing a user-free interface with good recognition performance should be the goal of real-time HGR systems. Nowadays, Convolutional Neural Networks (CNNs) show great recognition rates for different image-level classification tasks. It is challenging to train deep CNN networks like VGG-16, VGG-19, Inception-v3, and EfficientNet-B0 from scratch because only a few sizable labeled image datasets are available for static hand gesture images. To address this, an efficient and robust hand gesture recognition system for sign language employing fine-tuned Inception-v3 and EfficientNet-B0 networks is proposed to identify hand gestures using a comparatively small HGR dataset. Experiments show that Inception-v3 achieved 90% accuracy and 0.93% precision, 0.91% recall, and 0.90% f1-score, respectively, while EfficientNet-B0 achieved 99% accuracy and 0.98%, 0.97%, and 0.98% precision, recall, and f1-score, respectively.
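For readers unfamiliar with the metrics reported in this abstract, the following minimal sketch shows how precision, recall, and F1-score are conventionally computed for one class from predicted and true labels. The function name `precision_recall_f1` and the toy labels are illustrative assumptions; the abstract does not say how the authors averaged the metrics across gesture classes.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Per-class precision, recall, and F1 (hypothetical helper).

    y_true, y_pred: equal-length sequences of class labels.
    positive: the label treated as the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


if __name__ == "__main__":
    truth = ["fist", "fist", "palm", "palm"]
    preds = ["fist", "palm", "palm", "palm"]
    print(precision_recall_f1(truth, preds, positive="fist"))
```

Each of the three values lies between 0 and 1; F1 is the harmonic mean of precision and recall, so it is high only when both are high.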
Funding: Supported by a grant (2021R1F1A1063634) of the Basic Science Research Program through the National Research Foundation (NRF) funded by the Ministry of Education, Republic of Korea.
Abstract: Hand gesture recognition (HGR) is used in numerous applications, including medical healthcare, industrial purposes, and sports detection. We have developed a real-time hand gesture recognition system using inertial sensors for the smart home application. Developing such a model benefits the medical health field (the elderly or disabled). Home automation has also been proven to be a tremendous benefit for the elderly and disabled. Residents are admitted to smart homes for comfort, luxury, improved quality of life, and protection against intrusion and burglars. This paper proposes a novel system that uses principal component analysis, linear discriminant analysis feature extraction, and random forest as a classifier to improve HGR accuracy. We have achieved an accuracy of 94% over the publicly benchmarked HGR dataset. The proposed system can be used to detect hand gestures in the healthcare industry as well as in the industrial and educational sectors.
Abstract: Hand gestures are a natural way for human-robot interaction. Vision-based dynamic hand gesture recognition has become a hot research topic due to its various applications. This paper presents a novel deep learning network for hand gesture recognition. The network integrates several well-proven modules to learn both short-term and long-term features from video inputs while avoiding intensive computation. To learn short-term features, each video input is segmented into a fixed number of frame groups. A frame is randomly selected from each group and represented as an RGB image as well as an optical flow snapshot. These two entities are fused and fed into a convolutional neural network (ConvNet) for feature extraction. The ConvNets for all groups share parameters. To learn long-term features, outputs from all ConvNets are fed into a long short-term memory (LSTM) network, by which a final classification result is predicted. The new model has been tested with two popular hand gesture datasets, namely the Jester dataset and the Nvidia dataset. Compared with other models, our model produced very competitive results. The robustness of the new model has also been demonstrated with an augmented dataset with enhanced diversity of hand gestures.
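The frame-sampling step described in this abstract (split a clip into a fixed number of groups, then pick one random frame per group) can be sketched as follows. The function name `sample_frame_indices`, the near-equal contiguous grouping, and the example sizes are assumptions; the abstract does not specify how group boundaries are chosen.

```python
import random


def sample_frame_indices(num_frames, num_groups, rng=None):
    """Pick one random frame index from each of num_groups contiguous groups.

    Hypothetical helper: groups are near-equal contiguous slices of the clip.
    """
    if num_groups <= 0 or num_frames < num_groups:
        raise ValueError("need at least one frame per group")
    rng = rng or random.Random()

    picks = []
    for g in range(num_groups):
        start = g * num_frames // num_groups        # first index of group g
        end = (g + 1) * num_frames // num_groups    # one past the last index
        picks.append(rng.randrange(start, end))
    return picks


if __name__ == "__main__":
    # A 37-frame clip reduced to 8 representative frames.
    print(sample_frame_indices(37, 8, random.Random(0)))
```

Sampling one frame per group keeps the temporal order of the clip while bounding the number of ConvNet forward passes, which is how this kind of scheme avoids intensive computation.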
Funding: Supported by the National Natural Science Foundation of China (61773272, 61976191); the Six Talent Peaks Project of Jiangsu Province, China (XYDXX-053); Suzhou Research Project of Technical Innovation, Jiangsu, China (SYG201711).
Abstract: Hand gesture recognition is a popular topic in computer vision and makes human-computer interaction more flexible and convenient. The representation of hand gestures is critical for recognition. In this paper, we propose a new method to measure the similarity between hand gestures and exploit it for hand gesture recognition. The depth maps of hand gestures captured via Kinect sensors are used in our method, where the 3D hand shapes can be segmented from cluttered backgrounds. To extract the pattern of salient 3D shape features, we propose a new descriptor, 3D Shape Context, for 3D hand gesture representation. The 3D Shape Context information of each 3D point is obtained at multiple scales because both local shape context and global shape distribution are necessary for recognition. The descriptions of all the 3D points together form the hand gesture representation, and hand gesture recognition is performed via the dynamic time warping algorithm. Extensive experiments are conducted on multiple benchmark datasets. The experimental results verify that the proposed method is robust to noise, articulated variations, and rigid transformations. Our method outperforms state-of-the-art methods in comparisons of accuracy and efficiency.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 61461013; in part by the Natural Science Foundation of Guangxi Province under Grant 2018GXNSFAA281179; and in part by the Dean Project of Guangxi Key Laboratory of Wireless Broadband Communication and Signal Processing under Grant GXKL06160103.
Abstract: In this study, we developed a system based on deep space–time neural networks for gesture recognition. When users change or the number of gesture categories increases, the accuracy of gesture recognition decreases considerably because most gesture recognition systems cannot accommodate both user differentiation and gesture diversity. To overcome the limitations of existing methods, we designed a one-dimensional parallel long short-term memory–fully convolutional network (LSTM–FCN) model to extract gesture features of different dimensions. LSTM can learn complex temporal dynamics, whereas FCN can predict gestures efficiently by extracting the deep, abstract features of gestures in the spatial dimension. In the experiment, 50 types of gestures from five users were collected and evaluated. The experimental results demonstrate the effectiveness of this system and its robustness to various gestures and individual changes. Statistical analysis of the recognition results indicated that an average accuracy of approximately 98.9% was achieved.
Funding: This work was supported by the National Natural Science Foundation of China under grants U1933104 and 62071081; LiaoNing Revitalization Talents Program under grant XLYC1807019; Liaoning Province Natural Science Foundation under grant 2019-MS-058; Dalian Science and Technology Innovation Foundation under grant 2018J12GX044; Fundamental Research Funds for the Central Universities under grants DUT20LAB113 and DUT20JC07; and the Cooperative Scientific Research Project of Chunhui Plan of Ministry of Education.
Abstract: Device-free gesture recognition is an emerging wireless sensing technique that recognizes gestures by analyzing their influence on surrounding wireless signals; it may empower wireless networks with augmented sensing ability. Researchers have made great achievements in single-person device-free gesture recognition. However, when multiple persons perform gestures simultaneously, the received signals are mixed together, and thus traditional methods no longer work well. Moreover, the anonymity of persons and changes in the surrounding environment cause feature shift and mismatch, and thus the recognition accuracy degrades remarkably. To address these problems, we explore and exploit the diversity of spatial information and propose a multidimensional analysis method to separate the gesture features of each person using a focusing sensing strategy. Meanwhile, we also present a deep-learning-based robust device-free gesture recognition framework, which leverages an adversarial approach to extract robust gesture features that are insensitive to changes of persons and environment. Furthermore, we also develop a 77 GHz mmWave prototype system and evaluate the proposed methods extensively. Experimental results reveal that the proposed system can achieve average accuracies of 93% and 84% when 10 gestures are performed in different environments by two and four persons simultaneously, respectively. (Received: Jun. 18, 2020; Revised: Aug. 06, 2020; Editor: Ning Ge)
Funding: National Key R&D Program of China (No. 2016YFB1001401)
Abstract: Aiming at the diversity of hand gesture traces by different people, the article presents a novel method called cluster dynamic time warping (CDTW), which is based on the main axis classification and sample clustering of individuals. This method shows good performance in reducing the complexity of recognition and strong robustness across individuals. Data acquisition is implemented on a triaxial accelerometer with a 100 Hz sampling frequency. A database of 2400 traces was created by ten subjects for system testing and evaluation. The overall accuracy was found to be 98.84% for user-independent gesture recognition and 96.7% for user-dependent gesture recognition, higher than the dynamic time warping (DTW), derivative DTW (DDTW), and piecewise DTW (PDTW) methods. The computation cost of CDTW in this project was reduced by a factor of 11,520 compared with DTW.
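The baseline that CDTW is compared against, plain dynamic time warping, can be sketched in a few lines. This is the textbook DTW recurrence on a single axis with an absolute-difference cost, not the CDTW method of the article; the function name `dtw_distance` and the sample traces are assumptions for illustration.

```python
def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D sequences (hypothetical helper).

    Uses absolute difference as the local cost and allows the usual three
    moves: match, insertion, and deletion.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] repeats against b[j-1]
                cost[i][j - 1],      # b[j-1] repeats against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


if __name__ == "__main__":
    template = [1, 2, 3]
    slower_trace = [1, 2, 2, 3]   # same gesture performed more slowly
    print(dtw_distance(template, slower_trace))
```

Because the table has one cell per pair of samples, the cost grows with the product of the two sequence lengths, which is the kind of overhead that clustering-based variants aim to reduce.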
Abstract: This paper introduces a human gesture recognition algorithm using an impulse radio ultra-wideband (IR-UWB) radar sensor. Human gesture recognition has been one of the hottest research topics for quite a long time. Many gesture recognition algorithms and systems using other sensors, such as cameras and RFID tags, have been proposed. Among these, camera-based gesture recognition systems have been extensively studied in past years and are widely used in practice. However, they show deficiencies in some cases. For example, users might not like to be filmed by cameras out of concern for their privacy. Besides, cameras might not work well in very dark environments. RFID tags, in turn, could be inconvenient for many people and are likely to be lost. Our gesture recognition algorithm uses an IR-UWB radar sensor, which has high ranging resolution and an adjustable gesture recognition range and is not affected by privacy issues or darkness. In this paper, the gesture recognition algorithm is based on the moving direction and distance change of the human hand and the change of the frontal surface area of the hand facing the radar sensor. By combining these changes during a gesture, the algorithm can recognize six basic kinds of hand gestures. The experimental results show that these gestures are recognized with quite good performance. A performance analysis of the experiments is also given.
Funding: Supported by the National Natural Science Foundation of China (60873269)
Abstract: A hand gesture recognition method based on fingertip localization is presented for human-computer interaction. First, the hand gesture is segmented from the background based on skin color characteristics. Second, feature vectors are selected at equal intervals on the boundary of the gesture, and the gesture length is then normalized. Third, the fingertip positions are determined from the feature vectors’ parameters, and the angles of the feature vectors are normalized. Finally, the gestures are classified by a support vector machine. The experimental results demonstrate that the proposed method can recognize 9 gestures with an accuracy of 94.1%.
Abstract: This paper presents a method for hand gesture recognition based on a 3D point cloud. Digital image processing technology is used in this research. Based on the 3D points from a depth camera, the system first extracts raw data of the hand. After data segmentation and preprocessing, three kinds of appearance features are extracted: the number of stretched fingers, the angles between fingers, and the area distribution feature of the gesture region. Based on these features, the system identifies the gestures using a decision tree method. The experimental results demonstrate that the proposed method recognizes common gestures efficiently and with high accuracy.
Funding: The National Key R&D Program of China (2018YFC0807500); the National Natural Science Foundation of China (61772396, 61772392, 62002271, 61902296); the Fundamental Research Funds for the Central Universities (JBF180301, XJS210310, XJS190307); Xi'an Key Laboratory of Big Data and Intelligent Vision (201805053ZD4CG37); the National Natural Science Foundation of Shaanxi Province (2020JQ-330, 2020JM-195); the China Postdoctoral Science Foundation (2019M663640).
Abstract: In recent years, gesture recognition has been widely used in the fields of intelligent driving, virtual reality, and human-computer interaction. With the development of artificial intelligence, deep learning has achieved remarkable success in computer vision. To help researchers better understand the development status of gesture recognition in video, this article provides a detailed survey of the latest developments in deep-learning-based gesture recognition technology for videos. The reviewed methods are broadly categorized into three groups based on the type of neural network used for recognition: two-stream convolutional neural networks, 3D convolutional neural networks, and Long Short-Term Memory (LSTM) networks. In this review, we discuss the advantages and limitations of existing technologies, focusing on the feature extraction method of the spatiotemporal structure information in a video sequence, and consider future research directions.
Abstract: Recognition of dynamic hand gestures in real-time is a difficult task because the system can never know when or from where the gesture starts and ends in a video stream. Many researchers have been working on vision-based gesture recognition due to its various applications. This paper proposes a deep learning architecture based on the combination of a 3D Convolutional Neural Network (3D-CNN) and a Long Short-Term Memory (LSTM) network. The proposed architecture extracts spatial-temporal information from input video sequences while avoiding extensive computation. The 3D-CNN is used for the extraction of spectral and spatial features, which are then given to the LSTM network, through which classification is carried out. The proposed model is a lightweight architecture with only 3.7 million training parameters. The model has been evaluated on 15 classes from the publicly available 20BN-jester dataset. The model was trained on 2000 video clips per class, which were separated into 80% training and 20% validation sets. An accuracy of 99% and 97% was achieved on training and testing data, respectively. We further show that the combination of 3D-CNN with LSTM gives superior results compared to MobileNetv2+LSTM.
Funding: National Natural Science Foundation of China under Grant Nos. 61902175, 61872174, and 61832008; Natural Science Foundation of China under Grant No. BK20190293.
Abstract: Because gestures convey information, gesture recognition plays an increasingly important part in human-computer interaction. Traditional methods to recognize gestures are mostly device-based, which means users need to be in contact with the devices. To overcome the inconvenience of device-based methods, studies on device-free gesture recognition have been conducted. However, computer vision methods bring privacy issues and light interference problems. Therefore, we turn to wireless technology. In this paper, we propose a device-free in-air gesture recognition method based on a radio frequency identification (RFID) tag array. By capturing the signals reflected by gestures, we can extract the gesture features. For dynamic gestures, both temporal and spatial features need to be considered. For static gestures, the spatial feature is the key, for which a neural network is adopted to recognize the gestures. Experiments show that the accuracy of dynamic gesture recognition on the test set is 92.17%, while the accuracy for static gestures is 91.67%.