Activity and motion recognition using Wi-Fi signals, mainly channel state information (CSI), has captured the interest of many researchers in recent years. Many research studies have achieved splendid results with the help of machine learning models in different applications such as healthcare services, sign language translation, security, context awareness, and the Internet of Things. Nevertheless, most of these studies have shortcomings in their machine learning algorithms, as they rely on recurrence and convolutions and thus preclude smooth sequential computation. Therefore, in this paper, we propose a deep-learning approach based solely on attention, i.e., the sole Self-Attention Mechanism model (Sole-SAM), for activity and motion recognition using Wi-Fi signals. The Sole-SAM was deployed to learn the features representing different activities and motions from the raw CSI data. Experiments were carried out to evaluate the performance of the proposed Sole-SAM architecture. The experimental results indicated that our proposed system took significantly less time to train than models that rely on recurrence and convolutions, such as the Long Short-Term Memory (LSTM) and Recurrent Neural Network (RNN). Sole-SAM achieved an accuracy of 0.94, which is 0.04 better than the RNN and 0.02 better than the LSTM.
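The operation this abstract contrasts with recurrence can be sketched as a single scaled dot-product self-attention layer applied to a CSI feature sequence. This is a minimal illustrative sketch with toy dimensions and random projections, not the paper's Sole-SAM architecture:

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention.

    x: (T, d) sequence of CSI feature vectors; wq/wk/wv: (d, d) projections.
    Every time step attends to every other step in one matrix product,
    with no sequential recurrence.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(x.shape[1])          # (T, T) pairwise similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ v                              # (T, d) context-mixed features

rng = np.random.default_rng(0)
T, d = 8, 4                       # 8 time steps of 4-dim CSI features (toy sizes)
x = rng.standard_normal((T, d))
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(x, wq, wk, wv)
print(out.shape)                  # (8, 4)
```

Because the attention weights are computed for all time-step pairs at once, training parallelizes over the sequence, which is the usual source of the speed advantage over LSTM/RNN training reported above.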
With the adoption of cutting-edge communication technologies such as 5G/6G systems and the extensive development of devices, crowdsensing systems in the Internet of Things (IoT) are now conducting complicated video analysis tasks such as behaviour recognition. These applications have dramatically increased the diversity of IoT systems. Specifically, behaviour recognition in videos usually requires a combinatorial analysis of the spatial information about objects and information about their dynamic actions in the temporal dimension. Behaviour recognition may even rely more on the modeling of temporal information containing short-range and long-range motions, in contrast to computer vision tasks involving images, which focus on understanding spatial information. However, current solutions fail to jointly and comprehensively analyse short-range motions between adjacent frames and long-range temporal aggregations at large scales in videos. In this paper, we propose a novel behaviour recognition method based on the integration of multigranular (IMG) motion features, which can support the deployment of video analysis in multimedia IoT crowdsensing systems. In particular, we achieve reliable motion information modeling by integrating a channel attention-based short-term motion feature enhancement module (CSEM) and a cascaded long-term motion feature integration module (CLIM). We evaluate our model on several action recognition benchmarks, such as HMDB51, Something-Something, and UCF101. The experimental results demonstrate that our approach outperforms the previous state-of-the-art methods, which confirms its effectiveness and efficiency.
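The "channel attention" the abstract mentions is commonly realised in squeeze-and-excitation style: pool each channel to a scalar, pass through a small gating network, and reweight channels. The sketch below shows only this generic mechanism (the paper's CSEM applies it to short-term motion features; dimensions and weights here are illustrative):

```python
import numpy as np

def channel_attention(x, w1, w2):
    """Squeeze-and-excitation-style channel attention.

    x: (C, T) per-channel temporal features; w1: (C, C//r), w2: (C//r, C)
    are the excitation-MLP weights with reduction ratio r.
    """
    s = x.mean(axis=1)                       # squeeze: global pool per channel
    h = np.maximum(s @ w1, 0.0)              # excitation MLP, ReLU
    g = 1.0 / (1.0 + np.exp(-(h @ w2)))      # sigmoid gates in (0, 1)
    return x * g[:, None]                    # reweight each channel

rng = np.random.default_rng(3)
C, T, r = 8, 16, 2
x = rng.standard_normal((C, T))
w1 = rng.standard_normal((C, C // r))
w2 = rng.standard_normal((C // r, C))
y = channel_attention(x, w1, w2)
print(y.shape)                # (8, 16), each channel scaled by its gate
```

Since every gate lies in (0, 1), the module can only attenuate uninformative channels, letting the network emphasise motion-relevant ones.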
Due to the dynamic stiffness characteristics of human joints, exoskeleton assistance can easily cause impact and disturbance to normal movements. This not only imposes strict requirements on exoskeleton control design but also makes it difficult to improve the assistive level. The Variable Stiffness Actuator (VSA), as a physical variable-stiffness mechanism, offers dynamic stiffness adjustment and high stiffness-control bandwidth, which suits stiffness-matching experiments. However, few works have explored assistive human stiffness-matching experiments based on VSAs. Therefore, this paper designs a hip exoskeleton based on a VSA and studies a CPG-based human motion phase recognition algorithm. First, this paper establishes the requirements for the variable-stiffness experimental design and the output-torque and variable-stiffness dynamic-response standards based on human lower-limb motion parameters. Plate springs are used as elastic elements to establish the mechanical principle of variable stiffness, and a compact variable stiffness actuator is designed around the plate spring. The corresponding theoretical dynamic model is then established and analyzed. Starting from the CPG phase recognition algorithm, this paper uses perturbation theory to expand the first-order CPG unit, obtains the phase convergence equation, verifies phase convergence when the hip joint angle is used as an input signal of the same frequency, and then expands the second-order CPG unit under the premise of a circular limit cycle and analyzes the frequency convergence criterion.
Afterwards, this paper extracts the plate-spring modes from Abaqus, generates the neutral file of the flexible-body model for import into Adams, and conducts torque-stiffness one-way loading and reciprocating loading experiments on the variable-stiffness mechanism. Simulink is then used to verify the validity of the criterion. Finally, based on the above criteria, the signal mean value is removed using a feedback structure to complete the phase recognition algorithm for the human hip joint angle signal, and convergence is verified using actual human walking data on flat ground.
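The phase-convergence idea behind a first-order CPG unit can be illustrated with a toy phase oscillator entrained to a periodic hip-angle signal of the same frequency, dφ/dt = ω + K·sin(θ_in − φ). This is a generic illustrative model, not the paper's exact CPG formulation or gains:

```python
import numpy as np

# Toy first-order phase oscillator locked to an input of the same frequency.
# With matched frequencies, the phase error e = theta_in - phi obeys
# de/dt = -K * sin(e), so e converges to 0 (phase lock).
omega = 2 * np.pi * 1.0   # natural frequency, 1 Hz (illustrative)
K = 5.0                   # coupling gain (illustrative)
dt = 1e-3
t = np.arange(0.0, 10.0, dt)
theta_in = 2 * np.pi * 1.0 * t          # input phase, same 1 Hz frequency
phi = 0.7                               # arbitrary initial phase offset
err = []
for th in theta_in:
    err.append(np.sin(th - phi))        # sin of phase error -> 0 at lock
    phi += dt * (omega + K * np.sin(th - phi))
print(abs(err[0]) > 0.5, abs(err[-1]) < 1e-3)  # large error decays to ~0
```

The same convergence argument (linearising de/dt = −K·sin(e) around e = 0) is the standard route to the kind of phase-convergence equation the abstract describes.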
An object learning and recognition system is implemented for humanoid robots to discover and memorize objects through simple interactions with non-expert users. When an object is presented, the system uses motion information over consecutive frames to extract object features and applies machine learning based on the bag-of-visual-words approach. Instead of using a local feature descriptor alone, the proposed system uses co-occurring local features to increase feature discriminative power in both the object-model learning and inference stages. For objects with different textures, a hybrid sampling strategy is considered. This hybrid approach minimizes the consumption of computational resources and helps achieve good performance, demonstrated on a set of a dozen different daily objects.
The hands and face are the most important parts for expressing sign language morphemes in sign language videos. However, we find that existing Continuous Sign Language Recognition (CSLR) methods either lack the mining of hand and face information in their visual backbones or use expensive and time-consuming external extractors to explore this information. In addition, signs have different lengths, whereas previous CSLR methods typically use a fixed-length window to segment the video to capture sequential features and then perform global temporal modeling, which disturbs the perception of complete signs. In this study, we propose a Multi-Scale Context-Aware network (MSCA-Net) to solve the aforementioned problems. Our MSCA-Net contains two main modules: (1) Multi-Scale Motion Attention (MSMA), which uses the differences among frames to perceive hand and face information at multiple spatial scales, replacing the heavy feature extractors; and (2) Multi-Scale Temporal Modeling (MSTM), which explores crucial temporal information in the sign language video at different temporal scales. We conduct extensive experiments on three widely used sign language datasets, i.e., RWTH-PHOENIX-Weather-2014, RWTH-PHOENIX-Weather-2014T, and CSL-Daily. The proposed MSCA-Net achieves state-of-the-art performance, demonstrating the effectiveness of our approach.
Along with the development of motion capture techniques, more and more 3D motion databases have become available. In this paper, a novel approach is presented for motion recognition and retrieval based on ensemble HMM (hidden Markov model) learning. Due to the high dimensionality of motion features, Isomap nonlinear dimension reduction is applied to the training data for ensemble HMM learning. For handling new motion data, Isomap is generalized based on the estimation of the underlying eigenfunctions. Each action class is then learned with one HMM. Since ensemble learning can effectively enhance supervised learning, ensembles of weak HMM learners are built. Experimental results show that the approach is effective for motion data recognition and retrieval.
In this paper, we study the waveforms of background noise in a seismograph and set up an AR model to characterize them. We then complete the modeling and the automatic recognition program. Finally, we provide the results of automatic and manual recognition of the first motion for 25 underground explosions.
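Fitting an AR model to a noise waveform, as this abstract describes, is classically done with the Yule-Walker equations. A minimal sketch on synthetic data (the order, coefficients, and series here are invented for illustration, not the paper's seismic data):

```python
import numpy as np

def yule_walker(x, p):
    """Fit AR(p) coefficients to a zero-mean series via the Yule-Walker equations."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    # Biased autocorrelation estimates r[0..p]
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(p + 1)]) / len(x)
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])  # Toeplitz
    return np.linalg.solve(R, r[1 : p + 1])  # AR coefficients a_1..a_p

# Synthetic "background noise": AR(2) process x[n] = 1.5 x[n-1] - 0.7 x[n-2] + e[n]
rng = np.random.default_rng(1)
e = rng.standard_normal(20000)
x = np.zeros_like(e)
for n in range(2, len(x)):
    x[n] = 1.5 * x[n - 1] - 0.7 * x[n - 2] + e[n]
a = yule_walker(x, 2)
print(a)  # close to the true coefficients [1.5, -0.7]
```

Once the noise is modelled this way, a first-motion arrival can be flagged wherever the prediction residual of the AR model grows well beyond the noise level.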
Recognition of human actions by computer vision has become an active research area in recent years. Due to the speed and high similarity of actions, current algorithms cannot achieve a high recognition rate. A new human action recognition method is proposed with multi-scale directed depth motion maps (MsdDMMs) and Log-Gabor filters. According to the differences in the speed and temporal order of an action, MsdDMMs are proposed under an energy framework. Meanwhile, Log-Gabor filters are utilized to describe the texture details of the MsdDMMs for the motion characteristics, satisfying both texture characterization and the visual features of the human eye. Furthermore, collaborative representation is employed as the classifier for action recognition. Experimental results show that the proposed algorithm achieves accuracies of 95.79% and 96.43% on the MSRAction3D and MSRGesture3D datasets, respectively. It also has higher accuracy than existing algorithms such as the super normal vector (SNV) and the hierarchical recurrent neural network (Hierarchical RNN).
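The basic depth motion map underlying the MsdDMMs (before the multi-scale, directed extension the paper adds) simply accumulates absolute inter-frame depth differences. A toy sketch:

```python
import numpy as np

def depth_motion_map(frames):
    """Accumulate absolute inter-frame depth differences into one motion map."""
    frames = np.asarray(frames, dtype=float)
    return np.abs(np.diff(frames, axis=0)).sum(axis=0)

# Toy depth video: a bright 2x2 blob moving one pixel right per frame.
T, H, W = 4, 6, 8
video = np.zeros((T, H, W))
for t in range(T):
    video[t, 2:4, t : t + 2] = 1.0
dmm = depth_motion_map(video)
print(dmm.shape)  # (6, 8): one map summarizing where motion occurred
```

The resulting map concentrates energy along the blob's path (rows 2-3) and stays zero elsewhere; texture descriptors such as Log-Gabor filters are then applied to maps like this.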
Batch processing mode is widely used in the training process of human motion recognition. After training, the motion classifier usually remains invariable. However, if the classifier is to be expanded, all historical data must be gathered for retraining. This consumes a huge amount of storage space, and the new training process will be more complicated. In this paper, we use an incremental learning method to model the motion classifier. A weighted decision tree is proposed to help illustrate the process, and the probability sampling method is also used. The results show that with continuous learning, the motion classifier becomes more precise. The average classification precision for the weighted decision tree was 88.43% in a typical test. Incremental learning consumes much less time than the batch processing mode when the input training data arrives continuously.
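The incremental-versus-batch contrast in this abstract can be seen in miniature with Welford's online update, which refines statistics one sample at a time without ever storing history. This is only a stand-in for the incremental idea; the paper's weighted decision tree is a far more elaborate classifier:

```python
import numpy as np

class RunningStats:
    """Welford's online mean/variance: update per sample, keep no history."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        d = x - self.mean
        self.mean += d / self.n          # incremental mean update
        self.m2 += d * (x - self.mean)   # incremental sum of squared deviations

    @property
    def var(self):
        return self.m2 / self.n if self.n else 0.0

rng = np.random.default_rng(2)
data = rng.normal(3.0, 2.0, 10000)
rs = RunningStats()
for x in data:
    rs.update(x)                         # no stored history, unlike batch retraining
print(np.isclose(rs.mean, data.mean()), np.isclose(rs.var, data.var()))
```

The online estimates match the batch statistics while using O(1) memory, which is exactly the storage saving the abstract claims for incremental learning over batch retraining.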
Cameras can reliably detect human motions in a normal environment, but they are usually affected by sudden illumination changes and complex conditions, which are the major obstacles to the reliability and robustness of the system. To solve this problem, a novel integration method was proposed to combine bi-static ultra-wideband radar and cameras. In this recognition system, two cameras are used to localize the object's region, while a radar on a mobile robot is used to obtain its 3D motion models. The recognition results can be matched against a 3D motion library to recognize the object's motions. To confirm the effectiveness of the proposed method, recognition results using vision sensors alone and using the integration method were compared in different environments. A higher correct-recognition rate was achieved in the experiments.
In this paper, we propose a low-cost posture recognition scheme using a single webcam for a signaling hand with natural sways and possible occlusions. It aims at developing an untouchable, low-complexity utility based on friendly hand-posture signaling. The scheme integrates dominant temporal-difference detection, skin color detection, and morphological filtering, which cooperate efficiently to construct hand-profile molds. These molds provide representative hand profiles that allow more stable posture recognition than accurate hand shapes with in-effect trivial details. The resultant bounding box from tracking the signaling molds can be treated as a regular-type, object-matched ROI to facilitate the stable extraction of robust HOG features. With such commonly applied features, a prototype SVM is capable of fast and stable hand-posture recognition under natural hand movement and non-hand object occlusion. Experimental results demonstrate that our scheme achieves sufficiently accurate hand-posture recognition under background clutter, allowing the targeted hand medium movement and a palm-grasped object. Hence, the proposed method can easily be embedded in a mobile phone as application software.
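The first two cues the scheme integrates, temporal difference and skin color, amount to intersecting a motion mask with a colour mask. The sketch below shows just that combination on toy data; the thresholds and single-channel "hue proxy" are illustrative, and the morphological filtering and HOG/SVM stages are omitted:

```python
import numpy as np

def hand_candidate_mask(prev, curr, diff_thr=0.1, skin_lo=0.35, skin_hi=0.75):
    """Candidate hand pixels = moving pixels AND skin-toned pixels.

    prev/curr: (H, W) arrays of a colour proxy in [0, 1]; thresholds are
    illustrative placeholders, not the paper's values.
    """
    moving = np.abs(curr - prev) > diff_thr            # temporal difference
    skin = (curr >= skin_lo) & (curr <= skin_hi)       # crude colour gate
    return moving & skin

prev = np.zeros((4, 4))
curr = np.zeros((4, 4))
curr[1:3, 1:3] = 0.5         # a "skin-toned" patch appears: moving AND skin
mask = hand_candidate_mask(prev, curr)
print(mask.sum())            # 4 pixels flagged as hand candidates
```

In the full scheme, morphological filtering would then clean this mask into a smooth mold, whose bounding box serves as the ROI for HOG extraction.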
To improve the recognition performance of video human actions, an approach that models video actions hierarchically is proposed. This hierarchical model summarizes the action contents over different spatio-temporal domains according to the properties of human body movement. First, the temporal gradient combined with the constraint of coherent motion patterns is utilized to extract stable and dense motion features that are viewed as point features; then the mean-shift clustering algorithm with an adaptive-scale kernel is used to label these features. After pooling the features with the same label to generate a part-based representation, the visual-word responses within one large-scale volume are collected as the video object representation. On the benchmark KTH (Kungliga Tekniska Högskolan) and UCF (University of Central Florida)-sports action datasets, the experimental results show that the proposed method enhances the representative and discriminative power of action features and improves recognition rates. Compared with other related literature, the proposed method obtains superior performance.
Nowadays, action recognition is widely applied in many fields. However, an action is hard to define with single-modality information. The difference between image recognition and action recognition is that action recognition needs more modality information to depict one action, such as the appearance, the motion, and the dynamic information. Because the state of an action evolves over time, motion information must be considered when representing an action. Most current methods define an action by spatial information and motion information. There are two key elements of current action recognition methods: spatial information, obtained by sampling sparsely from the video frame sequence, and motion content, mostly represented by optical flow calculated on consecutive video frames. However, the relevance between them in current methods is weak. Therefore, to strengthen this associativity, this paper presents a new architecture consisting of three streams to obtain multi-modality information. The advantages of our network are: (a) we propose a new sampling approach that samples evenly over the video sequence to acquire appearance information; (b) we utilize ResNet101 to obtain high-level, discriminative features; and (c) we advance a three-stream architecture to capture temporal, spatial, and dynamic information. Experimental results on the UCF101 dataset illustrate that our method outperforms previous methods.
An essential part of any activity recognition system claiming to be truly real-time is the ability to perform feature extraction in real time. We present, in this paper, a quite simple and computationally tractable approach for real-time human activity recognition based on simple statistical features. These features are simple and relatively few; accordingly, they are easy and fast to calculate, and they form a relatively low-dimensional feature space in which classification can be carried out robustly. On the publicly available Weizmann benchmark dataset, promising results (97.8%) have been achieved, showing the effectiveness of the proposed approach compared to the state of the art. Furthermore, the approach is quite fast and can thus provide timing guarantees to real-time applications.
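A feature extractor of the kind this abstract advocates can be extremely small. The sketch below computes per-frame silhouette area and centroid, then summarizes them with means and variances; the specific feature choice is illustrative, not the paper's exact set:

```python
import numpy as np

def statistical_features(silhouettes):
    """Low-dimensional statistical descriptor of a binary silhouette sequence."""
    feats = []
    for s in np.asarray(silhouettes, dtype=float):
        area = s.sum()                       # foreground pixel count
        ys, xs = np.nonzero(s)
        cy = ys.mean() if len(ys) else 0.0   # centroid row
        cx = xs.mean() if len(xs) else 0.0   # centroid column
        feats.append([area, cy, cx])
    feats = np.array(feats)
    # Summarize the per-frame trajectory: 3 means + 3 variances = 6 numbers.
    return np.concatenate([feats.mean(axis=0), feats.var(axis=0)])

# Two toy frames of a person-like blob shifting one pixel to the right.
f0 = np.zeros((8, 8)); f0[2:6, 2:4] = 1
f1 = np.zeros((8, 8)); f1[2:6, 3:5] = 1
v = statistical_features([f0, f1])
print(v.shape)  # (6,): a tiny, fast-to-compute feature vector
```

Such a descriptor costs a handful of array passes per frame, which is what makes hard real-time feature extraction feasible.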
BACKGROUND: Frozen shoulder (FS) is a common disorder. Diabetics with FS have more severe symptoms and a worse prognosis. Thus, this study investigated the influence of enhancing dynamic scapular recognition on shoulder disability and pain in diabetics with FS. CASE SUMMARY: A 45-year-old man with diabetes mellitus and a unilateral FS (stage II) for at least 3 mo, with shoulder pain and limitation of both passive and active ranges of motion (ROMs) of the glenohumeral joint of ≥25% in 2 directions, participated in this study. A dynamic scapular recognition exercise was applied to this patient. The main outcome measures were upward rotation of the scapula; the shoulder pain and disability index; and shoulder ROM of flexion, abduction, and external rotation. The dynamic scapular exercise was performed for 15 min/session, 3 sessions/wk, for 4 wk. After 4 wk of intervention, there were improvements between pre-treatment and post-treatment in shoulder pain, the shoulder pain and disability index, shoulder ROM, and upward rotation of the scapula. CONCLUSION: This case report suggests that enhancing dynamic scapular recognition may improve shoulder pain and disability; upward rotation of the scapula; and shoulder ROM of abduction, flexion, and external rotation after 4 wk.
Arabic Sign Language recognition is an emerging field of research. Previous attempts at automatic vision-based recognition of Arabic Sign Language mainly focused on finger spelling and recognizing isolated gestures. In this paper we report the first continuous Arabic Sign Language recognition system, building on existing research in feature extraction and pattern recognition. The development of the presented work required collecting a continuous Arabic Sign Language database, which we designed and recorded in cooperation with a sign language expert. We intend to make the collected database available to the research community. Our system, based on spatio-temporal feature extraction and hidden Markov models, has achieved an average word recognition rate of 94%, bearing in mind the use of a high-perplexity vocabulary and an unrestrictive grammar. We compare our proposed work against existing sign language techniques based on accumulated image difference and motion estimation. The experimental results show that the proposed work outperforms existing solutions in terms of recognition accuracy.
Funding: This work was supported by the Foshan Science and Technology Innovation Special Fund Project (No. BK22BF004 and No. BK20AF004), Guangdong Province, China.
Funding: Supported by the National Natural Science Foundation of China under grants No. 62271125 and No. 62273071; the Sichuan Science and Technology Program (No. 2022YFG0038, No. 2021YFG0018); the Xinjiang Science and Technology Program (No. 2022273061); and the Fundamental Research Funds for the Central Universities (No. ZYGX2020ZB034, No. ZYGX2021J019).
Funding: The National Natural Science Foundation of China (No. 60672094, 60971098).
Funding: Supported by the National Natural Science Foundation of China (62072334).
Funding: Project supported by the National Natural Science Foundation of China (Nos. 60533090 and 60525108), the National Basic Research Program (973) of China (No. 2002CB312101), and the Science and Technology Project of Zhejiang Province (Nos. 2005C13032 and 2005C11001-05), China.
Funding: Sponsored by the Jiangsu Prospective Joint Research Project (Grant No. BY2016022-28).
Funding: Partly supported by the National Natural Science Foundation of China under Grant 61573242, projects of the Science and Technology Commission of Shanghai Municipality under Grants No. 13511501302, No. 14511100300, and No. 15511105100, the Shanghai Pujiang Program under Grant No. 14PJ1405000, and ZTE Industry-Academia-Research Cooperation Funds.
Abstract: Batch processing is widely used to train human motion recognizers. After training, the motion classifier usually remains unchanged; if the classifier is to be expanded, all historical data must be gathered for retraining, which consumes a huge amount of storage space and makes the new training process more complicated. In this paper, we use an incremental learning method to build the motion classifier. A weighted decision tree is proposed to illustrate the process, and a probability sampling method is also used. The results show that with continuous learning, the motion classifier becomes more precise: the average classification precision of the weighted decision tree was 88.43% in a typical test, and incremental learning consumes much less time than the batch processing mode when training data arrive continuously.
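The storage argument hinges on not keeping all historical data. One standard probability-sampling way to do that is reservoir sampling, which maintains a fixed-size uniform sample of an unbounded stream; this is a generic sketch of the idea, not the paper's exact sampling scheme.

```python
import random

def reservoir_sample(stream, k, seed=42):
    """Keep a fixed-size uniform random sample of a data stream, so the
    classifier can be updated later without storing all historical data."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)          # fill the reservoir first
        else:
            j = rng.randrange(i + 1)     # replace with prob k / (i + 1)
            if j < k:
                sample[j] = item
    return sample

history = reservoir_sample(range(10_000), k=100)
print(len(history))  # 100 items retained regardless of stream length
```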
Funding: Supported by the National Natural Science Foundation of China (No. 50875193).
Abstract: Cameras can reliably detect human motions in a normal environment, but they are usually affected by sudden illumination changes and complex conditions, which are the major obstacles to the reliability and robustness of the system. To solve this problem, a novel integration method is proposed that combines a bi-static ultra-wideband radar with cameras. In this recognition system, two cameras localize the object's region, while the radar, mounted on a mobile robot, obtains its 3D motion models. The recognition results are matched against a 3D motion library to recognize the motions. To confirm the effectiveness of the proposed method, recognition results using vision sensors alone were compared with those of the integration method in different environments; the integrated system achieved a higher correct-recognition rate.
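The matching step against the 3D motion library can be sketched as a nearest-template search. The distance criterion and the toy trajectories below are illustrative assumptions; the abstract does not specify the paper's matching metric.

```python
import numpy as np

def match_motion(observed, library):
    """Match an observed 3-D trajectory (T x 3) against a motion library
    by minimum mean squared distance (illustrative criterion)."""
    best, best_err = None, np.inf
    for name, template in library.items():
        err = np.mean((observed - template) ** 2)
        if err < best_err:
            best, best_err = name, err
    return best

t = np.linspace(0, 1, 50)
zeros = np.zeros_like(t)
library = {  # hypothetical templates: (x, y, z) samples over time
    "walk": np.stack([t, np.sin(2 * np.pi * t), zeros], axis=1),
    "jump": np.stack([zeros, zeros, np.abs(np.sin(np.pi * t))], axis=1),
}
observed = library["jump"] + 0.05 * np.random.default_rng(1).standard_normal((50, 3))
print(match_motion(observed, library))  # -> "jump"
```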
Abstract: In this paper, we propose a low-cost posture recognition scheme that uses a single webcam to track a signaling hand under natural sway and possible occlusions, aiming at a touch-free, low-complexity utility based on friendly hand-posture signaling. The scheme integrates dominant temporal-difference detection, skin color detection, and morphological filtering to construct hand-profile molds. These molds provide representative hand profiles that support more stable posture recognition than accurate hand shapes with, in effect, trivial details. The bounding box obtained by tracking the signaling molds is treated as a regular object-matched ROI, which facilitates the stable extraction of robust HOG features. With such commonly applied features, a prototype SVM is capable of fast and stable hand-posture recognition under natural hand movement and non-hand object occlusion. Experimental results demonstrate that the scheme achieves sufficiently accurate hand-posture recognition under background clutter, even when the target hand moves moderately or grasps an object. Hence, the proposed method can easily be embedded in a mobile phone as application software.
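The mold-construction step combines a temporal-difference mask with a skin mask. A minimal grayscale sketch follows; real skin detection works in a color space such as YCbCr, and all thresholds here are illustrative assumptions, not the paper's values.

```python
import numpy as np

def hand_mask(prev_frame, frame, skin_lo=120, skin_hi=200, motion_th=15):
    """Flag pixels that are BOTH moving (temporal difference above a
    threshold) and skin-toned (toy intensity band); morphological
    filtering would normally clean this mask afterwards."""
    motion = np.abs(frame.astype(int) - prev_frame.astype(int)) > motion_th
    skin = (frame >= skin_lo) & (frame <= skin_hi)
    return motion & skin

prev = np.zeros((4, 4), dtype=np.uint8)
cur = prev.copy()
cur[1:3, 1:3] = 150          # a moving, skin-toned 2x2 patch
mask = hand_mask(prev, cur)
print(mask.sum())            # 4 pixels flagged as moving skin
```

The bounding box of the surviving pixels would then serve as the ROI for HOG extraction.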
Funding: The National Natural Science Foundation of China (Nos. 60971098, 61201345).
Abstract: To improve the recognition performance of video human actions, an approach that models video actions hierarchically is proposed. This hierarchical model summarizes action content over different spatio-temporal domains according to the properties of human body movement. First, the temporal gradient, combined with a coherent-motion-pattern constraint, is used to extract stable and dense motion features that are treated as point features; then mean-shift clustering with an adaptive-scale kernel labels these features. After pooling the features with the same label into a part-based representation, the visual word responses within one large-scale volume are collected as the video object representation. On the benchmark KTH (Kungliga Tekniska Högskolan) and UCF (University of Central Florida)-sports action datasets, experimental results show that the proposed method enhances the representative and discriminative power of action features and improves recognition rates, obtaining superior performance compared with related work.
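The pooling step, collapsing all point features that share a cluster label into one part-based descriptor, can be sketched as mean pooling per label (a simplification; the paper's exact pooling operator is not given in this abstract):

```python
import numpy as np

def pool_by_label(features, labels):
    """Pool point features sharing a cluster label into one part-based
    descriptor per label, via mean pooling."""
    features = np.asarray(features)
    labels = np.asarray(labels)
    return {lab: features[labels == lab].mean(axis=0) for lab in sorted(set(labels))}

# toy point features and mean-shift cluster labels (hypothetical)
feats = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 10.0]])
labels = [0, 0, 1]
parts = pool_by_label(feats, labels)
print(parts[0])  # part descriptor for cluster 0: [2. 0.]
```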
Funding: Supported by the National Natural Science Foundation of China (Nos. 61672150, 61907007), the Jilin Provincial Science and Technology Department (Nos. 20180201089GX, 20190201305JC), the Provincial Department of Education (Nos. JJKH20190291KJ, JJKH20190294KJ, JJKH20190355KJ), and the Fundamental Research Funds for the Central Universities (No. 2412019FZ049).
Abstract: Action recognition is now widely applied in many fields, but an action is hard to define from a single modality. Unlike image recognition, action recognition needs multiple modalities to depict one action, such as appearance, motion, and dynamic information; because the state of an action evolves over time, motion information must be considered when representing it. Most current methods define an action by spatial and motion information, with two key elements: spatial information obtained by sampling sparsely from the video frame sequence, and motion content mostly represented by optical flow computed on consecutive frames. However, the coupling between the two is weak in current methods. To strengthen this association, this paper presents a new architecture consisting of three streams that captures multi-modality information. The advantages of our network are: (a) a new sampling approach that samples evenly over the video sequence to acquire appearance information; (b) ResNet101 as a backbone for high-level, discriminative features; and (c) a three-stream architecture that captures temporal, spatial, and dynamic information. Experimental results on the UCF101 dataset show that our method outperforms previous methods.
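Advantage (a), even sampling over the whole video rather than sparse random sampling, can be sketched as picking one frame from the centre of each of k equal-length segments. This is one reasonable reading of "sample evenly"; the paper's exact index formula is an assumption here.

```python
def sample_evenly(num_frames, k):
    """Pick k frame indices spread evenly over the video: split it into
    k equal segments and take the centre frame of each segment."""
    seg = num_frames / k
    return [min(int(seg * i + seg / 2), num_frames - 1) for i in range(k)]

print(sample_evenly(100, 5))  # [10, 30, 50, 70, 90]
```

Unlike random sparse sampling, every part of the video contributes exactly one appearance frame, so long actions are not under-represented.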
Abstract: An essential part of any activity recognition system claiming to be truly real-time is the ability to perform feature extraction in real time. We present a simple and computationally tractable approach to real-time human activity recognition based on simple statistical features. These features are small and fast to compute, and they form a relatively low-dimensional feature space in which classification can be carried out robustly. On the public Weizmann benchmark dataset, promising results (97.8% accuracy) have been achieved, showing the effectiveness of the proposed approach compared with the state of the art. Furthermore, the approach is fast enough to provide timing guarantees to real-time applications.
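To make "simple statistical features" concrete, here is a cheap per-window feature vector of the kind such systems typically use; the particular five statistics are illustrative, not the paper's exact feature set.

```python
import math

def window_features(window):
    """Cheap statistical features over one window of a 1-D signal:
    mean, standard deviation, min, max, and zero-crossing count.
    Each is O(n), so extraction easily runs in real time."""
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n
    zero_crossings = sum(1 for a, b in zip(window, window[1:]) if a * b < 0)
    return [mean, math.sqrt(var), min(window), max(window), zero_crossings]

print(window_features([1.0, -1.0, 1.0, -1.0]))  # [0.0, 1.0, -1.0, 1.0, 3]
```

A frame sequence reduces to a handful of such numbers per window, giving the low-dimensional space in which the classifier operates.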
Abstract: BACKGROUND: Frozen shoulder (FS) is a common disorder, and diabetics with FS have more severe symptoms and a worse prognosis. This study therefore investigated the influence of enhancing dynamic scapular recognition on shoulder disability and pain in diabetics with FS. CASE SUMMARY: A 45-year-old man with diabetes mellitus had a unilateral FS (stage II) for at least 3 mo, with shoulder pain and limitation of ≥25% in both passive and active ranges of motion (ROMs) of the glenohumeral joint in 2 directions. A dynamic scapular recognition exercise was applied for 15 min/session, 3 sessions/wk, for 4 wk. The main outcome measures were upward rotation of the scapula; the shoulder pain and disability index; and shoulder ROM of flexion, abduction, and external rotation. After 4 wk of intervention, improvements from pre-treatment to post-treatment were observed in shoulder pain, the shoulder pain and disability index, shoulder ROM, and upward rotation of the scapula. CONCLUSION: This case report suggests that enhancing dynamic scapular recognition may improve shoulder pain and disability; upward rotation of the scapula; and shoulder ROM of abduction, flexion, and external rotation after 4 wk.
Abstract: Arabic Sign Language recognition is an emerging field of research. Previous attempts at automatic vision-based recognition of Arabic Sign Language mainly focused on finger spelling and isolated gestures. In this paper we report the first continuous Arabic Sign Language recognition system, building on existing research in feature extraction and pattern recognition. Developing this work required collecting a continuous Arabic Sign Language database, which we designed and recorded in cooperation with a sign language expert and which we intend to make available to the research community. Our system, based on spatio-temporal feature extraction and hidden Markov models, achieves an average word recognition rate of 94%, despite a high-perplexity vocabulary and unrestricted grammar. We compare the proposed work against existing sign language techniques based on accumulated image difference and motion estimation; the experimental results show that it outperforms existing solutions in terms of recognition accuracy.
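The accumulated-image-difference baseline the paper compares against can be sketched as summing absolute inter-frame differences over a gesture clip; the toy clip below is a hypothetical stand-in for real video.

```python
import numpy as np

def accumulated_difference(frames):
    """Accumulate absolute inter-frame differences over a clip, producing
    a single motion-energy image (a generic sketch of the baseline)."""
    acc = np.zeros_like(frames[0], dtype=float)
    for prev, cur in zip(frames, frames[1:]):
        acc += np.abs(cur.astype(float) - prev.astype(float))
    return acc

# toy clip: a bright pixel moving one column per frame along row 1
frames = [np.zeros((3, 3), dtype=np.uint8) for _ in range(3)]
for t, f in enumerate(frames):
    f[1, t] = 255
adi = accumulated_difference(frames)
print(adi[1])  # motion energy concentrated along the pixel's path
```

The HMM-based system instead models how spatio-temporal features evolve over time, which is what lets it segment and recognize continuous signing.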