63 articles found
Workout Action Recognition in Video Streams Using an Attention Driven Residual DC-GRU Network
1
Authors: Arnab Dey, Samit Biswas, Dac-Nhuong Le. Source: Computers, Materials & Continua (SCIE, EI), 2024, Issue 5, pp. 3067-3087, 21 pages
Regular exercise is a crucial aspect of daily life, as it enables individuals to stay physically active, lowers the likelihood of developing illnesses, and enhances life expectancy. The recognition of workout actions in video streams holds significant importance in computer vision research, as it aims to enhance exercise adherence, enable instant recognition, advance fitness tracking technologies, and optimize fitness routines. However, existing action datasets often lack diversity and specificity for workout actions, hindering the development of accurate recognition models. To address this gap, the Workout Action Video dataset (WAVd) has been introduced as a significant contribution. WAVd comprises a diverse collection of labeled workout action videos, meticulously curated to encompass various exercises performed by numerous individuals in different settings. This research proposes an innovative framework based on the Attention driven Residual Deep Convolutional-Gated Recurrent Unit (ResDC-GRU) network for workout action recognition in video streams. Unlike image-based action recognition, videos contain spatio-temporal information, making the task more complex and challenging. While substantial progress has been made in this area, challenges persist in detecting subtle and complex actions, handling occlusions, and managing the computational demands of deep learning approaches. The proposed ResDC-GRU Attention model demonstrated exceptional classification performance with 95.81% accuracy in classifying workout action videos and also outperformed various state-of-the-art models. The method also yielded 81.6%, 97.2%, 95.6%, and 93.2% accuracy on established benchmark datasets, namely HMDB51, YouTube Actions, UCF50, and UCF101, respectively, showcasing its superiority and robustness in action recognition. The findings suggest practical implications in real-world scenarios where precise video action recognition is paramount, addressing the persisting challenges in the field. The WAVd dataset serves as a catalyst for the development of more robust and effective fitness tracking systems and ultimately promotes healthier lifestyles through improved exercise monitoring and analysis.
Keywords: workout action recognition; video stream action recognition; residual network; GRU; attention
BCCLR: A Skeleton-Based Action Recognition with Graph Convolutional Network Combining Behavior Dependence and Context Clues
2
Authors: Yunhe Wang, Yuxin Xia, Shuai Liu. Source: Computers, Materials & Continua (SCIE, EI), 2024, Issue 3, pp. 4489-4507, 19 pages
In recent years, skeleton-based action recognition has made great achievements in computer vision. A graph convolutional network (GCN) is effective for action recognition, modelling the human skeleton as a spatio-temporal graph. Most GCNs define the graph topology by physical relations of the human joints. However, this predefined graph ignores the spatial relationship between non-adjacent joint pairs in special actions and the behavior dependence between joint pairs, resulting in a low recognition rate for specific actions with implicit correlation between joint pairs. In addition, existing methods ignore the trend correlation between adjacent frames within an action and context clues, leading to erroneous recognition of actions with similar poses. Therefore, this study proposes a learnable GCN based on behavior dependence, which considers implicit joint correlation by constructing a dynamic learnable graph with extraction of specific behavior dependence of joint pairs. By using the weight relationship between the joint pairs, an adaptive model is constructed. It also designs a self-attention module to obtain their inter-frame topological relationship for exploring the context of actions. Combining the shared topology and the multi-head self-attention map, the module obtains the context-based clue topology to update the dynamic graph convolution, achieving accurate recognition of different actions with similar poses. Detailed experiments on public datasets demonstrate that the proposed method achieves better results and realizes higher quality representation of actions under various evaluation protocols compared to state-of-the-art methods.
Keywords: action recognition; deep learning; GCN; behavior dependence; context clue; self-attention
HgaNets: Fusion of Visual Data and Skeletal Heatmap for Human Gesture Action Recognition
3
Authors: Wuyan Liang, Xiaolong Xu. Source: Computers, Materials & Continua (SCIE, EI), 2024, Issue 4, pp. 1089-1103, 15 pages
Recognition of human gesture actions is a challenging issue due to the complex patterns in both visual and skeletal features. Existing gesture action recognition (GAR) methods typically analyze visual and skeletal data, failing to meet the demands of various scenarios. Furthermore, multi-modal approaches lack the versatility to efficiently process both uniform and disparate input patterns. Thus, in this paper, an attention-enhanced pseudo-3D residual model, called HgaNets, is proposed to address the GAR problem. This model comprises two independent components designed for modeling visual RGB (red, green and blue) images and 3D skeletal heatmaps, respectively. More specifically, each component consists of two main parts: 1) a multi-dimensional attention module for capturing important spatial, temporal and feature information in human gestures; 2) a spatiotemporal convolution module that utilizes pseudo-3D residual convolution to characterize spatiotemporal features of gestures. Then, the output weights of the two components are fused to generate the recognition results. Finally, we conducted experiments on four datasets to assess the efficiency of the proposed model. The results show that the accuracy on the four datasets reaches 85.40%, 91.91%, 94.70%, and 95.30%, respectively, with an inference time of 0.54 s and 2.74M parameters. These findings highlight that the proposed model outperforms other existing approaches in terms of recognition accuracy.
Keywords: gesture action recognition; multi-dimensional attention; pseudo-3D; skeletal heatmap
Improved Shark Smell Optimization Algorithm for Human Action Recognition (cited by: 2)
4
Authors: Inzamam Mashood Nasir, Mudassar Raza, Jamal Hussain Shah, Muhammad Attique Khan, Yun-Cheol Nam, Yunyoung Nam. Source: Computers, Materials & Continua (SCIE, EI), 2023, Issue 9, pp. 2667-2684, 18 pages
Human Action Recognition (HAR) in uncontrolled environments aims to recognize different actions from a video. An effective HAR model can be employed for applications such as human-computer interaction, health care, person tracking, and video surveillance. Machine Learning (ML) approaches, specifically Convolutional Neural Network (CNN) models, have been widely used and have achieved impressive results through feature fusion. The accuracy and effectiveness of these models continue to be the biggest challenge in this field. In this article, a novel feature optimization algorithm, called improved Shark Smell Optimization (iSSO), is proposed to reduce the redundancy of extracted features. The proposed technique is inspired by the behavior of white sharks and how they find the best prey in the whole search space. The proposed iSSO algorithm divides the Feature Vector (FV) into subparts, where a search is conducted to find optimal local features from each subpart of the FV. Once local optimal features are selected, a global search is conducted to further optimize these features. The proposed iSSO algorithm is employed on nine (9) selected CNN models. These CNN models are selected based on their top-1 and top-5 accuracy in the ImageNet competition. To evaluate the model, two publicly available datasets, UCF-Sports and Hollywood2, are selected.
Keywords: action recognition; improved shark smell optimization; convolutional neural networks; machine learning
HybridHR-Net: Action Recognition in Video Sequences Using Optimal Deep Learning Fusion Assisted Framework (cited by: 1)
5
Authors: Muhammad Naeem Akbar, Seemab Khan, Muhammad Umar Farooq, Majed Alhaisoni, Usman Tariq, Muhammad Usman Akram. Source: Computers, Materials & Continua (SCIE, EI), 2023, Issue 9, pp. 3275-3295, 21 pages
The combination of spatiotemporal videos and essential features can improve the performance of human action recognition (HAR); however, an individual type of feature usually degrades the performance due to similar actions and complex backgrounds. The deep convolutional neural network has improved performance in recent years for several computer vision applications due to its spatial information. This article proposes a new framework for video surveillance human action recognition, dubbed HybridHR-Net. On a few selected datasets, deep transfer learning is used to fine-tune the pre-trained EfficientNet-b0 deep learning model. Bayesian optimization is employed for tuning the hyperparameters of the fine-tuned deep model. Instead of fully connected layer features, we considered the average pooling layer features and performed two feature selection techniques: an improved artificial bee colony and an entropy-based approach. Using a serial nature technique, the selected features are combined into a single vector, and the results are then categorized by machine learning classifiers. Five publicly accessible datasets were used in the experiments, yielding notable accuracies of 97%, 98.7%, 100%, 99.7%, and 96.8%, respectively. Additionally, a comparison of the proposed framework with contemporary methods is done to demonstrate the increase in accuracy.
Keywords: action recognition; entropy; deep learning; transfer learning; artificial bee colony; feature fusion
Fine-Grained Action Recognition Based on Temporal Pyramid Excitation Network (cited by: 1)
6
Authors: Xuan Zhou, Jianping Yi. Source: Intelligent Automation & Soft Computing (SCIE), 2023, Issue 8, pp. 2103-2116, 14 pages
Mining more discriminative temporal features to enrich temporal context representation is considered the key to fine-grained action recognition. Previous action recognition methods utilize a fixed spatiotemporal window to learn local video representation. However, these methods fail to capture complex motion patterns due to their limited receptive field. To solve these problems, this paper proposes a lightweight Temporal Pyramid Excitation (TPE) module to capture the short, medium, and long-term temporal context. In this method, the Temporal Pyramid (TP) module can effectively expand the temporal receptive field of the network by using multi-temporal kernel decomposition without significantly increasing the computational cost. In addition, the Multi Excitation module can emphasize temporal importance to enhance temporal feature representation learning. TPE can be integrated into ResNet50 to build a compact video learning framework, TPENet. Extensive validation experiments on several challenging benchmark datasets (Something-Something V1, Something-Something V2, UCF-101, and HMDB51) demonstrate that our method achieves a preferable balance between computation and accuracy.
Keywords: fine-grained action recognition; temporal pyramid excitation module; temporal receptive; multi-excitation module
Using BlazePose on Spatial Temporal Graph Convolutional Networks for Action Recognition
7
Authors: Motasem S. Alsawadi, El-Sayed M. El-kenawy, Miguel Rio. Source: Computers, Materials & Continua (SCIE, EI), 2023, Issue 1, pp. 19-36, 18 pages
The ever-growing available visual data (i.e., videos and pictures uploaded by internet users) has attracted the research community's attention in the computer vision field. Therefore, finding efficient solutions to extract knowledge from these sources is imperative. Recently, the BlazePose system has been released for skeleton extraction from images, oriented to mobile devices. With this skeleton graph representation in place, a Spatial-Temporal Graph Convolutional Network can be implemented to predict the action. We hypothesize that just by changing the skeleton input data for a different set of joints that offers more information about the action of interest, it is possible to increase the performance of the Spatial-Temporal Graph Convolutional Network for HAR tasks. Hence, in this study, we present the first implementation of the BlazePose skeleton topology upon this architecture for action recognition. Moreover, we propose the Enhanced-BlazePose topology that can achieve better results than its predecessor. Additionally, we propose different skeleton detection thresholds that can improve the accuracy even further. We reached a top-1 accuracy of 40.1% on the Kinetics dataset. For the NTU-RGB+D dataset, we achieved 87.59% and 92.1% accuracy for the Cross-Subject and Cross-View evaluation criteria, respectively.
Keywords: action recognition; BlazePose; graph neural network; OpenPose; skeleton; spatial temporal graph convolution network
Action Recognition and Detection Based on Deep Learning: A Comprehensive Summary
8
Authors: Yong Li, Qiming Liang, Bo Gan, Xiaolong Cui. Source: Computers, Materials & Continua (SCIE, EI), 2023, Issue 10, pp. 1-23, 23 pages
Action recognition and detection is an important research topic in computer vision, which can be divided into action recognition and action detection. At present, the distinction between action recognition and action detection is not clear, and the relevant reviews are not comprehensive. Thus, this paper summarizes the action recognition and detection methods and datasets based on deep learning to accurately present the research status in this field. First, according to how temporal and spatial features are extracted, the commonly used action recognition models are divided by architecture into two-stream models, temporal models, spatiotemporal models, and transformer models. This paper briefly analyzes the characteristics of the four kinds of models and reports the accuracy of various algorithms on common datasets. Then, from the perspective of the tasks to be completed, action detection is further divided into temporal action detection and spatiotemporal action detection, and commonly used datasets are introduced. From the perspectives of the two-stage method and the one-stage method, various algorithms of temporal action detection are reviewed, and the various algorithms of spatiotemporal action detection are summarized in detail. Finally, the relationship between different parts of action recognition and detection is discussed, the difficulties faced by current research are summarized in detail, and future developments are discussed.
Keywords: action recognition; action detection; deep learning; convolutional neural networks; dataset
Two-Stream Deep Learning Architecture-Based Human Action Recognition
9
Authors: Faheem Shehzad, Muhammad Attique Khan, Muhammad Asfand E.Yar, Muhammad Sharif, Majed Alhaisoni, Usman Tariq, Arnab Majumdar, Orawit Thinnukool. Source: Computers, Materials & Continua (SCIE, EI), 2023, Issue 3, pp. 5931-5949, 19 pages
Human action recognition (HAR) based on artificial intelligence reasoning is the most important research area in computer vision. Big breakthroughs in this field have been observed in the last few years; additionally, research interest in this field is evolving, for example toward understanding of actions and scenes, studying human joints, and human posture recognition. Many HAR techniques have been introduced in the literature. Nonetheless, the challenge of redundant and irrelevant features reduces recognition accuracy. These techniques also face a few other challenges, such as differing perspectives, environmental conditions, and temporal variations, among others. In this work, a framework based on deep learning and an improved whale optimization algorithm is proposed for HAR. The proposed framework consists of a few core stages, i.e., initial preprocessing of frames, fine-tuning of pre-trained deep learning models through transfer learning (TL), feature fusion using a modified serial-based approach, and improved whale optimization based selection of the best features for final classification. Two pre-trained deep learning models, InceptionV3 and ResNet101, are fine-tuned, and TL is employed to train them on action recognition datasets. The fusion process increases the length of the feature vectors; therefore, an improved whale optimization algorithm is proposed to select the best features. The best selected features are finally classified using machine learning (ML) classifiers. Four publicly accessible datasets, namely UT-Interaction, Hollywood, Free Viewpoint Action Recognition using Motion History Volumes (IXMAS), and centre of computer vision (UCF) Sports, are employed, achieving testing accuracies of 100%, 99.9%, 99.1%, and 100%, respectively. Compared with state-of-the-art (SOTA) techniques, the proposed method showed improved accuracy.
Keywords: human action recognition; deep learning; transfer learning; fusion of multiple features; features optimization
HRNetO: Human Action Recognition Using Unified Deep Features Optimization Framework
10
Authors: Tehseen Ahsan, Sohail Khalid, Shaheryar Najam, Muhammad Attique Khan, Ye Jin Kim, Byoungchol Chang. Source: Computers, Materials & Continua (SCIE, EI), 2023, Issue 4, pp. 1089-1105, 17 pages
Human action recognition (HAR) attempts to understand a subject's behavior and assign a label to each action performed. It is appealing because it has a wide range of applications in computer vision, such as video surveillance and smart cities. Many attempts have been made in the literature to develop an effective and robust framework for HAR. Still, the process remains difficult and may result in reduced accuracy due to several challenges, such as similarity among actions, extraction of essential features, and reduction of irrelevant features. In this work, we propose an end-to-end framework using deep learning and an improved tree seed optimization algorithm for accurate HAR. The proposed design consists of a few significant steps. In the first step, frame preprocessing is performed. In the second step, two pre-trained deep learning models are fine-tuned and trained through deep transfer learning using the preprocessed video frames. In the next step, the deep learning features of both fine-tuned models are fused using a new Parallel Standard Deviation Padding Max Value approach. The fused features are further optimized using an improved tree seed algorithm, and the selected best features are finally classified using machine learning classifiers. The experiment was carried out on five publicly available datasets, including UT-Interaction, Weizmann, KTH, Hollywood, and IXMAS, and achieved higher accuracy than previous techniques.
Keywords: action recognition; features fusion; deep learning; features selection
Advanced Guided Whale Optimization Algorithm for Feature Selection in BlazePose Action Recognition
11
Authors: Motasem S. Alsawadi, El-Sayed M. El-kenawy, Miguel Rio. Source: Intelligent Automation & Soft Computing (SCIE), 2023, Issue 9, pp. 2767-2782, 16 pages
The BlazePose, which models human body skeletons as spatiotemporal graphs, has achieved fantastic performance in skeleton-based action identification. Skeleton extraction from photos for mobile devices has been made possible by the BlazePose system. A Spatial-Temporal Graph Convolutional Network (STGCN) can then forecast the actions. The STGCN can be improved by simply replacing the skeleton input data with a different set of joints that provide more information about the activity of interest. On the other hand, existing approaches require the user to manually set the graph's topology and then fix it across all input layers and samples. This research shows how to use the Stochastic Fractal Search (SFS)-Guided Whale Optimization Algorithm (GWOA). To get the best solution for the GWOA, we adopt the SFS diffusion algorithm, which uses the random walk with a Gaussian distribution method common to growing systems. Continuous values are transformed into binary to apply to the feature-selection problem in conjunction with the BlazePose skeletal topology and stochastic fractal search to construct a novel implementation of the BlazePose topology for action recognition. In our experiments, we employed the Kinetics and the NTU-RGB+D datasets. The achieved action accuracy is 93.14% in the X-View and 96.74% in the X-Sub setting. In addition, the proposed model performs better in numerous statistical tests such as the Analysis of Variance (ANOVA), Wilcoxon signed-rank test, histogram, and times analysis.
Keywords: BlazePose; metaheuristics; convolutional networks; feature selection; action recognition
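The abstract above says continuous optimizer values are transformed into binary for feature selection but does not say how. The following is a minimal sketch of one common choice, a sigmoid transfer function with a fixed cut-off. The function names, the cut-off of 0.5, and the sample values are illustrative assumptions, not the authors' implementation.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binarize_position(position, threshold=0.5):
    """Map a continuous search-agent position to a 0/1 feature mask.

    Assumption: a sigmoid transfer function with a fixed threshold;
    the abstract does not state which transfer function is used.
    """
    return [1 if sigmoid(x) >= threshold else 0 for x in position]


def select_features(sample, mask):
    """Keep only the feature values whose mask bit is 1."""
    return [value for value, keep in zip(sample, mask) if keep]


mask = binarize_position([-2.0, 0.0, 1.5, -0.1])
print(mask)
print(select_features([10, 20, 30, 40], mask))
```

A metaheuristic would evaluate each such mask with a fitness function (for example, classifier error plus a penalty on the number of selected features) and keep the best one.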
MSF-Net: A Multilevel Spatiotemporal Feature Fusion Network Combines Attention for Action Recognition
12
Authors: Mengmeng Yan, Chuang Zhang, Jinqi Chu, Haichao Zhang, Tao Ge, Suting Chen. Source: Computer Systems Science & Engineering (SCIE, EI), 2023, Issue 11, pp. 1433-1449, 17 pages
An action recognition network that combines multi-level spatiotemporal feature fusion with an attention mechanism is proposed as a solution to the issues of single spatiotemporal feature scale extraction, information redundancy, and insufficient extraction of frequency domain information in channels in 3D convolutional neural networks. First, based on 3D CNN, this paper designs a new multilevel spatiotemporal feature fusion (MSF) structure, which is embedded in the network model and, mainly through multilevel spatiotemporal feature separation, splicing and fusion, achieves the fusion of spatial perceptual fields and short-medium-long time series information at different scales with reduced network parameters. Second, a multi-frequency channel and spatiotemporal attention module (FSAM) is introduced to assign corresponding weights to the different frequency features and spatiotemporal features in the channels, reducing the information redundancy of the feature maps. Finally, we embed the proposed method into the R3D model, which replaces the 2D convolutional filters in the 2D ResNet with 3D convolutional filters, and conduct extensive experimental validation on the small and medium-sized dataset UCF101 and the large-sized dataset Kinetics-400. The findings reveal that our model increases the recognition accuracy on both datasets. Results on the UCF101 dataset, in particular, demonstrate that our model outperforms R3D with a maximum recognition accuracy improvement of 7.2% while using 34.2% fewer parameters. The MSF and FSAM are also migrated to another traditional 3D action recognition model, C3D, for application testing. The test results on UCF101 show that the recognition accuracy is improved by 8.9%, demonstrating the strong generalization ability and universality of the proposed method.
Keywords: 3D convolutional neural network; action recognition; MSF; FSAM
Human Action Recognition Based on Supervised Class-Specific Dictionary Learning with Deep Convolutional Neural Network Features (cited by: 6)
13
Authors: Binjie Gu. Source: Computers, Materials & Continua (SCIE, EI), 2020, Issue 4, pp. 243-262, 20 pages
Human action recognition in complex environments is a challenging task. Recently, sparse representation has achieved excellent results in dealing with the human action recognition problem under different conditions. The main idea of sparse representation classification is to construct a general classification scheme where the training samples of each class can be considered as the dictionary to express the query class, and the minimal reconstruction error indicates its corresponding class. However, how to learn a discriminative dictionary is still a difficult problem. In this work, we make two contributions. First, we build a new and robust human action recognition framework by combining one modified sparse classification model and deep convolutional neural network (CNN) features. Second, we construct a novel classification model which consists of the representation-constrained term and the coefficients incoherence term. Experimental results on benchmark datasets show that our modified model can obtain competitive results in comparison to other state-of-the-art models.
Keywords: action recognition; deep CNN features; sparse model; supervised dictionary learning
Hidden Two-Stream Collaborative Learning Network for Action Recognition (cited by: 4)
14
Authors: Shuren Zhou, Le Chen, Vijayan Sugumaran. Source: Computers, Materials & Continua (SCIE, EI), 2020, Issue 6, pp. 1545-1561, 17 pages
The two-stream convolutional neural network exhibits excellent performance in video action recognition. The crux of the matter is to use the frames clipped from the videos and the optical flow images pre-extracted from the frames to train one model each, and to finally integrate the outputs of the two models. Nevertheless, the reliance on the pre-extraction of the optical flow impedes the efficiency of action recognition, and the temporal and the spatial streams are simply fused at the ends, with one stream failing and the other stream succeeding. We propose a novel hidden two-stream collaborative (HTSC) learning network that masks the steps of extracting the optical flow in the network and greatly speeds up action recognition. Based on the two-stream method, the two-stream collaborative learning model captures the interaction of the temporal and spatial features to greatly enhance the accuracy of recognition. Our proposed method is highly capable of achieving a balance of efficiency and precision on large-scale video action recognition datasets.
Keywords: action recognition; collaborative learning; optical flow
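The baseline two-stream scheme described in this abstract trains one model per stream and integrates their outputs at the end. A minimal sketch of such late fusion follows, assuming each stream emits per-class logits and that fusion is a weighted average of softmax scores. The function names, the weight, and the example logits are illustrative assumptions; this is the simple baseline, not the HTSC network itself.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(spatial_logits, temporal_logits, spatial_weight=0.5):
    """Fuse two streams by a weighted average of their class probabilities.

    Assumption: weighted averaging is used; other fusion rules
    (max, product, learned weights) are equally plausible.
    """
    if len(spatial_logits) != len(temporal_logits):
        raise ValueError("both streams must score the same set of classes")
    p_spatial = softmax(spatial_logits)
    p_temporal = softmax(temporal_logits)
    fused = [
        spatial_weight * s + (1.0 - spatial_weight) * t
        for s, t in zip(p_spatial, p_temporal)
    ]
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return fused, predicted


# The RGB stream favours class 0, the optical-flow stream favours class 1.
fused, label = late_fusion([2.0, 1.0, 0.1], [0.5, 2.5, 0.3])
print(label)
```

Because the streams only meet at this final step, neither can correct the other's intermediate features, which is the limitation the abstract attributes to simple end fusion.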
Multi-Layered Deep Learning Features Fusion for Human Action Recognition (cited by: 2)
15
Authors: Sadia Kiran, Muhammad Attique Khan, Muhammad Younus Javed, Majed Alhaisoni, Usman Tariq, Yunyoung Nam, Robertas Damaševicius, Muhammad Sharif. Source: Computers, Materials & Continua (SCIE, EI), 2021, Issue 12, pp. 4061-4075, 15 pages
Human Action Recognition (HAR) has been an active research topic in machine learning for the last few decades. Visual surveillance, robotics, and pedestrian detection are the main applications of action recognition. Computer vision researchers have introduced many HAR techniques, but they still face challenges such as redundant features and the cost of computing. In this article, we propose a new method for the use of deep learning for HAR. In the proposed method, video frames are initially pre-processed using a global contrast approach and later used to train a deep learning model using domain transfer learning. The pre-trained ResNet-50 model is used as the deep learning model in this work. Features are extracted from two layers: Global Average Pool (GAP) and Fully Connected (FC). The features of both layers are fused by Canonical Correlation Analysis (CCA). Then features are selected using a Shannon entropy-based threshold function. The selected features are finally passed to multiple classifiers for final classification. Experiments are conducted on five publicly available datasets: IXMAS, UCF Sports, YouTube, UT-Interaction, and KTH. The accuracy on these datasets was 89.6%, 99.7%, 100%, 96.7% and 96.6%, respectively. Comparison with existing techniques has shown that the proposed method provides improved accuracy for HAR. Also, the proposed method is computationally fast in terms of execution time.
Keywords: action recognition; transfer learning; features fusion; features selection; classification
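The abstract mentions selecting features with a Shannon entropy-based threshold function without giving its form. The sketch below shows one plausible reading: estimate each feature's entropy from a histogram and keep features whose entropy is at least the mean. The bin count, the mean-entropy threshold, and all names are assumptions made for illustration, not the paper's definition.

```python
import math
from collections import Counter


def shannon_entropy(values, bins=4):
    """Histogram-based Shannon entropy (in bits) of one feature column."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant feature carries no information
    width = (hi - lo) / bins
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def select_by_entropy(features, bins=4):
    """Return indices of columns whose entropy is >= the mean entropy.

    `features` is a list of samples, each a list of feature values.
    Assumption: the threshold is the mean entropy over all columns.
    """
    columns = list(zip(*features))
    scores = [shannon_entropy(col, bins) for col in columns]
    threshold = sum(scores) / len(scores)
    keep = [i for i, s in enumerate(scores) if s >= threshold]
    return keep, scores


samples = [
    [1.0, 0.0, 5.0],
    [1.0, 1.0, 5.0],
    [1.0, 2.0, 5.1],
    [1.0, 3.0, 5.0],
]
keep, scores = select_by_entropy(samples)
print(keep)
```

In this toy input the first column is constant and the third is nearly so; only the second, evenly spread column clears the mean-entropy threshold.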
Research on Action Recognition and Content Analysis in Videos Based on DNN and MLN (cited by: 2)
16
Authors: Wei Song, Jing Yu, Xiaobing Zhao, Antai Wang. Source: Computers, Materials & Continua (SCIE, EI), 2019, Issue 9, pp. 1189-1204, 16 pages
In the current era of multimedia information, it is increasingly urgent to realize intelligent video action recognition and content analysis. In the past few years, video action recognition, as an important direction in computer vision, has attracted many researchers and made much progress. First, this paper reviews the latest video action recognition methods based on Deep Neural Network and Markov Logic Network. Second, we analyze the characteristics of each method and its performance in the experimental results. Then, we compare the emphases of these methods and discuss their application scenarios. Finally, we consider the development trends and directions of this field.
Keywords: video action recognition; deep learning network; Markov logic network
Human-Object Interaction Recognition Based on Modeling Context 被引量:1
17
作者 Shuyang Li Wei Liang Qun Zhang 《Journal of Beijing Institute of Technology》 EI CAS 2017年第2期215-222,共8页
This paper proposes a method to recognize human-object interactions by modeling context between human actions and interacted objects.Human-object interaction recognition is a challenging task due to severe occlusion b... This paper proposes a method to recognize human-object interactions by modeling context between human actions and interacted objects.Human-object interaction recognition is a challenging task due to severe occlusion between human and objects during the interacting process.Since that human actions and interacted objects provide strong context information,i.e.some actions are usually related to some specific objects,the accuracy of recognition is significantly improved for both of them.Through the proposed method,both global and local temporal features from skeleton sequences are extracted to model human actions.In the meantime,kernel features are utilized to describe interacted objects.Finally,all possible solutions from actions and objects are optimized by modeling the context between them.The results of experiments demonstrate the effectiveness of our method. 展开更多
Keywords: human-object interaction; action recognition; object recognition; modeling context
Video Analytics Framework for Human Action Recognition (Cited by: 1)
18
Authors: Muhammad Attique Khan, Majed Alhaisoni, Ammar Armghan, Fayadh Alenezi, Usman Tariq, Yunyoung Nam, Tallha Akram. Computers, Materials & Continua (SCIE, EI), 2021, No. 9, pp. 3841-3859 (19 pages)
Human action recognition (HAR) is an essential but challenging task for observing human movements. The problem encompasses observing variations in human movement and identifying activities with machine learning algorithms. This article addresses the challenges in activity recognition by implementing and experimenting with an intelligent framework for segmentation, feature reduction, and feature selection. A novel approach is introduced for fusing segmented frames, and multi-level features of interest are extracted. An entropy-skewness based feature reduction technique is implemented, and the reduced features are converted into a codebook by serial-based fusion. A custom-made genetic algorithm is applied to the constructed feature codebook to select the strong and well-known features. The selected features are then used by a multi-class SVM for action identification. Comprehensive experiments are conducted on four action datasets, namely Weizmann, KTH, Muhavi, and WVU multi-view, achieving recognition rates of 96.80%, 100%, 100%, and 100%, respectively. The analysis shows that the proposed action recognition approach is efficient and more accurate than existing approaches.
Keywords: video analytics; action recognition; features classification; entropy; data analytic
Multi-Modality Video Representation for Action Recognition (Cited by: 4)
19
Authors: Chao Zhu, Yike Wang, Dongbing Pu, Miao Qi, Hui Sun, Lei Tan. Journal on Big Data, 2020, No. 3, pp. 95-104 (10 pages)
Action recognition is now widely applied in many fields. However, an action is hard to define using information from a single modality. Unlike image recognition, action recognition needs several modalities to depict one action, such as appearance, motion, and dynamic information. Because the state of an action evolves over time, motion information must be considered when representing an action. Most current methods define an action by spatial information and motion information. These methods have two key elements: spatial information, obtained by sparsely sampling the sequence of video frames, and motion content, mostly represented by optical flow calculated on consecutive video frames. However, the relevance between the two is weak in current methods. To strengthen this association, this paper presents a new architecture consisting of three streams to obtain multi-modality information. The advantages of the network are: (a) a new sampling approach that samples evenly over the video sequence to acquire appearance information; (b) the use of ResNet101 to obtain high-level, distinctive features; (c) a three-stream architecture that captures temporal, spatial, and dynamic information. Experimental results on the UCF101 dataset show that the method outperforms previous methods.
Keywords: action recognition; dynamic; appearance; spatial; motion; ResNet101; UCF101
3-Dimensional Bag of Visual Words Framework on Action Recognition (Cited by: 1)
20
Authors: Shiqi Wang, Yimin Yang, Ruizhong Wei, Qingming Jonathan Wu. Computers, Materials & Continua (SCIE, EI), 2020, No. 6, pp. 1081-1091 (11 pages)
Human motion recognition plays a crucial role in the video analysis framework. However, a given video may contain various kinds of noise, such as an unstable background and redundant actions, that are completely different from the key actions. This noise poses a great challenge to human motion recognition. To solve this problem, we propose a new method based on the 3-Dimensional (3D) Bag of Visual Words (BoVW) framework. The method has two parts. The first part is the video action feature extractor, which identifies key actions by analyzing action features. In the second part, the video action encoder, we analyze the action characteristics of a given video and use a pre-trained deep 3D CNN model to obtain expressive coding information. A classifier with subnetwork nodes is used for the final classification. Extensive experiments demonstrate that the method is effective on complex video analysis. The approach achieves state-of-the-art performance on the UCF101 (85.3%) and HMDB51 (54.5%) datasets.
Keywords: action recognition; 3D CNNs; recurrent neural networks; residual networks; subnetwork nodes