Abstract: Regular exercise is a crucial aspect of daily life, as it enables individuals to stay physically active, lowers the likelihood of developing illnesses, and enhances life expectancy. The recognition of workout actions in video streams holds significant importance in computer vision research, as it aims to enhance exercise adherence, enable instant recognition, advance fitness tracking technologies, and optimize fitness routines. However, existing action datasets often lack diversity and specificity for workout actions, hindering the development of accurate recognition models. To address this gap, the Workout Action Video dataset (WAVd) has been introduced. WAVd comprises a diverse collection of labeled workout action videos, curated to encompass various exercises performed by numerous individuals in different settings. This research proposes a framework based on the Attention driven Residual Deep Convolutional-Gated Recurrent Unit (ResDC-GRU) network for workout action recognition in video streams. Unlike still images, videos contain spatio-temporal information, which makes video-based action recognition more complex and challenging than image-based recognition. While substantial progress has been made in this area, challenges persist in detecting subtle and complex actions, handling occlusions, and managing the computational demands of deep learning approaches. The proposed ResDC-GRU Attention model achieved 95.81% accuracy in classifying workout action videos and outperformed various state-of-the-art models. The method also yielded 81.6%, 97.2%, 95.6%, and 93.2% accuracy on the established benchmark datasets HMDB51, YouTube Actions, UCF50, and UCF101, respectively, indicating its robustness in action recognition. The findings suggest practical implications in real-world scenarios where precise video action recognition is paramount. The WAVd dataset serves as a catalyst for the development of more robust and effective fitness tracking systems and ultimately promotes healthier lifestyles through improved exercise monitoring and analysis.
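The abstract names attention over recurrent features but gives no layer details, so the following is only a minimal sketch of one common form of temporal attention: per-frame feature vectors are scored against a query vector, the scores are normalized with a softmax, and the frames are averaged with those weights. The function names, the `query` vector, and the toy features are all hypothetical and are not taken from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(frame_features, query):
    """Collapse per-frame feature vectors into one clip-level vector.

    frame_features: list of T vectors (one per frame), each of length D.
    query: hypothetical learned vector of length D used to score each frame.
    Returns (pooled_vector, attention_weights).
    """
    # Dot-product score of every frame against the query.
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frame_features]
    weights = softmax(scores)
    dim = len(query)
    # Weighted average of the frame vectors, one output value per dimension.
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frame_features))
              for d in range(dim)]
    return pooled, weights

# Toy clip with three frames and two feature dimensions (illustrative values).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(frames, query=[2.0, 0.0])
```

In this toy clip, the first and third frames align with the query and receive equal, larger weights than the second frame. In a full model the pooled vector would typically feed a classification layer.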
Funding: Supported by the Natural Science Foundation of Jiangsu Province (No. BK20230696).
Abstract: Electric power training is essential for ensuring the safety and reliability of the system. In this study, we introduce a novel Abnormal Action Recognition (AAR) system that uses a Lightweight Pose Estimation Network (LPEN) to efficiently detect abnormal fall-down and trespass incidents in electric power training scenarios. The LPEN comprises three stages (MobileNet, Initial Stage, and Refinement Stage) and is employed to swiftly extract image features, detect human key points, and refine them for accurate analysis. Subsequently, a Pose-aware Action Analysis Module (PAAM) captures the positional coordinates of human skeletal points in each frame. Finally, an Abnormal Action Inference Module (AAIM) evaluates whether abnormal fall-down or unauthorized trespass behavior is occurring. For fall-down recognition, three criteria are considered: falling speed, main angles of skeletal points, and the person’s bounding box. To identify unauthorized trespass, emphasis is placed on the position of the ankles. Extensive experiments validate the effectiveness and efficiency of the proposed system in ensuring the safety and reliability of electric power training.
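The three fall-down criteria can be illustrated with a small rule-based check. This is a sketch under stated assumptions, not the paper's AAIM: the `PoseFrame` layout, the use of the hip for speed, the torso angle as the "main angle", the AND combination, and all threshold values are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class PoseFrame:
    t: float      # timestamp in seconds
    neck: tuple   # (x, y) in pixels; y grows downward in image coordinates
    hip: tuple    # (x, y) of the mid-hip point
    box: tuple    # person bounding box (x_min, y_min, x_max, y_max)

def torso_angle_deg(frame):
    """Angle between the hip-to-neck line and the vertical, in degrees (0 = upright)."""
    dx = frame.neck[0] - frame.hip[0]
    dy = frame.hip[1] - frame.neck[1]  # positive when the neck is above the hip
    return math.degrees(math.atan2(abs(dx), dy))

def is_fall(prev, curr, speed_thresh=100.0, angle_thresh=60.0, aspect_thresh=1.0):
    """Flag a fall when all three (assumed) criteria hold between two frames."""
    dt = curr.t - prev.t
    if dt <= 0:
        return False
    # Criterion 1: downward speed of the hip, in pixels per second.
    speed = (curr.hip[1] - prev.hip[1]) / dt
    # Criterion 2: torso tilted far from vertical.
    tilted = torso_angle_deg(curr) > angle_thresh
    # Criterion 3: bounding box wider than it is tall.
    width = curr.box[2] - curr.box[0]
    height = curr.box[3] - curr.box[1]
    wide = height > 0 and (width / height) > aspect_thresh
    return speed > speed_thresh and tilted and wide

standing = PoseFrame(0.0, (100, 50), (100, 150), (80, 40, 120, 260))
fallen = PoseFrame(0.5, (40, 238), (100, 240), (30, 220, 200, 260))
still_standing = PoseFrame(0.5, (100, 50), (100, 150), (80, 40, 120, 260))
```

Requiring all three conditions at once is one possible design that trades sensitivity for fewer false alarms; a deployed system would need thresholds calibrated to camera geometry and frame rate.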
Abstract: We are writing in response to the article titled “Addressing the needs and rights of sex workers for HIV healthcare services in the Philippines” [1]. The article calls attention to the significant challenges faced by sex workers in the Philippines in accessing HIV healthcare. We appreciate the article’s effort to examine these issues in depth. In this letter, we present our thoughts on the article, highlighting the positive aspects, potential obstacles, and additional points that contribute to this ongoing discussion.
Abstract: Alzheimer’s disease (AD) is a neurodegenerative disorder characterized by cognitive impairments in the initial stage, which lead to severe cognitive dysfunction in the later stage. Action observation therapy (AOT) is a multisensory cognitive rehabilitation technique in which the patient first observes actions and then tries to perform them. The study aimed to examine the impact of AOT, along with usual physiotherapy interventions, on reducing depression and improving cognition and balance in a patient with AD. A 67-year-old patient with AD was selected for this study because the patient had been suffering from depression, dementia, and physical dysfunction, along with other health conditions such as diabetes and hypertension. Before the intervention, a baseline assessment was done using the Beck Depression Inventory (BDI), the Mini-Cog Scale, and the Berg Balance Scale (BBS). The patient received 12 sessions of AOT along with usual physiotherapy interventions, three times a week for four weeks, with each session lasting 45 minutes. After four weeks of intervention, the patient demonstrated improvement in depression, cognition, and balance: the BDI score declined from 21/63 (moderate depression) to 15/63 (mild depression), the Mini-Cog score improved from 2/5 to 4/5, and the BBS score increased from 18/56 to 37/56. It is concluded that AOT along with usual physiotherapy intervention helps to reduce depression and improve cognition and balance in people with AD.
Funding: Supported by the Philosophy and Social Sciences Planning Project of Guangdong Province of China (GD23XGL099), the Guangdong General Universities Young Innovative Talents Project (2023KQNCX247), and the Research Project of Shanwei Institute of Technology (SWKT22-019).
Abstract: Laboratory safety is a critical area of broad societal concern, particularly in the detection of abnormal actions. To enhance the efficiency and accuracy of detecting such actions, this paper introduces a novel method called TubeRAPT (Tubelet Transformer based on Adapter and Prefix Training Module). The method comprises three key components: the TubeR network, an adaptive clustering attention mechanism, and a prefix training module. These components work together to address the challenge of preserving knowledge in models pretrained on large datasets while maintaining training efficiency. The TubeR network serves as the backbone for spatio-temporal feature extraction, while the adaptive clustering attention mechanism refines the focus on relevant information. The prefix training module facilitates efficient fine-tuning and knowledge transfer. Experimental results demonstrate the effectiveness of TubeRAPT, which achieves a 68.44% mean Average Precision (mAP) on the small-scale CLA (Crazy Lab Activity) dataset, an improvement of 1.53% over the previous TubeR method. This research showcases the potential applications of TubeRAPT in abnormal action detection and offers ideas and technical support for the future development of laboratory safety monitoring technologies. The proposed method has implications for improving safety management systems in various laboratory environments, potentially reducing accidents and enhancing overall workplace safety.
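For readers unfamiliar with the reported metric, the sketch below computes a non-interpolated average precision per class and averages it across classes. It assumes detections have already been sorted by confidence and matched to ground truth (for action tubelets this usually means an overlap threshold, which is not specified in the abstract); benchmark toolkits often use interpolated variants, so the numbers here are illustrative only.

```python
def average_precision(ranked_hits, num_positives):
    """Non-interpolated AP for one class.

    ranked_hits: list of booleans, sorted by descending detection confidence,
                 True where the detection matches a ground-truth instance.
    num_positives: total number of ground-truth instances for the class.
    """
    if num_positives == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, hit in enumerate(ranked_hits, start=1):
        if hit:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / num_positives

def mean_average_precision(per_class):
    """per_class: list of (ranked_hits, num_positives) pairs, one per class."""
    aps = [average_precision(hits, n) for hits, n in per_class]
    return sum(aps) / len(aps)

# Two hypothetical classes: one detected perfectly, one with a false positive first.
example = [([True], 1), ([False, True], 1)]
example_map = mean_average_precision(example)
```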
Funding: Supported in part by the 2023 Key Supported Project of the 14th Five Year Plan for Education and Science in Hunan Province (No. ND230795).
Abstract: In recent years, skeleton-based action recognition has achieved strong results in computer vision. A graph convolutional network (GCN) is effective for action recognition, modelling the human skeleton as a spatio-temporal graph. Most GCNs define the graph topology by the physical relations of the human joints. However, this predefined graph ignores the spatial relationship between non-adjacent joint pairs in special actions and the behavior dependence between joint pairs, resulting in a low recognition rate for specific actions with implicit correlation between joint pairs. In addition, existing methods ignore the trend correlation between adjacent frames within an action and context clues, leading to erroneous recognition of actions with similar poses. Therefore, this study proposes a learnable GCN based on behavior dependence, which considers implicit joint correlation by constructing a dynamic learnable graph that extracts the specific behavior dependence of joint pairs. An adaptive model is constructed using the weight relationship between the joint pairs. The study also designs a self-attention module to obtain the inter-frame topological relationship of the joints for exploring the context of actions. Combining the shared topology and the multi-head self-attention map, the module obtains the context-based clue topology to update the dynamic graph convolution, achieving accurate recognition of different actions with similar poses. Detailed experiments on public datasets demonstrate that the proposed method achieves better results and higher-quality representation of actions under various evaluation protocols compared to state-of-the-art methods.
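The idea of supplementing a fixed skeleton graph with learned links between non-adjacent joints can be sketched with a single graph-convolution step. This is a generic illustration, not the paper's layer: the additive combination of the two adjacency matrices, the row normalization, and the three-joint toy skeleton are assumptions, and the learned entries are taken to be non-negative.

```python
def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def row_normalize(adj):
    """Scale each row to sum to 1 so aggregation is an average over neighbors."""
    out = []
    for row in adj:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out

def adaptive_graph_conv(x, physical_adj, learned_adj, weight):
    """One graph-convolution step with a fixed plus a learnable adjacency.

    x: J x C_in joint features; physical_adj, learned_adj: J x J;
    weight: C_in x C_out projection.
    """
    j = len(x)
    combined = [[physical_adj[r][c] + learned_adj[r][c] for c in range(j)]
                for r in range(j)]
    return matmul(matmul(row_normalize(combined), x), weight)

# Toy chain of three joints (0-1-2) with self-loops; joints 0 and 2 are not adjacent.
physical = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
no_extra = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
learned = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]   # hypothetical learned link 0 <-> 2
features = [[0.0], [0.0], [3.0]]
identity = [[1.0]]

out_fixed = adaptive_graph_conv(features, physical, no_extra, identity)
out_adaptive = adaptive_graph_conv(features, physical, learned, identity)
```

With the fixed graph alone, joint 0 receives nothing from joint 2; with the extra learned link, joint 2's feature reaches joint 0 in a single step.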
Funding: Supported by the National Natural Science Foundation of China (the Key Project, 52131201; Science Fund for Creative Research Groups, 52221005), the China Scholarship Council, and the Joint Laboratory for Internet of Vehicles, Ministry of Education–China MOBILE Communications Corporation.
Abstract: This study presents a general optimal trajectory planning (GOTP) framework for autonomous vehicles (AVs) that can effectively avoid obstacles and guide AVs to complete driving tasks safely and efficiently. First, we employ a fifth-order Bezier curve to generate and smooth the reference path along the road centerline. Cartesian coordinates are then transformed to achieve curvature continuity of the generated curve. Considering the road constraints and vehicle dynamics, a limited set of polynomial candidate trajectories is generated and smoothed in a curvilinear coordinate system. Furthermore, to select the optimal trajectory, we develop a unified, auto-tuned objective function based on the principle of least action by employing AVs to simulate drivers’ behavior and summarizing their manipulation characteristics of “seeking benefits and avoiding losses.” Finally, by integrating the idea of receding-horizon optimization, the proposed framework considers dynamic multi-performance objectives and selects trajectories that satisfy feasibility, optimality, and adaptability. Extensive simulations and experiments are performed, and the results demonstrate the framework’s feasibility and effectiveness: it avoids both dynamic and static obstacles and applies to various scenarios with multi-source interactive traffic participants. Moreover, we show that, compared to drivers’ manipulation, the proposed method can meet real-time planning and safety requirements.
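A fifth-order Bezier curve is defined by six control points and the degree-5 Bernstein polynomials. The sketch below evaluates such a curve in 2D; the control points are made-up values and the sampling helper is hypothetical, so this shows only the curve primitive, not the paper's reference-path pipeline.

```python
from math import comb

def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1].

    control_points: list of (x, y) tuples; six points give a fifth-order curve.
    """
    n = len(control_points) - 1
    x = y = 0.0
    for i, (px, py) in enumerate(control_points):
        # Bernstein basis polynomial B_{i,n}(t).
        basis = comb(n, i) * (1 - t) ** (n - i) * t ** i
        x += basis * px
        y += basis * py
    return x, y

def sample_path(control_points, num=11):
    """Return num evenly spaced (in t) points along the curve."""
    return [bezier_point(control_points, k / (num - 1)) for k in range(num)]

# Hypothetical control points for a gentle lane-centerline bend.
centerline = [(0.0, 0.0), (10.0, 0.0), (20.0, 1.0),
              (30.0, 3.0), (40.0, 4.0), (50.0, 4.0)]
path = sample_path(centerline)
```

The curve starts at the first control point and ends at the last, which makes it convenient for connecting a known start pose to a target pose.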
Funding: National Natural Science Foundation of China, Grant No. 62072255.
Abstract: Recognition of human gesture actions is a challenging issue due to the complex patterns in both visual and skeletal features. Existing gesture action recognition (GAR) methods typically analyze visual and skeletal data, failing to meet the demands of various scenarios. Furthermore, multi-modal approaches lack the versatility to efficiently process both uniform and disparate input patterns. Thus, in this paper, an attention-enhanced pseudo-3D residual model, called HgaNets, is proposed to address the GAR problem. The model comprises two independent components designed for modeling visual RGB (red, green and blue) images and 3D skeletal heatmaps, respectively. More specifically, each component consists of two main parts: 1) a multi-dimensional attention module for capturing important spatial, temporal and feature information in human gestures; 2) a spatiotemporal convolution module that uses pseudo-3D residual convolution to characterize spatiotemporal features of gestures. Then, the output weights of the two components are fused to generate the recognition results. Finally, we conducted experiments on four datasets to assess the efficiency of the proposed model. The results show that the accuracy on the four datasets reaches 85.40%, 91.91%, 94.70%, and 95.30%, respectively, with an inference time of 0.54 s and 2.74M parameters. These findings highlight that the proposed model outperforms other existing approaches in terms of recognition accuracy.
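Pseudo-3D convolution generally replaces a full 3×3×3 kernel with a 1×3×3 spatial kernel and a 3×1×1 temporal kernel. The sketch below only counts weights to show why that factorization is cheaper; the serial arrangement and the channel width of 64 are assumptions, and the abstract does not state how HgaNets arranges its blocks.

```python
def conv3d_params(c_in, c_out, kt, kh, kw, bias=False):
    """Weight count of a 3D convolution with kernel size kt x kh x kw."""
    return c_in * c_out * kt * kh * kw + (c_out if bias else 0)

def full_3d_params(c_in, c_out):
    """One full 3x3x3 spatio-temporal convolution."""
    return conv3d_params(c_in, c_out, 3, 3, 3)

def pseudo_3d_params(c_in, c_out):
    """Spatial 1x3x3 convolution followed by temporal 3x1x1 convolution."""
    spatial = conv3d_params(c_in, c_out, 1, 3, 3)
    temporal = conv3d_params(c_out, c_out, 3, 1, 1)
    return spatial + temporal

full = full_3d_params(64, 64)
factored = pseudo_3d_params(64, 64)
ratio = factored / full
```

For equal input and output widths, the factored form needs 12 weights per channel pair instead of 27, a reduction to about 44% of the full kernel.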
Funding: Funded by the National Natural Science Foundation of China (No. 42372331), the Henan Excellent Youth Science Fund Project (No. 242300421145), and the Colleges and Universities Youth and Innovation Science and Technology Support Plan of Shandong Province (No. 2021KJ024).
Abstract: Water decoupling charge blasting excels in rock breaking owing to its uniform pressure transmission and low energy dissipation. The water decoupling coefficient can adjust the contributions of the stress wave and quasi-static pressure. However, the quantitative relationship between the two contributions is unclear, which makes it difficult to provide reasonable theoretical support for the design of water decoupling blasting. In this study, a theoretical model of blasting fracturing partitioning is established. The mechanical mechanism and a method for determining the optimal decoupling coefficient are obtained. The reliability is verified through model experiments and a field test. The results show that as the decoupling coefficient increases, the rock-breaking ability of the blasting dynamic action decreases, while that of the quasi-static action first increases and then decreases. The ability of the quasi-static action to wedge into cracks changes due to the spatial adjustment of the blast hole and crushed zone. The quasi-static action plays a leading role in the fracturing range and thus determines the optimal decoupling coefficient. The optimal water decoupling coefficient is not a fixed value; it can be obtained from the proposed theoretical model. Compared with the theoretical results, the maximum error in the model experiment results is 8.03%, and the error in the field test result is 3.04%.