Abstract: Identifying inter-frame forgery is a hot topic in video forensics. In this paper, we propose a method based on the assumption that the correlation coefficients of gray values are consistent in an original video, while in forgeries this consistency is destroyed. We first extract the consistency of correlation coefficients of gray values (CCCoGV for short) after normalization and quantization as a distinguishing feature to identify inter-frame forgeries. We then test the CCCoGV on a large database with the help of a Support Vector Machine (SVM). Experimental results show that the proposed method is efficient in classifying original videos and forgeries. Furthermore, the proposed method also performs well in classifying frame insertion and frame deletion forgeries.
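The idea behind the abstract above can be illustrated with a minimal sketch. It assumes each frame is already a flat list of gray values, computes the Pearson correlation between adjacent frames, and then measures how much that correlation changes from one frame pair to the next. The function names are hypothetical, and the normalization, quantization, and SVM steps mentioned in the abstract are not reproduced here, so this is not the authors' CCCoGV feature.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length gray-value lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Constant frame: correlation is undefined; 0.0 is a placeholder choice.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def adjacent_correlations(frames):
    """Correlation between each frame and the frame that follows it."""
    return [pearson(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]


def correlation_jumps(correlations):
    """Absolute change between consecutive correlation values.

    Large jumps hint that the frame sequence may have been disturbed.
    """
    return [
        abs(correlations[i + 1] - correlations[i])
        for i in range(len(correlations) - 1)
    ]
```

In a toy sequence where one frame is replaced by unrelated content, the jump values around that position rise sharply while the rest stay near zero; a real classifier would need a learned decision rule rather than visual inspection.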
Funding: National Natural Science Foundation of China (Grant Number 62076246).
Abstract: Human pose estimation aims to localize the body joints from image or video data. With the development of deep learning, pose estimation has become a hot research topic in the field of computer vision. In recent years, human pose estimation has achieved great success in multiple fields such as animation and sports. However, to obtain accurate positioning results, existing methods may suffer from large model sizes, a high number of parameters, and increased complexity, leading to high computing costs. In this paper, we propose a new lightweight feature encoder to construct a high-resolution network that reduces the number of parameters and lowers the computing cost. We also introduce a semantic enhancement module that improves global feature extraction and network performance by combining channel and spatial dimensions. Furthermore, we propose a densely connected spatial pyramid pooling module to compensate for the decrease in image resolution and the information loss in the network. As a result, our method effectively reduces the number of parameters and the complexity while maintaining high performance. Extensive experiments show that our method achieves competitive performance while dramatically reducing the number of parameters and the operational complexity. Specifically, our method obtains an 89.9% AP score on MPII VAL, while the number of parameters and the complexity of operations are reduced by 41% and 36%, respectively.
Funding: This work was supported by grants from the Natural Science Foundation of Hebei Province, under Grant No. F2021202021; the S&T Program of Hebei, under Grant No. 22375001D; and the National Key R&D Program of China, under Grant No. 2019YFB1312500.
Abstract: Human pose estimation is a basic and critical task in the field of computer vision that involves determining the position (or spatial coordinates) of the joints of the human body in a given image or video. It is widely used in motion analysis, medical evaluation, and behavior monitoring. In this paper, we propose a method for multi-view human pose estimation. Two image sensors were placed orthogonally with respect to each other to capture the pose of the subject as they moved, and this yielded accurate and comprehensive three-dimensional (3D) motion reconstruction that captured their multi-directional poses. Following this, we propose a method based on 3D pose estimation to assess the similarity of the motion features of patients with motor dysfunction by comparing differences between their range of motion and that of normal subjects. We converted these differences into Fugl–Meyer assessment (FMA) scores in order to quantify them. Finally, we implemented the proposed method in the Unity framework and built a Virtual Reality platform that provides users with human–computer interaction, making the task more enjoyable for them and ensuring their active participation in the assessment process. The goal is to provide a suitable means of assessing movement disorders without requiring the immediate supervision of a physician.
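The range-of-motion comparison described in the abstract above can be sketched as follows. The example assumes 3D joint positions are already available as `(x, y, z)` tuples and computes the angle at a joint plus the spread of that angle over a movement. The function names are hypothetical, and the mapping from range-of-motion differences to FMA scores is not given in the abstract, so it is deliberately left out.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle at joint b, in degrees, formed by the 3D points a, b, and c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def range_of_motion(angles):
    """Spread between the largest and smallest joint angle in a movement."""
    return max(angles) - min(angles)


def rom_deficit(patient_angles, reference_angles):
    """Difference between a reference range of motion and a patient's."""
    return range_of_motion(reference_angles) - range_of_motion(patient_angles)
```

For an elbow, `a`, `b`, and `c` would be the shoulder, elbow, and wrist positions; a positive `rom_deficit` means the patient covered a smaller angular range than the reference movement.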
Funding: Supported by the Natural Science Foundation of Jiangsu Province (No. BK20230696).
Abstract: Electric power training is essential for ensuring the safety and reliability of the system. In this study, we introduce a novel Abnormal Action Recognition (AAR) system that utilizes a Lightweight Pose Estimation Network (LPEN) to efficiently and effectively detect abnormal fall-down and trespass incidents in electric power training scenarios. The LPEN network, comprising three stages (MobileNet, Initial Stage, and Refinement Stage), is employed to swiftly extract image features, detect human key points, and refine them for accurate analysis. Subsequently, a Pose-aware Action Analysis Module (PAAM) captures the positional coordinates of human skeletal points in each frame. Finally, an Abnormal Action Inference Module (AAIM) evaluates whether abnormal fall-down or unauthorized trespass behavior is occurring. For fall-down recognition, three criteria are considered: falling speed, main angles of skeletal points, and the person's bounding box. To identify unauthorized trespass, emphasis is placed on the position of the ankles. Extensive experiments validate the effectiveness and efficiency of the proposed system in ensuring the safety and reliability of electric power training.
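The three fall-down criteria named in the abstract above (falling speed, skeletal angles, and bounding box) can be illustrated with a rough sketch. Everything specific here is an assumption: the choice of neck and hip key points, the threshold values, and the rule that all three criteria must hold are hypothetical and are not taken from the paper. Coordinates are image pixels with the y axis pointing down.

```python
import math


def vertical_speed(hip_prev, hip_curr, dt):
    """Downward hip speed in pixels per second (image y grows downward)."""
    return (hip_curr[1] - hip_prev[1]) / dt


def torso_angle_deg(neck, hip):
    """Angle between the neck-hip line and the vertical image axis, in degrees."""
    dx = neck[0] - hip[0]
    dy = neck[1] - hip[1]
    return math.degrees(math.atan2(abs(dx), abs(dy)))


def aspect_ratio(bbox):
    """Width divided by height of an (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = bbox
    return (x_max - x_min) / (y_max - y_min)


def looks_like_fall(neck, hip_prev, hip_curr, bbox, dt,
                    speed_thresh=200.0, angle_thresh=60.0, ratio_thresh=1.0):
    """Hypothetical rule: flag a fall only when all three criteria agree."""
    return (
        vertical_speed(hip_prev, hip_curr, dt) > speed_thresh
        and torso_angle_deg(neck, hip_curr) > angle_thresh
        and aspect_ratio(bbox) > ratio_thresh
    )
```

A standing person gives a near-zero torso angle and a tall, narrow box, so the rule stays false; a person dropping quickly into a horizontal posture trips all three checks. Real thresholds would have to be tuned to camera placement and frame rate.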
Abstract: Six degrees of freedom (6DoF) input interfaces are essential for manipulating virtual objects through translation or rotation in three-dimensional (3D) space. A traditional outside-in tracking controller requires the installation of expensive hardware in advance. While inside-out tracking controllers have been proposed, they often suffer from limitations such as interaction limited to the tracking range of the sensor (e.g., a sensor on the head-mounted display (HMD)) or the need for pose value modification to function as an input interface (e.g., a sensor on the controller). This study investigates 6DoF pose estimation methods without restricting the tracking range, using a smartphone as a controller in augmented reality (AR) environments. Our approach proposes methods for estimating the initial pose of the controller and correcting the pose using inside-out tracking. In addition, seven pose estimation algorithms are presented as candidates depending on the tracking range of the device sensor, the tracking method (e.g., marker recognition, visual-inertial odometry (VIO)), and whether modification of the initial pose is necessary. The performance of the algorithms was evaluated through two experiments (discrete and continuous data). The results demonstrate enhanced final pose accuracy achieved by correcting the initial pose. Furthermore, the results emphasize the importance of selecting the tracking algorithm based on the tracking range of the devices and the actual input value of the 3D interaction.
Funding: Funding for the project, excluding research publication, from the Board of Research in Nuclear Sciences (BRNS) under Grant Number 59/14/05/2019/BRNS.
Abstract: Identifying faces in non-frontal poses presents a significant challenge for face recognition (FR) systems. In this study, we examined the impact of yaw pose variations on these systems and devised a robust method for detecting faces across a wide range of angles from 0° to ±90°. We initially selected the most suitable feature vector size by integrating the Dlib, FaceNet (Inception-v2), and "Support Vector Machines (SVM)" + "K-nearest neighbors (KNN)" algorithms. To train and evaluate this feature vector, we used two datasets: the "Labeled Faces in the Wild (LFW)" benchmark data and the "Robust Shape-Based FR System (RSBFRS)" real-time data, which contained face images with varying yaw poses. After selecting the best feature vector, we developed a real-time FR system to handle yaw poses. The proposed FaceNet architecture achieved recognition accuracies of 99.7% and 99.8% for the LFW and RSBFRS datasets, respectively, with 128 feature vector dimensions and minimum Euclidean distance thresholds of 0.06 and 0.12. The FaceNet+SVM and FaceNet+KNN classifiers achieved classification accuracies of 99.26% and 99.44%, respectively. The 128-dimensional embedding vector showed the highest recognition rate among all dimensions. These results demonstrate the effectiveness of our proposed approach in enhancing FR accuracy, particularly in real-world scenarios with varying yaw poses.
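Matching by a Euclidean distance threshold, as mentioned in the abstract above, can be sketched as a nearest-neighbor lookup over stored embeddings. This is an illustrative sketch only: the `identify` function and the gallery layout are hypothetical, the toy vectors have 3 dimensions rather than 128, and how the paper applies its thresholds is not specified in the abstract.

```python
import math


def identify(query, gallery, threshold):
    """Return (name, distance) of the closest gallery embedding.

    The name is None when even the closest embedding is farther away
    than the threshold, meaning the face is treated as unknown.
    """
    best_name, best_dist = None, float("inf")
    for name, embedding in gallery.items():
        dist = math.dist(query, embedding)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist <= threshold:
        return best_name, best_dist
    return None, best_dist
```

With a gallery such as `{"alice": [0.1, 0.2, 0.3], "bob": [0.9, 0.8, 0.7]}`, a query close to the first vector is accepted as `"alice"`, while a query roughly midway between the two is rejected as unknown. A lower threshold reduces false accepts at the cost of more rejections.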