Funding: Funded by the Youth Project of the National Natural Science Foundation of China (52002031), the General Project of the Shaanxi Province Science and Technology Development Planned Project (2023-JC-YB-600), the Postgraduate Education and Teaching Research University-Level Project of Central University Project (300103131033), and the Transportation Research Project of the Shaanxi Transport Department (23-108 K).
Abstract: Data Matrix (DM) codes are widely used in industrial production. Reading a DM code usually involves two steps, positioning and decoding, and accurate positioning is a prerequisite for successful decoding. Traditional image processing methods adapt poorly to contaminated codes and complex backgrounds. Deep learning-based methods can extract features automatically, but their bounding boxes do not fit the contour of the code exactly, so further image processing is required for precise positioning, which reduces efficiency. To address these problems, a CenterNet-based DM code key point detection network is proposed that directly outputs the four key points of the DM code. Compared with existing methods, the detected region fits the code more closely, which is conducive to direct decoding. To further improve positioning accuracy, an enhanced loss function is designed that combines a DM code key point heatmap loss, a standard DM code projection loss, and a polygon Intersection-over-Union (IoU) loss, which helps the network learn the spatial geometric characteristics of the DM code. Experiments are carried out on a self-made DM code key point detection dataset that includes contamination, complex backgrounds, small objects, and other conditions, with Average Precision (AP), a common object detection metric, as the evaluation metric. On the test set of the proposed dataset, AP reaches 95.80% and the speed reaches 88.12 Frames Per Second (FPS), which is sufficient for real-time performance in practical applications.
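The polygon IoU term mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the authors' formulation, so the code below is only an assumption: it treats both four-point predictions as convex quadrilaterals, clips one against the other with the Sutherland-Hodgman algorithm, and defines the loss as `1 - IoU`. The function names (`polygon_iou`, `polygon_iou_loss`) are hypothetical, and this plain-Python version is not differentiable, so it shows the geometry rather than a trainable loss.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _signed_area(poly: List[Point]) -> float:
    """Shoelace formula; positive for counter-clockwise vertex order."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def _ccw(poly: List[Point]) -> List[Point]:
    """Return the polygon in counter-clockwise order."""
    return list(poly) if _signed_area(poly) >= 0 else list(reversed(poly))


def _clip(subject: List[Point], clip: List[Point]) -> List[Point]:
    """Sutherland-Hodgman clipping of `subject` by a convex CCW `clip` polygon."""
    output = list(subject)
    for i in range(len(clip)):
        if not output:
            break
        a, b = clip[i], clip[(i + 1) % len(clip)]

        def side(p: Point) -> float:
            # >= 0 means p lies on or to the left of the directed edge a->b.
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        inp, output = output, []
        for j in range(len(inp)):
            cur, prev = inp[j], inp[j - 1]
            sc, sp = side(cur), side(prev)
            if (sc >= 0) != (sp >= 0):
                # The segment prev->cur crosses the clip edge: add the crossing.
                t = sp / (sp - sc)
                output.append((prev[0] + t * (cur[0] - prev[0]),
                               prev[1] + t * (cur[1] - prev[1])))
            if sc >= 0:
                output.append(cur)
    return output


def polygon_iou(p: List[Point], q: List[Point]) -> float:
    """IoU of two convex polygons given as vertex lists."""
    p, q = _ccw(p), _ccw(q)
    inter_poly = _clip(p, q)
    inter = abs(_signed_area(inter_poly)) if len(inter_poly) >= 3 else 0.0
    union = abs(_signed_area(p)) + abs(_signed_area(q)) - inter
    return inter / union if union > 0 else 0.0


def polygon_iou_loss(pred: List[Point], target: List[Point]) -> float:
    """Assumed loss form: 1 - IoU."""
    return 1.0 - polygon_iou(pred, target)
```

For two 2x2 squares offset by one unit along x, the overlap is 2 and the union is 6, so the IoU is 1/3 and the assumed loss is 2/3.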
Funding: Supported by the Medical Special Cultivation Project of Anhui University of Science and Technology (Grant No. YZ2023H2B013), the Anhui Provincial Key Research and Development Project (Grant No. 2022i01020015), and the Open Project of the Key Laboratory of Conveyance Equipment (East China Jiaotong University), Ministry of Education (KLCE2022-01).
Abstract: Human pose is estimated using a transformer-based, multi-branch, multidimensional directed three-dimensional (3D) method that takes into account self-occlusion, ill-posedness, and the lack of depth data in per-frame 3D pose estimation, which maps two-dimensional (2D) poses to 3D poses. First, by examining the relationship between the movements of different bones in the human body, four virtual skeletons are proposed to enhance the cyclic constraints of limb joints. Then, multiple parameters describing the skeleton are fused and projected into a high-dimensional space. A multi-branch network extracts motion features between bones as well as overall motion features to mitigate drift error in the estimation results. Furthermore, the estimated relative depth is projected into 3D space, and the error against real 3D data is calculated, forming a loss function together with the relative depth error. The average joint pixel error is adopted as the primary performance metric. Compared with the benchmark approach, the results show an average accuracy improvement of 1.8 mm on the Human3.6M dataset.
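The loss described above (a 3D error obtained after projecting the estimated relative depth into 3D space, combined with the relative depth error) can be sketched as follows. The abstract gives no formulas, so everything here is an assumption made for illustration: a pinhole camera with normalized image coordinates, a known root-joint depth `root_depth`, a Euclidean per-joint error, an L1 depth term, and a hypothetical weighting parameter `weight`.

```python
import math
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def back_project(joints_2d: Sequence[Point2D],
                 rel_depth: Sequence[float],
                 root_depth: float) -> List[Point3D]:
    """Lift 2D joints to 3D with an assumed pinhole model.

    Each joint's absolute depth is root_depth + its relative depth, and
    normalized image coordinates (u, v) map to (u * z, v * z, z).
    """
    points = []
    for (u, v), d in zip(joints_2d, rel_depth):
        z = root_depth + d
        points.append((u * z, v * z, z))
    return points


def mean_joint_error(pred: Sequence[Point3D], gt: Sequence[Point3D]) -> float:
    """Mean Euclidean distance between corresponding joints."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def combined_loss(joints_2d: Sequence[Point2D],
                  rel_depth_pred: Sequence[float],
                  rel_depth_gt: Sequence[float],
                  joints_3d_gt: Sequence[Point3D],
                  root_depth: float,
                  weight: float = 1.0) -> float:
    """3D position error plus a weighted L1 relative-depth error (assumed form)."""
    pred_3d = back_project(joints_2d, rel_depth_pred, root_depth)
    position_term = mean_joint_error(pred_3d, joints_3d_gt)
    depth_term = sum(abs(a - b) for a, b in zip(rel_depth_pred, rel_depth_gt))
    depth_term /= len(rel_depth_pred)
    return position_term + weight * depth_term
```

Supervising both terms penalizes a depth error twice: once directly, and once through the lateral displacement it causes after back-projection.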
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 72101026, 61621063) and the State Key Laboratory of Intelligent Control and Decision of Complex Systems.
Abstract: Human pose recognition and estimation in video is pervasive. However, process noise and local occlusion pose great challenges to pose recognition. In this paper, we introduce the Kalman filter into pose recognition to reduce noise and address the local occlusion problem. The core of pose recognition in video is the fast detection of key points and the calculation of human steering angles. Thus, we first build a human key point detection model. Frame skipping is performed based on the Hamming distance between the hash values of every two adjacent frames in the video. Noise reduction is performed on the key point coordinates with the Kalman filter. To calculate the human steering angle, the current state of the key points is predicted from the optimal estimate of the key points at the previous time step. The human steering angle can then be calculated from the current and previous state information. Improved SENet, NLNet, and GCNet modules are integrated into the key point detection model to improve accuracy. Tests are also given to illustrate the effectiveness of the proposed algorithm.
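The two preprocessing ideas in this abstract, hash-based frame skipping and Kalman filtering of key point coordinates, can be sketched in a few lines. The abstract specifies neither the hash function nor the motion model, so the choices below are assumptions: an average hash over an already downscaled grayscale frame, a hypothetical `threshold` on the Hamming distance between adjacent frames, and a scalar random-walk Kalman filter applied to one coordinate at a time.

```python
from typing import List, Sequence


def average_hash(gray: Sequence[Sequence[int]]) -> int:
    """Average hash: one bit per pixel, set when the pixel exceeds the mean."""
    flat = [p for row in gray for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def select_frames(frames: Sequence[Sequence[Sequence[int]]],
                  threshold: int) -> List[int]:
    """Keep frame 0, then every frame whose hash differs from the
    immediately preceding frame's hash by more than `threshold` bits."""
    if not frames:
        return []
    kept = [0]
    prev_hash = average_hash(frames[0])
    for i in range(1, len(frames)):
        cur_hash = average_hash(frames[i])
        if hamming(cur_hash, prev_hash) > threshold:
            kept.append(i)
        prev_hash = cur_hash
    return kept


class ScalarKalman:
    """Random-walk Kalman filter for a single key point coordinate."""

    def __init__(self, x0: float, p0: float = 1.0,
                 q: float = 1e-2, r: float = 1.0) -> None:
        self.x = x0  # state estimate
        self.p = p0  # estimate variance
        self.q = q   # process noise variance
        self.r = r   # measurement noise variance

    def predict(self) -> float:
        """Predict the current state from the previous optimal estimate."""
        self.p += self.q
        return self.x

    def update(self, z: float) -> float:
        """Correct the prediction with a new measurement z."""
        gain = self.p / (self.p + self.r)
        self.x += gain * (z - self.x)
        self.p *= 1.0 - gain
        return self.x
```

In use, one filter would be kept per coordinate of each key point, with `predict()` followed by `update(z)` on every retained frame. When a key point is occluded, the predicted value can stand in for the missing measurement.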
Funding: This research was supported by a grant (2021R1F1A1063634) of the Basic Science Research Program through the National Research Foundation (NRF) funded by the Ministry of Education, Republic of Korea.
Abstract: Human object interaction (HOI) recognition plays an important role in the design of surveillance and monitoring systems for healthcare, sports, education, and public areas. It involves localizing the human and object targets and then identifying the interactions between them. However, it is a challenging task that depends heavily on extracting robust and distinctive features from the targets and on using fast and efficient classifiers. Hence, the proposed system offers an automated body-parts-based solution for HOI recognition. The system takes RGB (red, green, blue) images as input and segments the desired parts of the images with a segmentation technique based on the watershed algorithm. Furthermore, a convex hull-based approach for extracting key body parts is introduced. After the key body parts are identified, two types of features are extracted. The entire feature vector is then reduced using a dimensionality reduction technique called t-SNE (t-distributed stochastic neighbor embedding). Finally, a multinomial logistic regression classifier is used to identify class labels. A large publicly available dataset, MPII (Max Planck Institute Informatics) Human Pose, is used for system evaluation. The proposed system achieved 87.5% class recognition accuracy, which supports its validity.
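The convex hull step can be illustrated with a minimal sketch. The abstract does not say how the hull is computed or how its vertices are mapped to body parts, so the code below only shows a standard hull computation (Andrew's monotone chain) on silhouette points. Treating the hull vertices as candidate extremities such as head, hands, and feet is an assumption made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise
    order, starting from the lexicographically smallest point."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> int:
        # Positive when o->a->b makes a counter-clockwise turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]
```

Applied to the boundary pixels of a segmented silhouette, the returned vertices are the outermost points of the shape. Interior points, such as those on the torso, are discarded.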