Human pose estimation aims to localize body joints from image or video data. With the development of deep learning, pose estimation has become a hot research topic in computer vision. In recent years, human pose estimation has achieved great success in fields such as animation and sports. However, to obtain accurate positioning results, existing methods may suffer from large model sizes, a high number of parameters, and increased complexity, leading to high computing costs. In this paper, we propose a new lightweight feature encoder to construct a high-resolution network that reduces the number of parameters and lowers the computing cost. We also introduce a semantic enhancement module that improves global feature extraction and network performance by combining channel and spatial dimensions. Furthermore, we propose a densely connected spatial pyramid pooling module to compensate for the decrease in image resolution and the information loss in the network. Finally, our method effectively reduces the number of parameters and the complexity while ensuring high performance. Extensive experiments show that our method achieves competitive performance while dramatically reducing the number of parameters and the operational complexity. Specifically, our method obtains an 89.9% AP score on MPII val, while the number of parameters and the complexity of operations are reduced by 41% and 36%, respectively.
Human pose estimation is a critical research area in computer vision, playing a significant role in applications such as human-computer interaction, behavior analysis, and action recognition. In this paper, we propose a U-shaped keypoint detection network (DAUNet) based on an improved ResNet subsampling structure and a spatial grouping mechanism. This network addresses key challenges in traditional methods, such as information loss, large network redundancy, and insufficient sensitivity to low-resolution features. DAUNet is composed of three main components. First, we introduce an improved BottleNeck block that employs partial convolution and strip pooling to reduce computational load and mitigate feature loss. Second, after upsampling, the network eliminates redundant features, improving overall efficiency. Finally, a lightweight spatial grouping attention mechanism is applied to enhance low-resolution semantic features within the feature map, allowing for better restoration of the original image size and higher accuracy. Experimental results demonstrate that DAUNet achieves superior accuracy compared to most existing keypoint detection models, with a mean PCKh@0.5 score of 91.6% on the MPII dataset and an AP of 76.1% on the COCO dataset. Moreover, real-world experiments further validate the robustness and generalizability of DAUNet for detecting human bodies in unknown environments, highlighting its potential for broader applications.
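The PCKh@0.5 figure quoted above counts a predicted keypoint as correct when it lies within half a head-segment length of the ground truth. The following is a minimal sketch of that idea, not the DAUNet evaluation code: the function name `pckh` and its input layout are hypothetical, and the official MPII protocol additionally handles joint visibility and a specific head-size definition, both ignored here.

```python
import math


def pckh(pred, gt, head_sizes, alpha=0.5):
    """Fraction of keypoints within alpha * head size of the ground truth.

    pred, gt: one list of (x, y) keypoints per person (hypothetical layout).
    head_sizes: one head-segment length per person, in pixels.
    """
    correct = 0
    total = 0
    for p_kpts, g_kpts, head in zip(pred, gt, head_sizes):
        for (px, py), (gx, gy) in zip(p_kpts, g_kpts):
            dist = math.hypot(px - gx, py - gy)
            # A keypoint is a hit when its error is at most alpha * head size.
            if dist <= alpha * head:
                correct += 1
            total += 1
    return correct / total if total else 0.0
```

Normalizing by head size rather than by a fixed pixel threshold keeps the score comparable across people who appear at different scales in the image.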
Human pose estimation is a basic and critical task in computer vision that involves determining the position (or spatial coordinates) of the joints of the human body in a given image or video. It is widely used in motion analysis, medical evaluation, and behavior monitoring. In this paper, we propose a method for multi-view human pose estimation. Two image sensors were placed orthogonally to each other to capture the pose of the subject as they moved, which yielded accurate and comprehensive three-dimensional (3D) motion reconstruction that captured their multi-directional poses. Following this, we propose a method based on 3D pose estimation to assess the similarity of the motion features of patients with motor dysfunction by comparing differences between their range of motion and that of normal subjects. We converted these differences into Fugl–Meyer assessment (FMA) scores in order to quantify them. Finally, we implemented the proposed method in the Unity framework and built a virtual reality platform that provides users with human–computer interaction, making the task more enjoyable and ensuring their active participation in the assessment process. The goal is to provide a suitable means of assessing movement disorders without requiring the immediate supervision of a physician.
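The range-of-motion comparison described above can be illustrated with a small sketch. It assumes each joint angle is measured at the middle of three 3D keypoints and that range of motion is the spread of that angle over a recording; the function names are hypothetical, and the conversion of differences into FMA scores is not specified in the abstract, so it is not attempted here.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle at joint b, in degrees, formed by the 3D points a-b-c."""
    u = [ai - bi for ai, bi in zip(a, b)]
    v = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm = math.sqrt(sum(ui * ui for ui in u)) * math.sqrt(sum(vi * vi for vi in v))
    if norm == 0.0:
        raise ValueError("joint angle is undefined for coincident points")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def range_of_motion(angles):
    """Spread between the largest and smallest angle seen in a recording."""
    return max(angles) - min(angles)


def rom_deficit(patient_angles, reference_angles):
    """How much smaller the patient's range of motion is than the reference."""
    return range_of_motion(reference_angles) - range_of_motion(patient_angles)
```

For example, a patient whose elbow angle varies between 10 and 60 degrees, compared with a reference range of 0 to 120 degrees, has a deficit of 70 degrees.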
BACKGROUND: It is not possible to reconstruct the inner structure of the spinal cord, such as gray matter and spinal tracts, from the Visible Human Project database or CT and MRI databases, due to low image resolution and contrast in macrosection images. OBJECTIVE: To explore a semi-automatic computerized three-dimensional (3D) reconstruction of the human spinal cord based on histological serial sections, in order to solve issues such as low contrast. DESIGN, TIME AND SETTING: An experimental study combining serial section techniques and 3D reconstruction, performed in the laboratory of Human Anatomy and Histoembryology at the Medical School of Nantong University from January to April 2008. SETTING: Department of Anatomy, Institute of Neurobiology, Jiangsu Province Key Laboratory of Neural Regeneration, Laboratory of Image Engineering. MATERIALS: A human lumbar spinal cord segment from fresh autopsy material of an adult male. METHODS: After 4% paraformaldehyde fixation for three days, serial sections of the lumbar spinal cord were cut on a Leica cryostat and mounted on slides in sequence, with eight sections aligned separately on each slide. All sections were stained with Luxol Fast Blue to reveal myelin sheaths. After gradient dehydration and clearing, the stained slides were coverslipped. Sections were observed and images recorded under a light microscope using a digital camera. Six images were acquired at ×25 magnification and automatically stitched into a complete section image. After all serial images were obtained, 96 complete serial images of the human lumbar cord segment were automatically processed with the "Curves", "Autocontrast", "Gray scale 8 bit", "Invert", and "Image resize to 50%" steps using Photoshop 7.0 software. All images were added in order into 3D-DOCTOR 4.0 software as a stack, where serial images were automatically realigned with neighboring images and semi-automatically segmented for white matter and gray matter. Finally, simple surface and volume reconstruction were completed on a personal computer. The reconstructed human lumbar spinal cord segment was interactively observed, cut, and measured. MAIN OUTCOME MEASURES: The reconstructed human lumbar spinal cord segment. RESULTS: Compared with serial images obtained from other image modalities, such as CT, MRI, and macrosections from the Visible Human Project database, the Luxol Fast Blue stained histological serial section images exhibited higher resolution and contrast between gray and white matter. Image processing and 3D reconstruction steps were semi-automatically performed with related software. The 3D reconstructed human lumbar cord segment was observed, cut, and measured on a PC. CONCLUSION: A semi-automatic computerized method based on histological serial sections is an effective way to 3D-reconstruct the human spinal cord.
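Three of the batch steps named in METHODS ("Gray scale 8 bit", "Invert", "Image resize to 50%") can be approximated in a few lines. This is a rough sketch only, not the Photoshop 7.0 pipeline the authors ran: the function names are hypothetical, the gray conversion assumes Rec. 601 luma weights, and the resize assumes simple 2x2 block averaging, whereas Photoshop's own resampling is different.

```python
def to_gray8(rgb_image):
    """Convert rows of (r, g, b) tuples to 8-bit gray (assumed Rec. 601 weights)."""
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in rgb_image
    ]


def invert8(gray):
    """Invert an 8-bit gray image so dark-stained regions become bright."""
    return [[255 - v for v in row] for row in gray]


def halve(gray):
    """Resize to 50% by averaging each 2x2 block (a trailing odd row/column is dropped)."""
    height = len(gray) // 2
    width = len(gray[0]) // 2 if gray else 0
    return [
        [
            (gray[2 * y][2 * x] + gray[2 * y][2 * x + 1]
             + gray[2 * y + 1][2 * x] + gray[2 * y + 1][2 * x + 1]) // 4
            for x in range(width)
        ]
        for y in range(height)
    ]
```

Applying the same fixed sequence to every section image keeps intensities comparable across the stack before alignment and segmentation.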
Cosmetic safety evaluation employs a series of toxicological tests, on both qualitative and quantitative levels, to assess the potential risks of the daily use of selected cosmetic ingredients and final products. Traditionally, safety evaluation of cosmetics has used animal tests. With the development of in vitro science and the 3R (Reduction, Replacement and Refinement) principle, three-dimensional reconstructed human epidermis (3D-RHE) models have been developed and widely applied in cosmetic safety evaluation. Reconstructed human skin models possess anatomy and metabolic biology similar to real human tissue. This paper reviews the current application of 3D-RHE models in the safety evaluation of the skin irritation, eye irritation, phototoxicity, and genotoxicity potential of cosmetic ingredients and formulas. The advantages and disadvantages of using skin models are also discussed, and comments and suggestions are given for their future development.
Objective: To establish a 3D atlas of the lenticular nuclei and their subnuclei with the cryosection images of the male from the "Atlas of Chinese Visible Human". Methods: The lenticular nuclei and their subnuclei were segmented from the cryosection images and reconstructed with the software ...
A multi-residual module stacked hourglass network (MRSH) is proposed to improve the accuracy and robustness of human body pose estimation. The network uses multiple hourglass sub-networks and three new residual modules. In the hourglass sub-network, the large receptive field residual module (LRFRM) and the multi-scale residual module (MSRM) are first used to learn the spatial relationship between features and body parts at various scales. Only the improved residual module (IRM) is used when the resolution is at its minimum. The final network uses four stacked hourglass sub-networks, with intermediate supervision at the end of each hourglass, repeating high-low (from high resolution to low resolution) and low-high (from low resolution to high resolution) learning. The network was tested on the public Leeds Sports Poses (LSP) and MPII Human Pose datasets. The experimental results show that the proposed network performs better in human pose estimation.
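Intermediate supervision, as mentioned above, means that every hourglass in the stack produces its own heatmap prediction and each prediction is compared against the same ground truth. The sketch below shows only that summation; the mean squared error is an assumption (the abstract does not state the loss), and the function names are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two heatmaps given as 2D lists of floats."""
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += (p - t) ** 2
            count += 1
    return total / count if count else 0.0


def intermediate_supervision_loss(stack_outputs, target):
    """Sum the per-hourglass losses so every stack receives a training signal."""
    return sum(mse(output, target) for output in stack_outputs)
```

With four stacked hourglasses, `stack_outputs` would hold four heatmaps, and early stacks are penalized directly rather than only through gradients flowing back from the last one.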
In the new era of technology, monitoring daily human activities is becoming more challenging because of complex scenes and backgrounds. To understand scenes and activities from human life logs, human-object interaction (HOI) is important for visual relationship detection and human pose estimation. Activity understanding and interaction recognition between humans and objects, along with pose estimation and interaction modeling, are explained. Existing algorithms and feature extraction procedures are complicated and struggle with the accurate detection of rare human postures, occluded regions, and objects, especially small ones. Existing HOI detection techniques are instance-centric (object-based), where interaction is predicted between all pairs. Such estimation depends on appearance features and spatial information. Therefore, we propose a novel approach to demonstrate that appearance features alone are not sufficient to predict the HOI. Furthermore, we detect human body parts using a Gaussian Mixture Model (GMM), followed by object detection using YOLO. We predict the interaction points, which directly classify the interaction, and pair them with densely predicted HOI vectors using the interaction algorithm. The interactions are linked with the human and the object to predict the actions. Experiments performed on two benchmark HOI datasets demonstrate the proposed approach.
3D human pose estimation is a major focus area in computer vision and plays an important role in practical applications. This article summarizes the framework and research progress related to estimation from monocular RGB images and videos. An overall perspective on methods integrated with deep learning is introduced. Image-based and video-based inputs are proposed as the analysis framework. From this viewpoint, common problems are discussed. The diversity of human postures usually leads to problems such as occlusion and ambiguity, and the lack of training datasets often results in poor model generalization. Regression methods are crucial for solving such problems. For image-based input, the multi-view method is commonly used to solve occlusion problems, and it is analyzed comprehensively here. For video-based input, prior knowledge of restricted human motion is used to predict human postures. In addition, structural constraints are widely used as prior knowledge. Furthermore, weakly supervised learning methods are studied and discussed for these two types of input to improve model generalization. The problem of insufficient training datasets must also be considered, especially because 3D datasets are usually biased and limited. Finally, emerging and popular datasets and evaluation indicators are discussed. The characteristics of the datasets and the relationships among the indicators are explained and highlighted. Thus, this article can be useful and instructive for researchers who lack experience and find this field confusing. In addition, by providing an overview of 3D human pose estimation, this article sorts and refines recent studies on the topic. It describes core problems and common useful methods, and discusses the scope for further research.
Human pose estimation (HPE) is a procedure for determining the structure of the body pose, and it is considered a challenging problem in the computer vision (CV) community. HPE finds applications in several fields, such as activity recognition and human-computer interfaces. Despite its benefits, HPE remains challenging due to variations in visual appearance, lighting, occlusion, dimensionality, etc. To resolve these issues, this paper presents a squirrel search optimization with a deep convolutional neural network for HPE (SSDCNN-HPE) technique. The main aim of the SSDCNN-HPE technique is to identify the human pose accurately and efficiently. First, the video is converted into frames, and pre-processing is performed via bilateral filtering-based noise removal. Then, the EfficientNet model is applied to identify the body points of a person with no problem constraints. In addition, the hyperparameters of the EfficientNet model are tuned using the squirrel search algorithm (SSA). In the final stage, a multiclass support vector machine (M-SVM) is used for the identification and classification of human poses. The design of bilateral filtering followed by an SSA-based EfficientNet model for HPE constitutes the novelty of the work. To demonstrate the enhanced outcomes of the SSDCNN-HPE approach, a series of simulations is executed. The experimental results show that the SSDCNN-HPE system outperforms recent existing techniques on several measures.
Human Action Recognition (HAR) and pose estimation from videos have gained significant attention among research communities due to their application in several areas, such as intelligent surveillance, human-robot interaction, and robot vision. Though considerable improvements have been made recently, designing an effective and accurate action recognition model remains difficult owing to obstacles such as variations in camera angle, occlusion, background, and movement speed. The literature shows that it is hard to deal with the temporal dimension in the action recognition process. Convolutional neural network (CNN) models can be used to address this. With this motivation, this study designs a novel key point extraction with deep convolutional neural networks based pose estimation (KPE-DCNN) model for activity recognition. The KPE-DCNN technique initially converts the input video into a sequence of frames, followed by a three-stage process: key point extraction, hyperparameter tuning, and pose estimation. In the key point extraction process, an OpenPose model is designed to compute accurate key points in the human pose. Then, an optimal DCNN model is developed to classify the human activity label based on the extracted key points. To improve the training process of the DCNN technique, the RMSProp optimizer is used to optimally adjust hyperparameters such as learning rate, batch size, and epoch count. Experimental results on a benchmark dataset, the UCF Sports dataset, show that the KPE-DCNN technique achieves good results compared with benchmark algorithms such as CNN, DBN, SVM, STAL, and T-CNN.
Much progress has been made recently on 2D human pose tracking with tracking-by-detection approaches. However, several challenges remain in this area due to self-occlusion and confusion between the left and right limbs during tracking. In this work, a head orientation detection step is introduced into the tracking framework to serve as a complementary tool to assist human pose estimation. With the face orientation determined, the system can decide whether the left or right side of the human body is fully visible and infer the state of the symmetric counterpart. By granting a higher priority to the completely visible side, the system can largely avoid double counting when inferring body poses. The proposed framework is evaluated on the HumanEva dataset. The results show that it largely reduces the occurrence of double counting and distinguishes the left and right sides consistently.
Recovering human pose from RGB images and videos has drawn increasing attention in recent years owing to minimal sensor requirements and applicability in diverse fields such as human-computer interaction, robotics, video analytics, and augmented reality. Although a large amount of work has been devoted to this field, 3D human pose estimation based on monocular images or videos remains a very challenging task due to a variety of difficulties such as depth ambiguity, occlusion, background clutter, and lack of training data. In this survey, we summarize recent advances in monocular 3D human pose estimation. We provide a general taxonomy to cover existing approaches and analyze their capabilities and limitations. We also present a summary of extensively used datasets and metrics, and provide a quantitative comparison of some representative methods. Finally, we conclude with a discussion of realistic challenges and open problems for future research.
Three-dimensional (3D) human pose tracking has recently attracted more and more attention in the computer vision field. Real-time pose tracking is highly useful in various domains such as video surveillance, somatosensory games, and human-computer interaction. However, vision-based pose tracking techniques usually raise privacy concerns, making human pose tracking without vision data usage an important problem. Thus, we propose using Radio Frequency Identification (RFID) as a pose tracking technique via a low-cost wearable sensing device. Although our prior work illustrated how deep learning could transfer RFID data into real-time human poses, generalization for different subjects remains challenging. This paper proposes a subject-adaptive technique to address this generalization problem. In the proposed system, termed Cycle-Pose, we leverage a cross-skeleton learning structure to improve the adaptability of the deep learning model to different human skeletons. Moreover, our novel cycle kinematic network is proposed for unpaired RFID and labeled pose data from different subjects. The Cycle-Pose system is implemented and evaluated by comparing its prototype with a traditional RFID pose tracking system. The experimental results demonstrate that Cycle-Pose can achieve lower estimation error and better subject generalization than the traditional system.
Identifying human actions and interactions finds use in many areas, such as security, surveillance, assisted living, patient monitoring, rehabilitation, sports, and e-learning. This wide range of applications has attracted many researchers to the field. Inspired by existing recognition systems, this paper proposes a new and efficient human-object interaction recognition (HOIR) model based on modeling human pose and scene feature information. Different aspects are involved in an interaction, including the humans, the objects, the various body parts of the human, and the background scene. The main objectives of this research include critically examining the importance of all these elements in determining the interaction, estimating human pose through the image foresting transform (IFT), and detecting the performed interactions based on an optimized multi-feature vector. The proposed methodology has six main phases. The first phase involves preprocessing: the videos are converted into image frames, their contrast is adjusted, and noise is removed. In the second phase, the human-object pair is detected and extracted from each image frame. The third phase involves identifying key body parts of the detected humans using IFT. The fourth phase applies three different kinds of feature extraction techniques. These features are then combined and optimized during the fifth phase. The optimized vector is used to classify the interactions in the last phase. The MSR Daily Activity 3D dataset has been used to test this model and to demonstrate its efficiency. The proposed system obtains an average accuracy of 91.7% on this dataset.
This article presents a comprehensive survey of deep learning-based (DL-based) human pose estimation (HPE) to help researchers in the domain of computer vision. HPE is among the fastest-growing research domains of computer vision and is used to solve several problems in human endeavours. After a detailed introduction, three different human body models are presented, followed by the main stages of HPE and two pipelines of two-dimensional (2D) HPE. The details of the four components of HPE are also presented. The keypoint output formats of two popular 2D HPE datasets and the most cited DL-based HPE articles since the breakthrough year are both shown in tabular form. This study highlights the limitations of published reviews and surveys with respect to presenting a systematic review of current DL-based solutions to 2D HPE. Furthermore, it provides a detailed and meaningful survey to guide new and existing researchers on DL-based 2D HPE models. Finally, some future research directions in the field of HPE, such as limited data on disabled persons and multi-training DL-based models, are identified to encourage researchers and promote the growth of HPE research.
Funding (lightweight high-resolution pose estimation network): the National Natural Science Foundation of China (Grant Number 62076246).
Funding (DAUNet): supported by the Natural Science Foundation of Hubei Province of China under grant number 2022CFB536; the National Natural Science Foundation of China under grant number 62367006; and the 15th Graduate Education Innovation Fund of Wuhan Institute of Technology under grant number CX2023579.
Funding (multi-view pose estimation and FMA assessment): This work was supported by grants from the Natural Science Foundation of Hebei Province, under Grant No. F2021202021; the S&T Program of Hebei, under Grant No. 22375001D; and the National Key R&D Program of China, under Grant No. 2019YFB1312500.
Funding (3D reconstruction of human spinal cord): Natural Science Research Plan for Jiangsu Colleges, No. 05KJB180105; Postgraduate Innovation Cultivating Project in Jiangsu Province, No. CX07s_035z.
文摘BACKGROUND: It is not possible to reconstruct the inner structure of the spinal cord, such as gray matter and spinal tracts, from the Visual Human Project database or CT and MRI databases, due to low image resolution and contrast in macrosection images. OBJECTIVE: To explore a semi-automatic computerized three-dimensional (3D) reconstruction of human spinal cord based on histological serial sections, in order to solve issues such as low contrast. DESIGN, TIME AND SETTING: An experimental study combining serial section techniques and 3D reconstruction, performed in the laboratory of Human Anatomy and Histoembryology at the Medical School of Nantong University during January to April 2008. SETTING: Department of Anatomy, Institute of Neurobiology, Jiangsu Province Key Laboratory of Neural Regeneration, Laboratory of Image Engineering. MATERIALS: A human lumbar spinal cord segment from fresh autopsy material of an adult male. METHODS: After 4% paraformaldehyde fixation for three days, serial sections of the lumbar spinal cord were cut on a Leica cryostat and mounted on slides in sequence, with eight sections aligned separately on each slide. All sections were stained with Luxol Fast Blue to reveal myelin sheaths. After gradient dehydration and clearing, the stained slides were coverslipped. Sections were observed and images recorded under a light microscope using a digital camera. Six images were acquired at x25 magnification and automatically stitched into a complete section image. After all serial images were obtained, 96 complete serial images of the human lumbar cord segment were automatically processed with "Curves", "Autocontrast", "Gray scale 8 bit", "Invert", "Image resize to 50%" steps using Photoshop 7.0 software. All images were added in order into 3D-DOCTOR 4.0 software as a stack, where serial images were automatically realigned with neighboring images and semi-automatically segmented for white matter and gray matter. 
Finally, simple surface and volume reconstruction were completed on a personal computer. The reconstructed human lumbar spinal cord segment was interactively observed, cut, and measured. MAIN OUTCOME MEASURES: The reconstructed human lumbar spinal cord segment. RESULTS: Compared with serial images obtained from other image modalities, such as CT, MRI, and macrosections from the Visual Human Project database, the Luxol Fast Blue stained histological serial section images exhibited higher resolution and contrast between gray and white matter. Image processing and 3D reconstruction steps were semi-automatically performed with related software. The 3D reconstructed human lumbar cord segment was observed, cut, and measured on a PC. CONCLUSION: A semi-automatic computerized method, based on histological serial sections, is an effective way to 3D-reconstruct the human spinal cord.
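Three of the image-preparation steps named in the Methods (conversion to 8-bit grayscale, inversion, and resizing to 50%) can be illustrated with a minimal Python sketch. This is not the authors' Photoshop 7.0 workflow: the function names, the luminance weights, and the 2×2 block averaging used for downsampling are assumptions chosen for illustration, and the "Curves" and "Autocontrast" steps are omitted.

```python
def to_gray8(rgb_image):
    """Convert rows of (r, g, b) tuples to 8-bit gray values.

    The ITU-R BT.601 luminance weights used here are an assumption;
    the abstract does not say how Photoshop's conversion was configured.
    """
    return [[min(255, round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb_image]


def invert(gray):
    """Invert an 8-bit grayscale image (dark becomes light and vice versa)."""
    return [[255 - v for v in row] for row in gray]


def resize_half(gray):
    """Downsample to 50% by averaging 2x2 blocks (a trailing odd row/column is dropped)."""
    h, w = len(gray) // 2, len(gray[0]) // 2
    return [[(gray[2 * y][2 * x] + gray[2 * y][2 * x + 1]
              + gray[2 * y + 1][2 * x] + gray[2 * y + 1][2 * x + 1]) // 4
             for x in range(w)]
            for y in range(h)]


def preprocess(rgb_image):
    """Apply grayscale conversion, inversion, and 50% resize in the order listed above."""
    return resize_half(invert(to_gray8(rgb_image)))
```

Applying `preprocess` to every section image before stacking mirrors the batch processing described in the abstract; realignment and segmentation would still be handled by the reconstruction software.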
Abstract: Cosmetic safety evaluation employs a series of toxicological tests, on both qualitative and quantitative levels, to assess the potential risks of daily use of selected cosmetic ingredients and final products. Traditionally, safety evaluation of cosmetics has used animal tests. With the development of in vitro science and the 3R (Reduction, Replacement and Refinement) principle, three-dimensional reconstructed human epidermis (3D-RHE) models have been developed and widely applied in cosmetic safety evaluation. Reconstructed human skin models possess anatomy and metabolism similar to real human tissue. This paper reviews the current application of 3D-RHE models in evaluating the skin irritation, eye irritation, phototoxicity and genotoxicity potential of cosmetic ingredients and formulas. The advantages and disadvantages of using skin models are also discussed, and comments and suggestions are given for their future development.
Abstract: Objective: To establish a 3D atlas of the lenticular nucleus and its subnuclei with the cryosection images of the male from the "Atlas of Chinese Visible Human". Methods: The lenticular nucleus and its subnuclei were segmented from the cryosection images and reconstructed with the software
Funding: Supported by the National Natural Science Foundation of China (61401001, 61501003, 61672032).
Abstract: A multi-residual module stacked hourglass network (MRSH) is proposed to improve the accuracy and robustness of human body pose estimation. The network uses multiple hourglass sub-networks and three new residual modules. In the hourglass sub-network, the large receptive field residual module (LRFRM) and the multi-scale residual module (MSRM) are first used to learn the spatial relationship between features and body parts at various scales. Only the improved residual module (IRM) is used when the resolution is minimized. The final network uses four stacked hourglass sub-networks, with intermediate supervision at the end of each hourglass, repeating high-low (from high resolution to low resolution) and low-high (from low resolution to high resolution) learning. The network was tested on the public Leeds Sports Pose (LSP) and MPII Human Pose datasets. The experimental results show that the proposed network achieves better performance in human pose estimation.
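Intermediate supervision, as described above, attaches a loss to the output of every hourglass rather than only the last one. The following is a minimal sketch of that idea, assuming heatmap outputs and a mean-squared-error loss per stack; the function names and the nested-list heatmaps are illustrative, and the abstract does not state which loss MRSH actually uses.

```python
def mse(pred, target):
    """Mean squared error between two heatmaps given as equal-shaped nested lists."""
    diffs = [(p - t) ** 2
             for pred_row, target_row in zip(pred, target)
             for p, t in zip(pred_row, target_row)]
    return sum(diffs) / len(diffs)


def intermediate_supervision_loss(stack_outputs, target):
    """Sum the per-stack losses so that every hourglass receives its own training signal.

    stack_outputs: one predicted heatmap per stacked hourglass (four in the abstract).
    target: the ground-truth heatmap shared by all stacks (an assumption of this sketch).
    """
    return sum(mse(output, target) for output in stack_outputs)
```

Summing the losses means early hourglasses are pushed toward a usable prediction directly, instead of relying only on gradients that pass back through the later stacks.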
Funding: Supported by the Priority Research Centers Program through the NRF funded by MEST (2018R1A6A1A03024003) and the Grand Information Technology Research Center support program IITP-2020-2020-0-01612 supervised by the IITP by MSIT, Korea.
Abstract: In the new era of technology, monitoring daily human activities is becoming more challenging because of complex scenes and backgrounds. To understand the scenes and activities in human life logs, human-object interaction (HOI) is important in terms of visual relationship detection and human pose estimation. Activity understanding and interaction recognition between humans and objects, along with pose estimation and interaction modeling, are explained. Some existing algorithms and feature extraction procedures are complicated and struggle with accurate detection of rare human postures, occluded regions, and objects, especially small-sized objects. The existing HOI detection techniques are instance-centric (object-based), where interaction is predicted between all the pairs. Such estimation depends on appearance features and spatial information. Therefore, we propose a novel approach to demonstrate that appearance features alone are not sufficient to predict the HOI. Furthermore, we detect the human body parts by using the Gaussian Mixture Model (GMM), followed by object detection using YOLO. We predict the interaction points, which directly classify the interaction, and pair them with densely predicted HOI vectors by using the interaction algorithm. The interactions are linked with the human and object to predict the actions. Experiments on two benchmark HOI datasets demonstrate the proposed approach.
Funding: Supported by the Program of Entrepreneurship and Innovation Ph.D. in Jiangsu Province (JSSCBS20211175), the School Ph.D. Talent Funding (Z301B2055), and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (21KJB520002).
Abstract: 3D human pose estimation is a major focus area in the field of computer vision and plays an important role in practical applications. This article summarizes the framework and research progress related to estimation from monocular RGB images and videos. An overall perspective of methods integrated with deep learning is introduced. Novel image-based and video-based inputs are proposed as the analysis framework. From this viewpoint, common problems are discussed. The diversity of human postures usually leads to problems such as occlusion and ambiguity, and the lack of training datasets often results in poor generalization ability of the model. Regression methods are crucial for solving such problems. Considering image-based input, the multi-view method is commonly used to solve occlusion problems. Here, the multi-view method is analyzed comprehensively. For video-based input, human prior knowledge of restricted motion is used to predict human postures. In addition, structural constraints are widely used as prior knowledge. Furthermore, weakly supervised learning methods are studied and discussed for these two types of inputs to improve the model generalization ability. The problem of insufficient training datasets must also be considered, especially because 3D datasets are usually biased and limited. Finally, emerging and popular datasets and evaluation indicators are discussed. The characteristics of the datasets and the relationships of the indicators are explained and highlighted. Thus, this article can be useful and instructive for researchers who lack experience and find this field confusing. In addition, by providing an overview of 3D human pose estimation, this article sorts and refines recent studies on 3D human pose estimation. It describes core problems and common useful methods, and discusses the scope for further research.
Abstract: Human pose estimation (HPE) is a procedure for determining the structure of the body pose, and it is considered a challenging issue in the computer vision (CV) community. HPE finds applications in several fields, namely activity recognition and human-computer interfaces. Despite the benefits of HPE, it is still a challenging process due to variations in visual appearance, lighting, occlusions, dimensionality, etc. To resolve these issues, this paper presents a squirrel search optimization with a deep convolutional neural network for HPE (SSDCNN-HPE) technique. The major intention of the SSDCNN-HPE technique is to identify the human pose accurately and efficiently. First, the video is converted into frames, and pre-processing takes place via a bilateral filtering-based noise removal process. Then, the EfficientNet model is applied to identify the body points of a person with no problem constraints. In addition, the hyperparameters of the EfficientNet model are tuned using the squirrel search algorithm (SSA). In the final stage, the multiclass support vector machine (M-SVM) technique is used for the identification and classification of human poses. The design of bilateral filtering followed by the SSA-based EfficientNet model for HPE constitutes the novelty of the work. To demonstrate the enhanced outcomes of the SSDCNN-HPE approach, a series of simulations are executed. The experimental results show that the SSDCNN-HPE system outperforms recent existing techniques in terms of different measures.
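Bilateral filtering, the pre-processing step mentioned above, smooths noise while preserving edges by weighting each neighbor by both its spatial distance and its intensity difference from the center sample. A minimal one-dimensional sketch follows; the function name, the default parameter values, and the 1D simplification are assumptions for illustration and do not reproduce the paper's implementation.

```python
import math


def bilateral_filter_1d(signal, radius=2, sigma_space=1.0, sigma_range=10.0):
    """Edge-preserving smoothing of a 1D sequence of intensities.

    Each output value is a weighted mean of its neighbors, where the weight is the
    product of a spatial Gaussian (distance in samples) and a range Gaussian
    (difference in intensity). All parameter values here are illustrative.
    """
    smoothed = []
    for i, center in enumerate(signal):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w_space = math.exp(-((i - j) ** 2) / (2 * sigma_space ** 2))
            w_range = math.exp(-((signal[j] - center) ** 2) / (2 * sigma_range ** 2))
            weight = w_space * w_range
            weighted_sum += weight * signal[j]
            weight_total += weight
        smoothed.append(weighted_sum / weight_total)
    return smoothed
```

Because samples across a large intensity jump receive a near-zero range weight, a sharp step in the input stays sharp, whereas a plain Gaussian blur would smear it.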
Abstract: Human Action Recognition (HAR) and pose estimation from videos have gained significant attention among research communities due to their application in several areas, namely intelligent surveillance, human-robot interaction, robot vision, etc. Though considerable improvements have been made recently, designing an effective and accurate action recognition model is still difficult owing to obstacles such as variations in camera angle, occlusion, background, movement speed, and so on. From the literature, it is observed that it is hard to deal with the temporal dimension in the action recognition process. Convolutional neural network (CNN) models could be used widely to solve this. With this motivation, this study designs a novel key point extraction with deep convolutional neural networks based pose estimation (KPE-DCNN) model for activity recognition. The KPE-DCNN technique initially converts the input video into a sequence of frames, followed by a three-stage process, namely key point extraction, hyperparameter tuning, and pose estimation. In the key point extraction process, an OpenPose model is designed to compute accurate key points in the human pose. Then, an optimal DCNN model is developed to classify the human activity labels based on the extracted key points. To improve the training process of the DCNN technique, the RMSProp optimizer is used to optimally adjust hyperparameters such as learning rate, batch size, and epoch count. The experimental results, tested on a benchmark dataset (the UCF sports dataset), showed that the KPE-DCNN technique achieves good results compared with benchmark algorithms such as CNN, DBN, SVM, STAL, T-CNN and so on.
Abstract: Much progress has been made recently on 2D human pose tracking with tracking-by-detection approaches. However, several challenges remain in this area, which are due to self-occlusions and confusion between the left and right limbs during tracking. In this work, a head orientation detection step is introduced into the tracking framework to serve as a complementary tool to assist human pose estimation. With the face orientation determined, the system can decide whether the left or right side of the human body is fully visible and infer the state of the symmetric counterpart. By granting a higher priority to the completely visible side, the system can avoid double counting to a great extent when inferring body poses. The proposed framework is evaluated on the HumanEva dataset. The results show that it largely reduces the occurrence of double counting and distinguishes the left and right sides consistently.
Funding: National Natural Science Foundation of China (61806176); the Fundamental Research Funds for the Central Universities (2019QNA5022).
Abstract: Recovering human pose from RGB images and videos has drawn increasing attention in recent years owing to minimum sensor requirements and applicability in diverse fields such as human-computer interaction, robotics, video analytics, and augmented reality. Although a large amount of work has been devoted to this field, 3D human pose estimation based on monocular images or videos remains a very challenging task due to a variety of difficulties such as depth ambiguities, occlusion, background clutter, and lack of training data. In this survey, we summarize recent advances in monocular 3D human pose estimation. We provide a general taxonomy to cover existing approaches and analyze their capabilities and limitations. We also present a summary of extensively used datasets and metrics, and provide a quantitative comparison of some representative methods. Finally, we conclude with a discussion on realistic challenges and open problems for future research directions.
Funding: Supported in part by the US National Science Foundation (NSF) under Grants ECCS-1923163 and CNS-2107190, through the Wireless Engineering Research and Education Center at Auburn University.
Abstract: Three-dimensional (3D) human pose tracking has recently attracted increasing attention in the computer vision field. Real-time pose tracking is highly useful in various domains such as video surveillance, somatosensory games, and human-computer interaction. However, vision-based pose tracking techniques usually raise privacy concerns, making human pose tracking without vision data an important problem. Thus, we propose using Radio Frequency Identification (RFID) as a pose tracking technique via a low-cost wearable sensing device. Although our prior work illustrated how deep learning could transfer RFID data into real-time human poses, generalization to different subjects remains challenging. This paper proposes a subject-adaptive technique to address this generalization problem. In the proposed system, termed Cycle-Pose, we leverage a cross-skeleton learning structure to improve the adaptability of the deep learning model to different human skeletons. Moreover, our novel cycle kinematic network is proposed for unpaired RFID and labeled pose data from different subjects. The Cycle-Pose system is implemented and evaluated by comparing its prototype with a traditional RFID pose tracking system. The experimental results demonstrate that Cycle-Pose can achieve lower estimation error and better subject generalization than the traditional system.
Funding: This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2023-2018-0-01426) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation). This work has also been supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R239), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. Also, this work was partially supported by the Taif University Researchers Supporting Project Number (TURSP-2020/115), Taif University, Taif, Saudi Arabia.
Abstract: Identifying human actions and interactions finds its use in many areas, such as security, surveillance, assisted living, patient monitoring, rehabilitation, sports, and e-learning. This wide range of applications has attracted many researchers to this field. Inspired by the existing recognition systems, this paper proposes a new and efficient human-object interaction recognition (HOIR) model which is based on modeling human pose and scene feature information. There are different aspects involved in an interaction, including the humans, the objects, the various body parts of the human, and the background scene. The main objectives of this research include critically examining the importance of all these elements in determining the interaction, estimating human pose through image foresting transform (IFT), and detecting the performed interactions based on an optimized multi-feature vector. The proposed methodology has six main phases. The first phase involves preprocessing the images. During the preprocessing stages, the videos are converted into image frames. Then their contrast is adjusted, and noise is removed. In the second phase, the human-object pair is detected and extracted from each image frame. The third phase involves the identification of key body parts of the detected humans using IFT. The fourth phase relates to three different kinds of feature extraction techniques. Then these features are combined and optimized during the fifth phase. The optimized vector is used to classify the interactions in the last phase. The MSR Daily Activity 3D dataset has been used to test this model and to prove its efficiency. The proposed system obtains an average accuracy of 91.7% on this dataset.
Funding: Supported by Universiti Sains Malaysia under FRGS Grant Number [FRGS/1/2020/STG07/USM/02/12(203.PKOMP.6711930)] and FRGS Grant Number [304PTEKIND.6316497.USM.].
Abstract: In this article, a comprehensive survey of deep learning-based (DL-based) human pose estimation (HPE) that can help researchers in the domain of computer vision is presented. HPE is among the fastest-growing research domains of computer vision and is used in solving several problems for human endeavours. After the detailed introduction, three different human body models, followed by the main stages of HPE and two pipelines of two-dimensional (2D) HPE, are presented. The details of the four components of HPE are also presented. The keypoint output formats of two popular 2D HPE datasets and the most cited DL-based HPE articles from the year of breakthrough are both shown in tabular form. This study intends to highlight the limitations of published reviews and surveys with respect to presenting a systematic review of current DL-based solutions to the 2D HPE model. Furthermore, a detailed and meaningful survey that will guide new and existing researchers on DL-based 2D HPE models is achieved. Finally, some future research directions in the field of HPE, such as limited data on disabled persons and multi-training DL-based models, are revealed to encourage researchers and promote the growth of HPE research.