Abstract: Six-degrees-of-freedom (6DoF) input interfaces are essential for manipulating virtual objects through translation or rotation in three-dimensional (3D) space. A traditional outside-in tracking controller requires expensive hardware to be installed in advance. Inside-out tracking controllers have been proposed, but they often suffer from limitations: interaction is restricted to the tracking range of the sensor (e.g., a sensor on the head-mounted display (HMD)), or the pose values must be modified before the device can function as an input interface (e.g., a sensor on the controller). This study investigates 6DoF pose estimation methods that do not restrict the tracking range, using a smartphone as a controller in augmented reality (AR) environments. We propose methods for estimating the initial pose of the controller and for correcting the pose using an inside-out tracking approach. In addition, seven pose estimation algorithms are presented as candidates, depending on the tracking range of the device sensor, the tracking method (e.g., marker recognition, visual-inertial odometry (VIO)), and whether modification of the initial pose is necessary. The performance of the algorithms was evaluated in two experiments (discrete and continuous data). The results demonstrate that correcting the initial pose improves the final pose accuracy. They also emphasize the importance of selecting the tracking algorithm based on the tracking range of the devices and the actual input value of the 3D interaction.
Funding: Funded by the Center for Unmanned Aircraft Systems (C-UAS), a National Science Foundation Industry/University Cooperative Research Center (I/UCRC), under NSF award numbers IIP-1161036 and CNS-1650547, along with significant contributions from C-UAS industry members.
Abstract: This paper introduces a new algorithm for estimating the relative pose of a moving camera using consecutive frames of a video sequence. State-of-the-art algorithms for calculating the relative pose between two images use matching features to estimate the essential matrix, which is then decomposed into the relative rotation and normalized translation between frames. To be robust to noise and feature-match outliers, these methods generate a large number of essential matrix hypotheses from randomly selected minimal subsets of feature pairs and then score these hypotheses on all feature pairs. In contrast, the algorithm introduced in this paper calculates relative pose hypotheses by directly optimizing the rotation and normalized translation between frames, rather than calculating the essential matrix and then decomposing it. The resulting algorithm improves computation time by an order of magnitude. If an inertial measurement unit (IMU) is available, it is used to seed the optimizer. In addition, the best hypothesis at each iteration is reused to seed the optimizer, which reduces the number of relative pose hypotheses that must be generated and scored. These advantages greatly speed up performance and enable the algorithm to run in real time on low-cost embedded hardware. We apply the algorithm to visual multi-target tracking (MTT) in the presence of parallax and demonstrate its real-time performance on a 640 × 480 video sequence captured on a UAV. Video results are available at https://youtu.be/HhK-p2hXNnU.
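The relationship between a relative pose and the essential matrix, and the step of scoring a hypothesis on all feature pairs, can be illustrated with a minimal sketch. It is not the authors' optimizer: it assumes the convention X2 = R X1 + t, normalized (calibrated) image coordinates, noise-free synthetic matches, and a plain algebraic residual with an arbitrary threshold. All function names and test values are hypothetical.

```python
import math

def skew(t):
    """Cross-product matrix [t]_x, so that [t]_x v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]

def matmul(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(m, v):
    """3x3 matrix times 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_y(angle):
    """Rotation by `angle` radians about the camera y axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def essential(R, t):
    """E = [t]_x R under the assumed convention X2 = R X1 + t."""
    return matmul(skew(t), R)

def epipolar_residual(E, x1, x2):
    """Algebraic residual x2^T E x1 for normalized homogeneous points."""
    Ex1 = matvec(E, x1)
    return sum(x2[i] * Ex1[i] for i in range(3))

def count_inliers(R, t, matches, threshold=1e-6):
    """Score a pose hypothesis on all matches by counting small residuals."""
    E = essential(R, t)
    return sum(1 for x1, x2 in matches
               if abs(epipolar_residual(E, x1, x2)) < threshold)

def project(X):
    """Pinhole projection to normalized homogeneous image coordinates."""
    return [X[0] / X[2], X[1] / X[2], 1.0]

def synthetic_matches(R, t, points):
    """Noise-free matches of 3D points seen before and after the motion."""
    matches = []
    for X1 in points:
        X2 = [a + b for a, b in zip(matvec(R, X1), t)]
        matches.append((project(X1), project(X2)))
    return matches

# Hypothetical scene: five 3D points and a small rotation plus sideways motion.
POINTS = [[0.5, 0.3, 4.0], [-0.7, 0.4, 5.0], [0.2, -0.6, 3.5],
          [-0.3, -0.2, 6.0], [0.9, 0.8, 4.5]]
R_TRUE, T_TRUE = rot_y(0.1), [1.0, 0.0, 0.0]
MATCHES = synthetic_matches(R_TRUE, T_TRUE, POINTS)

if __name__ == "__main__":
    print("true motion inliers:", count_inliers(R_TRUE, T_TRUE, MATCHES))
    print("wrong rotation inliers:", count_inliers(rot_y(0.3), T_TRUE, MATCHES))
```

With real, noisy matches a geometric error such as the Sampson distance and a pixel-scale threshold would usually replace the algebraic residual used here.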
Funding: This work was supported in part by the National Natural Science Foundation of China (61973007, 61633002).
Abstract: Controlling multiple multi-joint fish-like robots has long captivated engineers and biologists. A fundamental but challenging problem is to robustly track the postures of the individuals in real time, which requires detecting multiple robots, estimating multi-joint postures, and tracking identities, all at real-time processing speed. To the best of our knowledge, this challenge has not been tackled in previous studies. In this paper, to precisely track the planar postures of multiple swimming multi-joint fish-like robots in real time, we propose a novel deep neural network-based method named TAB-IOL. Its TAB part fuses the top-down and bottom-up approaches to vision-based pose estimation, while the IOL part, with long short-term memory, considers the motion constraints among joints for precise pose tracking. The performance of TAB-IOL is verified by testing on a group of freely swimming fish-like robots in various scenarios with strong disturbances, and by a detailed comparison of accuracy, speed, and robustness with most state-of-the-art algorithms. Further, based on the precise pose estimation and tracking realized by TAB-IOL, several formation control experiments are conducted for the group of fish-like robots. The results demonstrate that TAB-IOL lays a solid foundation for the coordinated control of multiple fish-like robots in a real working environment. We believe the proposed method will facilitate the growth and development of related fields.
Funding: The National Natural Science Foundation of China (11472302, 11332012).
Abstract: This paper tackles pose tracking and model refinement, one of the fundamental tasks of 3D photogrammetry. The research belongs to videometrics, an interdisciplinary field that combines computer vision, digital image processing, photogrammetry, and optical measurement. Related work is summarized briefly. The paper first studies pose tracking for a target with a 3D model. For a target with an accurate 3D model that is rich in line features, line-model-based pose tracking methods are proposed, and experimental results indicate that they track the target pose accurately. Normal-distance iteratively reweighted least squares and distance-image iterative least squares methods are proposed to handle more general targets. For a target with an inaccurate 3D line model, bundle adjustment is adopted to track the pose through an image sequence; the method optimizes the model line parameters and the pose parameters simultaneously. In simulation experiments of satellite pose tracking, the model line orientation and position errors are 0.3° and 3.5 mm, and the mean angle and mean position errors of the pose are 0.12° and 20.1 mm. For a target with an unknown 3D model, line features are used to track the pose through an image sequence, with the model line parameters and pose parameters optimized under the structure-from-motion (SFM) framework. In simulation experiments, the reconstructed line orientation and position errors are 0.4° and 7.5 mm, and the mean angle and mean position errors of the pose are 0.16° and 23.5 mm.
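Iteratively reweighted least squares (IRLS) in general can be sketched on a toy problem. The example below is only an illustration of the reweighting idea, not the paper's normal-distance formulation: it fits a 2D line y = a*x + b using vertical residuals and Huber weights, and the data, the `delta` band, and the iteration count are all assumptions chosen for the demonstration.

```python
def weighted_line_fit(xs, ys, ws):
    """Closed-form weighted least squares for y = a*x + b."""
    sw = sum(ws)
    sx = sum(w * x for w, x in zip(ws, xs))
    sy = sum(w * y for w, y in zip(ws, ys))
    sxx = sum(w * x * x for w, x in zip(ws, xs))
    sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    a = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    b = (sy - a * sx) / sw
    return a, b

def huber_weight(residual, delta):
    """Weight 1 inside the Huber band, delta/|r| outside it."""
    r = abs(residual)
    return 1.0 if r <= delta else delta / r

def irls_line_fit(xs, ys, delta=1.0, iterations=50):
    """Start from ordinary least squares, then refit with residual-based weights."""
    ws = [1.0] * len(xs)
    a, b = weighted_line_fit(xs, ys, ws)
    for _ in range(iterations):
        ws = [huber_weight(y - (a * x + b), delta) for x, y in zip(xs, ys)]
        a, b = weighted_line_fit(xs, ys, ws)
    return a, b

# Hypothetical data: points on y = 2x + 1 with one gross outlier at the end.
XS = [float(i) for i in range(10)]
YS = [2.0 * x + 1.0 for x in XS]
YS[-1] += 50.0

if __name__ == "__main__":
    print("ordinary least squares:", weighted_line_fit(XS, YS, [1.0] * len(XS)))
    print("IRLS with Huber weights:", irls_line_fit(XS, YS))
```

The outlier keeps a small, bounded influence under Huber weights, so the IRLS fit lands close to, but not exactly on, the true line.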
Abstract: Much progress has been made recently on 2D human pose tracking with tracking-by-detection approaches. However, several challenges remain in this area because of self-occlusion and the confusion between the left and right limbs during tracking. In this work, a head orientation detection step is introduced into the tracking framework as a complementary tool to assist human pose estimation. Once the face orientation is determined, the system can decide whether the left or the right side of the human body is fully visible and infer the state of the symmetric counterpart. By granting higher priority to the completely visible side, the system can largely avoid double counting when inferring body poses. The proposed framework is evaluated on the HumanEva dataset. The results show that it largely reduces the occurrence of double counting and distinguishes the left and right sides consistently.
Funding: Supported in part by the US National Science Foundation (NSF) under Grants ECCS-1923163 and CNS-2107190, through the Wireless Engineering Research and Education Center at Auburn University.
Abstract: Three-dimensional (3D) human pose tracking has recently attracted increasing attention in the computer vision field. Real-time pose tracking is highly useful in domains such as video surveillance, somatosensory games, and human-computer interaction. However, vision-based pose tracking techniques usually raise privacy concerns, which makes human pose tracking without vision data an important problem. We therefore propose using Radio Frequency Identification (RFID) as a pose tracking technique via a low-cost wearable sensing device. Although our prior work illustrated how deep learning could translate RFID data into real-time human poses, generalization across subjects remains challenging. This paper proposes a subject-adaptive technique to address this generalization problem. In the proposed system, termed Cycle-Pose, we leverage a cross-skeleton learning structure to improve the adaptability of the deep learning model to different human skeletons. Moreover, a novel cycle kinematic network is proposed for unpaired RFID and labeled pose data from different subjects. The Cycle-Pose system is implemented and evaluated by comparing its prototype with a traditional RFID pose tracking system. The experimental results demonstrate that Cycle-Pose achieves lower estimation error and better subject generalization than the traditional system.
Abstract: In this paper, we propose a multiple-CAMShift algorithm based on a Kalman filter and weighted search windows that extracts skin-color areas and tracks several human body parts for a real-time human tracking system. The proposed CAMShift algorithm searches for the skin-color region by detecting the skin-color area against a background model. The Kalman filter stabilizes the drifting search area of the CAMShift algorithm. Occluded areas are avoided by using weighted windows for the non-search areas and the main search area. Shadows are eliminated using the background model and the shadow intensity. The proposed modified CAMShift algorithm can estimate human pose in real time and achieves 96.82% accuracy even under occlusion.
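How a Kalman filter can stabilize a jittery search-window center is sketched below. The abstract does not state the filter's state model, so this example assumes the simplest one: an independent random-walk model per coordinate, with hypothetical noise parameters `q` (process) and `r` (measurement). The window centers are made-up values standing in for per-frame tracker output.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk state model (assumed)."""

    def __init__(self, x0, p0=1.0, q=0.01, r=4.0):
        self.x = float(x0)  # state estimate
        self.p = p0         # estimate variance
        self.q = q          # process noise variance
        self.r = r          # measurement noise variance

    def step(self, z):
        """Predict, then correct with measurement z; return the new estimate."""
        self.p += self.q                 # predict: state unchanged, variance grows
        k = self.p / (self.p + self.r)   # Kalman gain
        self.x += k * (z - self.x)       # correct with the innovation
        self.p *= (1.0 - k)              # updated variance
        return self.x

def smooth_centers(centers, **params):
    """Filter the x and y coordinates of a sequence of window centers."""
    x0, y0 = centers[0]
    fx, fy = ScalarKalman(x0, **params), ScalarKalman(y0, **params)
    return [(fx.step(x), fy.step(y)) for x, y in centers]

# Hypothetical jittery window centers around (100, 50).
RAW_CENTERS = [(100, 50), (104, 47), (97, 52), (103, 49), (98, 51)]

if __name__ == "__main__":
    for raw, smoothed in zip(RAW_CENTERS, smooth_centers(RAW_CENTERS)):
        print(raw, "->", smoothed)
```

A tracker that must follow fast motion would more likely use a constant-velocity state; the random-walk model is used here only to keep the sketch short.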
Abstract: Traditional monitoring systems used in shopping malls or community management mostly rely on remote control to monitor and track specific objects; therefore, it is often impossible to monitor the entire environment effectively. When a suspicious person is found, the target cannot be locked in time for tracking. This research replaces the traditional fixed-point monitor with an intelligent drone and combines image processing with automatic judgment of the monitored person's movements. The intelligent system addresses the low efficiency and high cost of traditional monitoring systems. In this article, we propose the TIMT (The Intelligent Monitoring and Tracking) algorithm, which gives the drone smart surveillance and tracking capabilities. It combines artificial intelligence (AI) face recognition technology with OpenPose, which can monitor the physical movements of multiple people in real time, to analyze the meaning of human body movements and to track the monitored person intelligently through the remote control interface of the drone. The system is highly agile and can be adjusted immediately to any angle and view being monitored. Therefore, the system can find abnormal conditions immediately and track and monitor them automatically. That is, the system can immediately detect when someone intrudes into a home or community, and the drone can automatically track the intruder, thereby remedying the two significant shortcomings of traditional monitors. Experimental results show that the intelligent monitoring and tracking drone system performs excellently: it not only dramatically reduces the number of monitors and the required equipment but also achieves perfect monitoring and tracking.
Funding: Supported by the National Key Research and Development Program (No. 2022YFB3306100), the Aeronautical Science Fund of China (No. 2019ZE105001), and the General Project of Chongqing Natural Science Foundation (No. cstc2019jcyjmsxmX0530).
Abstract: Despite significant developments in 3D multi-view multi-person (3D MM) tracking, current frameworks target either footprint tracking or pose tracking. Frameworks designed for the former cannot be used for the latter, because they directly obtain 3D positions on the ground plane via a homography projection, which is inapplicable to 3D poses above the ground. In contrast, frameworks designed for pose tracking generally isolate multi-view and multi-frame associations and may not be sufficiently robust for footprint tracking, which uses fewer key points than pose tracking and therefore has weaker multi-view association cues in a single frame. This study presents a unified multi-view multi-person tracking framework to bridge the gap between footprint tracking and pose tracking. Without additional modifications, the framework can take monocular 2D bounding boxes and 2D poses as input to produce robust 3D trajectories for multiple persons. Importantly, multi-frame and multi-view information are jointly employed to improve association and triangulation. The framework is shown to provide state-of-the-art performance on the Campus and Shelf datasets for 3D pose tracking, with comparable results on the WILDTRACK and MMPTRACK datasets for 3D footprint tracking.
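The homography projection used by footprint trackers can be sketched as follows. This is a generic illustration, not the framework described above: the image-to-ground matrix `H_IMAGE_TO_GROUND` is a made-up example, and taking the bottom-center of a 2D bounding box as the foot point is an assumption. Because the mapping is only valid for points lying on the ground plane, it cannot place joints above the ground, which is the limitation the abstract points out.

```python
def apply_homography(H, u, v):
    """Map pixel (u, v) through the 3x3 homography H and dehomogenize."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to the line at infinity")
    return x / w, y / w

def footprint(H, box):
    """Ground-plane position from the bottom-center of a (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return apply_homography(H, 0.5 * (left + right), bottom)

# Hypothetical image-to-ground homography (pixels to meters on the ground plane).
H_IMAGE_TO_GROUND = [[0.02, 0.0, -6.4],
                     [0.0, 0.05, -12.0],
                     [0.0, 0.001, 1.0]]

if __name__ == "__main__":
    print(footprint(H_IMAGE_TO_GROUND, (400, 200, 440, 400)))
```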
Funding: Supported by the National Natural Science Foundation of China (Nos. 61872354, 61772523, 61620106003, and 61802406), the National Key R&D Program of China (No. 2019YFB2204104), the Beijing Natural Science Foundation (Nos. L182059 and Z190004), the Intelligent Science and Technology Advanced Subject Project of University of Chinese Academy of Sciences (No. 115200S001), and the Alibaba Group through the Alibaba Innovative Research Program.
Abstract: We present a novel and efficient method for real-time estimation and tracking of multiple facial poses in a single frame or video. First, we combine two standard convolutional neural network models, for face detection and mean shape learning, to generate initial estimates of alignment and pose. Then, we design a bi-objective optimization strategy to iteratively refine these estimates; this strategy yields faster and more accurate outputs. Finally, we apply algebraic filtering, including a Gaussian filter for background removal and an extended Kalman filter for target prediction, to maintain real-time tracking performance. Only general RGB photos or videos are required, captured by a commodity monocular camera without any prior or label. We demonstrate the advantages of our approach by comparing it with the most recent work in terms of performance and accuracy.