Vision-based pose stabilization of nonholonomic mobile robots has received extensive attention. At present, most solutions to this problem do not take the robot dynamics into account in the controller design, so the resulting controllers are difficult to realize with satisfactory performance in practical applications. Moreover, many approaches suffer from initial speed and torque jumps, which are impractical in the real world. Considering both kinematics and dynamics, a two-stage visual controller for solving the stabilization problem of a mobile robot is presented, integrating adaptive control, sliding-mode control, and neural dynamics. In the first stage, an adaptive kinematic stabilization controller that generates the velocity command is developed based on Lyapunov theory. In the second stage, adopting the sliding-mode control approach, a dynamic controller with a variable speed function for reducing chattering is designed; it generates the torque command that makes the actual velocity of the mobile robot asymptotically reach the desired velocity. Furthermore, to handle the speed and torque jump problems, a neural dynamics model is integrated into the above controllers. The stability of the proposed control system is analyzed using Lyapunov theory. Finally, the control law is simulated in the perturbed case, and the results show that the control scheme solves the stabilization problem effectively. The proposed control law eliminates the speed and torque jump problems, overcomes external disturbances, and provides a new solution for vision-based stabilization of mobile robots.
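To make the jump-suppression idea concrete, here is a minimal sketch of the shunting neural-dynamics model often used for this purpose: the raw command from the kinematic controller drives a bounded first-order nonlinear filter whose output starts at zero and evolves smoothly, so no step change reaches the robot. The gains A, B, D and the Euler step are illustrative assumptions, not values from the paper.

```python
import numpy as np

def shunting_step(v, e, A=2.0, B=1.0, D=1.0, dt=0.01):
    """One Euler step of the shunting neural-dynamics model.

    v : current (smoothed) command, bounded in [-D, B]
    e : raw command from the kinematic controller (the "input")
    """
    f = max(e, 0.0)          # excitatory part of the input
    g = max(-e, 0.0)         # inhibitory part of the input
    dv = -A * v + (B - v) * f - (D + v) * g
    return v + dt * dv

# A step command of 0.8 m/s: the filtered output rises smoothly from 0,
# so the robot sees no initial velocity jump.
v = 0.0
for _ in range(300):
    v = shunting_step(v, 0.8)
print(f"smoothed command after 3 s: {v:.3f} m/s")
```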
This paper presents a novel neural-fuzzy-based adaptive sliding mode automatic steering control strategy to improve the driving performance of vision-based unmanned electric vehicles with time-varying and uncertain parameters. First, kinematic and dynamic models that accurately express the steering behavior of the vehicle are constructed, revealing the relationship between the look-ahead time and the vehicle velocity. Then, to overcome external disturbances, parametric uncertainties, and the time-varying features of vehicles, a neural-fuzzy-based adaptive sliding mode automatic steering controller is proposed to supervise the lateral dynamic behavior of unmanned electric vehicles; it comprises an equivalent control law and an adaptive variable structure control law. In this automatic steering control system, a neural network approximates the switching control gain of the variable structure control law, and a fuzzy inference system adjusts the thickness of the boundary layer in real time. The stability of the closed-loop neural-fuzzy-based adaptive sliding mode automatic steering control system is proven using Lyapunov theory. Finally, the results illustrate that the presented control scheme has excellent properties in terms of error convergence and robustness.
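As a rough illustration of the controller structure, the sketch below forms the steering command from an equivalent term plus a boundary-layer switching term. The fixed gain K stands in for the neural-network approximation and the fixed phi for the fuzzy-tuned boundary-layer thickness; both are assumptions for the example, not the paper's adaptive laws.

```python
import numpy as np

def smc_steering(s, u_eq, K=1.5, phi=0.1):
    """Sliding-mode steering command with a boundary layer.

    s    : sliding variable (e.g., weighted lateral + heading error)
    u_eq : equivalent control term from the nominal vehicle model
    K    : switching gain -- approximated online by a neural network
           in the paper; a fixed value is assumed here
    phi  : boundary-layer thickness -- tuned in real time by the fuzzy
           inference system in the paper; fixed here
    """
    sat = np.clip(s / phi, -1.0, 1.0)   # saturation replaces sign() to cut chattering
    return u_eq - K * sat
```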
In dynamic environments, moving landmarks can degrade the accuracy of traditional vision-based pose estimation or even cause it to fail. To solve this problem, a robust Gaussian mixture model for vision-based pose estimation is proposed. A motion index is added to the traditional graph-based pose estimation model to describe each landmark's moving probability, transforming the classic Gaussian model into a Gaussian mixture model, which reduces the influence of moving landmarks on the optimization results. To improve the algorithm's robustness to noise, a covariance inflation model is employed in the residual equations. The expectation-maximization method for solving the Gaussian mixture problem is derived in detail, transforming the problem into a classic iterative least-squares problem. Experimental results demonstrate that in dynamic environments the proposed method outperforms the traditional method in both absolute and relative accuracy, while maintaining high accuracy in static environments. The proposed method effectively reduces the influence of moving landmarks in dynamic environments, making it better suited to the autonomous localization of mobile robots.
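The core mechanism, EM turning a mixture likelihood into reweighted least squares, can be shown on a toy problem: estimating a 2D offset from landmark observations when some landmarks have moved. The mixture prior and the two sigmas below are illustrative assumptions, not the paper's values.

```python
import numpy as np

def em_robust_offset(deltas, n_iter=20, sigma_in=0.05, sigma_out=1.0, prior_in=0.8):
    """Toy EM illustration: estimate a 2D offset from landmark displacement
    observations `deltas` (N x 2), where some landmarks moved.

    Each observation is modeled by a two-component Gaussian mixture
    (static vs. moving landmark): the E-step yields a per-landmark
    inlier responsibility, the M-step is a weighted least-squares fit.
    """
    x = deltas.mean(axis=0)                       # initial estimate
    for _ in range(n_iter):
        r2 = np.sum((deltas - x) ** 2, axis=1)    # squared residuals
        p_in = prior_in * np.exp(-0.5 * r2 / sigma_in**2) / sigma_in**2
        p_out = (1 - prior_in) * np.exp(-0.5 * r2 / sigma_out**2) / sigma_out**2
        w = p_in / (p_in + p_out + 1e-12)         # E-step: inlier responsibility
        x = (w[:, None] * deltas).sum(axis=0) / w.sum()   # M-step: weighted LS
    return x, w
```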
This paper presents a novel vision-based localization algorithm from three-line structures (TLS). Two types of TLS are investigated: 1) three parallel lines (Structure I); 2) two parallel lines and one orthogonal line (Structure II). From a single image of either structure, the camera pose can be uniquely computed for vision localization. The contributions of this paper are as follows: 1) both TLS structures can be used as simple and practical landmarks, which are widely available in daily life; 2) the proposed algorithm complements existing localization methods, which usually use complex landmarks, especially under partial blockage; 3) compared with the general Perspective-3-Lines (P3L) problem, the camera pose can be uniquely computed from either structure. The proposed algorithm has been tested with both simulation and real image data. For a typical simulated indoor condition (75 cm landmark, less than 7.0 m landmark-to-camera distance, and 0.5-pixel image noise), the mean localization errors from Structure I and Structure II are less than 3.0 cm, and the standard deviations are less than 3.0 cm and 1.5 cm, respectively. The algorithm is further validated with two real-image experiments. Within a 7.5 m × 7.5 m indoor scene, the overall relative localization errors from Structure I and Structure II are less than 2.2% and 2.3%, respectively, at about 6.0 m distance. The results demonstrate that the algorithm works well for practical vision localization.
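One geometric building block that P3L-style methods of this kind rest on: the images of two parallel scene lines meet at a vanishing point whose back-projection gives the lines' common 3D direction. A minimal sketch of that step (this is not the paper's full pose solver):

```python
import numpy as np

def parallel_line_direction(l1, l2, K):
    """Recover the 3D direction of two parallel scene lines from their
    images, a standard building block of P3L-style localization.

    l1, l2 : homogeneous line coefficients (a, b, c) of the two image lines
    K      : 3x3 camera intrinsic matrix
    """
    v = np.cross(l1, l2)               # vanishing point (homogeneous)
    d = np.linalg.inv(K) @ v           # back-project to a ray direction
    return d / np.linalg.norm(d)       # unit 3D direction in the camera frame
```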
EyeScreen is a vision-based interaction system that provides a natural gesture interface for human-computer interaction (HCI) by tracking human fingers and recognizing gestures. Multi-view video images are captured by two cameras facing a computer screen, which makes it possible to detect the clicking actions of a fingertip and improves the recognition rate. The system enables users to directly interact with rendered objects on the screen. The robustness of the system has been verified by extensive experiments with different user scenarios. EyeScreen can be used in many applications such as intelligent interaction and digital entertainment.
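Click detection from two views essentially reduces to triangulating the fingertip and checking its distance to the screen plane. A minimal sketch, assuming the 3×4 projection matrices P1 and P2 of the two cameras are available from a prior stereo calibration:

```python
import cv2
import numpy as np

def fingertip_3d(P1, P2, uv1, uv2):
    """Triangulate the fingertip from the two screen-facing cameras; the
    recovered depth relative to the screen plane is what enables click
    detection. P1/P2 are 3x4 float projection matrices from an assumed
    prior stereo calibration; uv1/uv2 are the fingertip pixels."""
    pts4 = cv2.triangulatePoints(P1, P2,
                                 np.asarray(uv1, float).reshape(2, 1),
                                 np.asarray(uv2, float).reshape(2, 1))
    return (pts4[:3] / pts4[3]).ravel()   # Euclidean 3D point
```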
Recovering from multiple traumatic brain injuries (TBI) is a very difficult task, depending on the severity of the lesions, the affected parts of the brain, and the level of damage (locomotor, cognitive, or sensory). Although some software platforms exist to help these patients recover part of the lost capacity, the variety of existing lesions and the differing degrees to which they affect each patient do not allow the appropriate treatments and tools to be generalized. The aim of this work is to design and evaluate a machine-vision-based UI (user interface) allowing patients with a high level of injury to interact with a computer. This UI is both a tool for the therapy they follow and a way to communicate with their environment. The interface provides a set of specific activities, developed in collaboration with the multidisciplinary team that is currently evaluating each patient, to be used as part of the therapy they receive. The system has been successfully tested with two patients whose degree of disability prevents them from using other types of platforms.
This paper discusses the uncooperative target tracking control problem for an unmanned aerial vehicle (UAV) under a performance constraint and a scaled relative velocity constraint, in which the states of the uncooperative target can only be estimated through a vision sensor. Considering the limited detection range, a prescribed performance function is designed to ensure the transient and steady-state performance of the tracking system. Meanwhile, the scaled relative velocity constraint in the dynamic phase is taken into account, and a time-varying nonlinear transformation is used to solve the constraint problem, which not only removes the usual feasibility condition but also guarantees that the constraint boundaries are never violated. Finally, the practically prescribed-time stability technique is incorporated into the controller design procedure to guarantee that all signals within the closed-loop system are bounded. It is proved that the UAV can follow the uncooperative target at the desired relative position within a prescribed time, thereby improving the applicability of the vision-based tracking approach. Simulation results are presented to prove the validity of the proposed control strategy.
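A hedged sketch of the two ingredients named above: an exponentially decaying performance envelope and the error transformation that maps the constrained tracking error to an unconstrained variable. The envelope parameters are illustrative assumptions, not the paper's values.

```python
import numpy as np

def perf_envelope(t, rho0=2.0, rho_inf=0.1, decay=1.0):
    """Exponentially decaying prescribed-performance envelope rho(t);
    the tracking error is required to stay inside (-rho(t), rho(t))."""
    return (rho0 - rho_inf) * np.exp(-decay * t) + rho_inf

def transformed_error(e, rho):
    """Map a constrained error e in (-rho, rho) to an unconstrained
    variable; stabilizing the transformed system then keeps the original
    error inside the envelope automatically."""
    z = e / rho
    return 0.5 * np.log((1 + z) / (1 - z))
```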
Vision-based target motion estimation based on Kalman filtering or least-squares estimators is important in many tasks such as vision-based swarming and vision-based target pursuit. In this paper, we focus on a problem that is very specific yet, we believe, important: from the vision measurements, various derived measurements can be formulated; which measurements should be used, and how? These questions are fundamental, but we notice that practitioners usually do not pay special attention to them and often make mistakes. Motivated by this, we formulate three pseudo-linear measurements based on the bearing and angle measurements, which are the standard vision measurements that can be obtained. Different estimators based on Kalman filtering and least-squares estimation are established and compared in numerical experiments. It is revealed that correctly analyzing the covariance of the noise is critical for the Kalman-filtering-based estimators. When the variance of the original measurement noise is unknown, the pseudo-linear least-squares estimator with the smallest magnitude of transformed noise can be a good choice.
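As an example of a pseudo-linear formulation, the sketch below localizes a stationary target from bearing-only measurements: projecting each ray constraint with I - g gᵀ makes the measurement linear in the unknown position, at the cost of transforming the noise (the effect the paper analyzes). The angle-measurement variants are omitted.

```python
import numpy as np

def bearing_only_pls(positions, bearings):
    """Pseudo-linear least-squares localization of a stationary target
    from bearing measurements.

    positions : N x 3 observer positions p_i
    bearings  : N x 3 unit bearing vectors g_i toward the target
    Each measurement gives (I - g_i g_i^T) x = (I - g_i g_i^T) p_i,
    which is linear in the unknown target position x.
    """
    A_blocks, b_blocks = [], []
    for p, g in zip(positions, bearings):
        P = np.eye(3) - np.outer(g, g)   # projector onto g's orthogonal complement
        A_blocks.append(P)
        b_blocks.append(P @ p)
    A = np.vstack(A_blocks)
    b = np.concatenate(b_blocks)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x
```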
The two topics of this article seem to have absolutely nothing to do with each other and, as can be expected in a contribution in honor and memory of Prof. Fritz Ackermann, they are linked in his person. Vision-based Navigation was the focus of the doctoral thesis written by the author, the 29th and last PhD thesis supervised by Prof. Ackermann. The International Master's Program Photogrammetry and Geoinformatics, which the author established with colleagues at Stuttgart University of Applied Sciences (HfT Stuttgart) in 1999, was a consequence of Prof. Ackermann's benevolent promotion of international knowledge transfer in teaching. Both topics are reflected in this article; they provide further splashes of color in Prof. Ackermann's oeuvre.
This paper describes methods for UAV autonomous landing on a moving target. GPS navigation and vision-based navigation are employed during different stages of autonomous landing in a simulation environment and virtual reality. Estimating uncertain markers is the main step of UAV autonomous landing; it comprises convex hull transformation, interference preclusion, ellipse fitting, and specific feature matching. Furthermore, the complete visual measurement program and guidance strategy are proposed. Comprehensive experiments indicate the significance and feasibility of the vision-based method for UAV autonomous landing on a moving target.
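A hedged sketch of the marker-extraction chain (contour, convex hull, ellipse fit) using OpenCV; the interference-preclusion and feature-matching stages are omitted, and the OpenCV 4 contour API is assumed.

```python
import cv2
import numpy as np

def fit_marker_ellipse(binary_mask):
    """Extract the landing-marker ellipse from a binary segmentation mask:
    largest contour -> convex hull -> ellipse fit."""
    contours, _ = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    cnt = max(contours, key=cv2.contourArea)     # keep the largest blob
    hull = cv2.convexHull(cnt)
    if len(hull) < 5:                            # fitEllipse needs >= 5 points
        return None
    (cx, cy), (ax1, ax2), angle = cv2.fitEllipse(hull)
    return (cx, cy), (ax1, ax2), angle
```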
Since GPS signals are unavailable for indoor navigation, current research mainly focuses on vision-based locating with a single mark. An obvious disadvantage of this approach is that locating fails when the mark cannot be seen. Using multiple marks can solve this problem; however, the extra process of designing and identifying different marks significantly increases system complexity. In this paper, a novel vision-based locating method is proposed that uses marks with feature points arranged in a radial shape. The feature points of the marks consist of inner points and outer points. The positions of the inner points are the same in all marks, while the positions of the outer points differ between marks. Unlike traditional camera locating methods (the PnP methods), the proposed method can calculate the camera location and the positions of the outer points simultaneously. The calculated positions of the outer points are then used to identify the mark. This method makes navigation with multiple marks more efficient. Simulations and real-world experiments are carried out, and their results show that the proposed method is fast, accurate, and robust to noise.
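For reference, the classic PnP baseline that the paper contrasts with looks like the sketch below: it needs all 3D feature positions known in advance, which is exactly the requirement the proposed method relaxes for the outer points. OpenCV's solvePnP is used for illustration.

```python
import cv2
import numpy as np

def locate_camera_pnp(object_pts, image_pts, K, dist=None):
    """Classic PnP camera localization from known 3D mark points.

    object_pts : N x 3 known feature positions in the mark frame
    image_pts  : N x 2 detected image projections
    K          : 3x3 camera intrinsic matrix
    """
    ok, rvec, tvec = cv2.solvePnP(object_pts.astype(np.float64),
                                  image_pts.astype(np.float64), K, dist)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)
    cam_pos = -R.T @ tvec                # camera center in the mark frame
    return R, tvec, cam_pos
```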
Space manipulators play an increasingly important role in space exploration due to their flexibility and versatility. This paper designs a vision-based pose measurement system for a four-degree-of-freedom (4-DOF) lunar surface sampling manipulator relying on a monitoring camera and several fiducial markers. The system first applies double-plateau histogram equalization to the markers to improve robustness to varying noise and illumination. The markers are then extracted with sub-pixel accuracy based on template matching and curved-surface fitting. Finally, given the camera parameters and 3D reference points, the pose of the manipulator end-effector is solved from the 3D-to-2D point correspondences by combining a plane-based pose estimation method with a rigid-body transformation. Experimental results show that the system achieves high-precision positioning and orientation performance: the measurement error is within 3 mm in position and 0.2° in orientation, meeting the requirements for space manipulator operations.
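The sub-pixel step can be approximated by fitting a quadratic surface to the neighborhood of the best template-matching response; the sketch below uses the standard 1D parabola-vertex formula along each axis, a simple stand-in for the paper's curved-surface fitting.

```python
import numpy as np

def subpixel_peak(score):
    """Refine a template-matching peak to sub-pixel accuracy.

    score : 2D array of matching responses (higher is better).
    Returns (row, col) of the refined peak.
    """
    r, c = np.unravel_index(np.argmax(score), score.shape)
    if not (0 < r < score.shape[0] - 1 and 0 < c < score.shape[1] - 1):
        return float(r), float(c)        # peak on the border: no refinement
    # parabola-vertex offset along each axis around the integer peak
    dr = 0.5 * (score[r - 1, c] - score[r + 1, c]) / (
        score[r - 1, c] - 2 * score[r, c] + score[r + 1, c])
    dc = 0.5 * (score[r, c - 1] - score[r, c + 1]) / (
        score[r, c - 1] - 2 * score[r, c] + score[r, c + 1])
    return r + dr, c + dc
```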
Recently, vision-based gesture recognition (VGR) has become an active research topic in human-computer interaction (HCI). Unlike gesture recognition methods that rely on data gloves or other wearable sensors, vision-based gesture recognition can lead to more natural and intuitive HCI. This paper reviews state-of-the-art vision-based gesture recognition methods across the stages of the gesture recognition process, i.e., (1) image acquisition and pre-processing, (2) gesture segmentation, (3) gesture tracking, (4) feature extraction, and (5) gesture classification. The paper also analyzes the advantages and disadvantages of these methods in detail. Finally, the challenges of vision-based gesture recognition in haptic rendering and future research directions are discussed.
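As a concrete instance of stage (2), the sketch below performs the skin-color thresholding that many surveyed segmentation methods start from; the YCrCb range is a commonly cited rule of thumb, not a value from the paper.

```python
import cv2
import numpy as np

def segment_hand_ycrcb(frame_bgr):
    """Skin-color gesture segmentation in YCrCb, one common first stage
    of a vision-based gesture recognition pipeline."""
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    mask = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))
    # small morphological opening to remove speckle noise
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    return mask
```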
This paper presents algorithms for vision-based tracking and classification of vehicles in image sequences of traffic scenes recorded by a stationary camera. In the algorithms, central moments and an extended Kalman filter optimize the computational resources spent during tracking. Moreover, the tracker is robust to many difficult situations such as partial or full occlusions of vehicles. Vehicle classification performance is improved by a Bayesian network, especially with incomplete data. The methods are tested on a single Intel Pentium 4 processor at 2.4 GHz with a frame rate of 25 frames/s. Experimental results from highway scenes are provided, which demonstrate the effectiveness and robustness of the methods.
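The central moments used as tracking features are straightforward to compute from a segmented vehicle blob; a pure-numpy sketch (cv2.moments would return the same quantities):

```python
import numpy as np

def central_moment(mask, p, q):
    """Central moment mu_pq of a binary vehicle blob, a translation-
    invariant shape feature usable alongside an extended Kalman filter.

    mask : 2D boolean/0-1 array marking the vehicle pixels
    """
    ys, xs = np.nonzero(mask)
    x_bar, y_bar = xs.mean(), ys.mean()          # blob centroid
    return np.sum((xs - x_bar) ** p * (ys - y_bar) ** q)
```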
Conventional outdoor navigation systems are usually based on orbital satellites, e.g., the global positioning system (GPS) and the global navigation satellite system (GLONASS). The latest advances in wearables, e.g., BaiduEye and Google Glass, have enabled new approaches that leverage information from the surrounding environment: for example, they enable the change from passively receiving information to actively requesting it. Such changes may inspire brand-new application scenarios that were not possible before. In this work, we propose a vision-based navigation system built on wearables like BaiduEye. We discuss the associated challenges and propose potential solutions for each of them. The system utilizes crowd sensing to collect and build a traffic-signpost database for positioning reference. It then leverages context information, such as cell identification (Cell ID), signal strength, and altitude, combined with traffic sign detection and recognition to enable real-time positioning. A hybrid cloud architecture is proposed to enhance the capability of the sensing devices (SD) and realize the proposed vision.
Non-contact sensing can be a rapid and convenient alternative to conventional instrumentation for determining structural response. Computer vision has been broadly implemented to enable accurate non-contact dynamic response measurements for structures. This study analyzed the effect of non-contact sensors, including sensor type, frame rate, and data collection platform, on the performance of a novel motion detection technique. Video recordings of a cantilever column were collected using a high-speed camera mounted on a tripod and an unmanned aerial system (UAS) equipped with visual and thermal sensors. The test specimen was subjected to an initial deformation and released. Specimen acceleration data were collected using an accelerometer installed at the cantilever end. The displacement from each non-contact sensor and the acceleration from the contact sensor were analyzed to measure the specimen's natural frequency and damping ratio. The specimen's first fundamental frequency and damping ratio were validated against acceleration data from the top of the specimen and a finite element model.
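The two modal quantities compared across sensors can be extracted from a free-decay displacement record as sketched below: the natural frequency from the dominant FFT peak and the damping ratio from the logarithmic decrement. This is a generic identification recipe under those standard assumptions, not necessarily the study's exact procedure.

```python
import numpy as np

def freq_and_damping(x, fs):
    """Estimate the first natural frequency and damping ratio from a
    free-decay displacement record.

    x  : 1D numpy array of displacement samples after release
    fs : sampling rate (the camera frame rate for the vision data)
    """
    # natural frequency from the dominant FFT peak (DC bin skipped)
    X = np.abs(np.fft.rfft(x - x.mean()))
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    fn = freqs[np.argmax(X[1:]) + 1]

    # damping ratio from the logarithmic decrement of positive peaks
    peaks = [i for i in range(1, len(x) - 1)
             if x[i] > x[i - 1] and x[i] > x[i + 1] and x[i] > 0]
    delta = np.log(x[peaks[0]] / x[peaks[-1]]) / (len(peaks) - 1)
    zeta = delta / np.sqrt(4 * np.pi**2 + delta**2)
    return fn, zeta
```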
Recent advances in computer vision and deep learning have shown that fusing depth information can significantly enhance the performance of RGB-based damage detection and segmentation models. However, alongside these advantages, depth sensing also presents many practical challenges. For instance, depth sensors impose an additional payload burden on robotic inspection platforms, limiting operation time and increasing inspection cost. Additionally, some lidar-based depth sensors perform poorly outdoors due to sunlight contamination during the daytime. In this context, this study investigates the feasibility of abolishing depth sensing at test time without compromising segmentation performance. An autonomous damage segmentation framework is developed, based on recent advancements in vision-based multi-modal sensing such as modality hallucination (MH) and monocular depth estimation (MDE), which require depth data only during model training. At deployment time, depth data becomes expendable because it can be simulated from the corresponding RGB frames. This makes it possible to reap the benefits of depth fusion without any depth perception per se. The study explored two depth encoding techniques and three fusion strategies in addition to a baseline RGB-based model. The proposed approach is validated on computer-generated RGB-D data of reinforced concrete buildings subjected to seismic damage. The surrogate techniques were observed to increase the segmentation IoU by up to 20.1% with a negligible increase in computation cost. Overall, this study is believed to make a positive contribution to enhancing the resilience of critical civil infrastructure.
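For reference, the metric behind the reported gain can be computed as below; this is the standard per-class segmentation IoU, not code from the study.

```python
import numpy as np

def segmentation_iou(pred, gt, n_classes):
    """Per-class intersection-over-union for semantic segmentation.

    pred, gt : integer label maps of equal shape
    Returns an array of IoU values, NaN for classes absent from both maps.
    """
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        ious.append(inter / union if union else np.nan)
    return np.array(ious)
```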
Optical and visual measurement technology is used widely in fields that involve geometric measurements, among which are laser and vision-based displacement measuring modules (LVDMMs). The displacement transformation coefficient (DTC) of an LVDMM changes with the coordinates in the camera image coordinate system during the displacement measuring process, and these changes affect the displacement measurement accuracy of LVDMMs in the full field of view (FFOV). To give LVDMMs higher accuracy in the FFOV and make them adaptable to widely varying measurement demands, a new calibration method is proposed to improve the displacement measurement accuracy of LVDMMs in the FFOV. First, an image coordinate system, a pixel measurement coordinate system, and a displacement measurement coordinate system are established on the laser receiving screen of the LVDMM. In addition, marker spots in the FFOV are selected, and the DTCs at the marker spots are obtained from calibration experiments. A fitting method based on locally weighted scatterplot smoothing (LOWESS) is then used to obtain the distribution functions of the DTCs in the FFOV from the DTCs at the marker spots. Finally, the calibrated distribution functions of the DTCs are applied to the LVDMM, and experiments verifying the displacement measurement accuracies are reported. The results show that the FFOV measurement accuracies for horizontal and vertical displacements are better than ±15 μm and ±19 μm, respectively, and that for oblique displacement is better than ±24 μm. Compared with the traditional calibration method, the displacement measurement error in the FFOV is now 90% smaller. This improved calibration method has clear significance for improving the measurement accuracy of LVDMMs in the FFOV, and it provides a new method and idea for other vision-based fields in which camera parameters must be calibrated.
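A minimal illustration of the LOWESS idea used for the DTC distribution functions: tricube-weighted local linear regression through the marker-spot calibration data. The sketch is 1D and the neighborhood fraction is an assumption; the paper fits over the 2D image plane.

```python
import numpy as np

def lowess_fit(x_marks, dtc_marks, x_query, frac=0.5):
    """Tricube-weighted local linear regression (the LOWESS idea) for
    interpolating displacement transformation coefficients (DTCs)
    between calibrated marker spots."""
    x_marks = np.asarray(x_marks, float)
    dtc_marks = np.asarray(dtc_marks, float)
    k = max(2, int(np.ceil(frac * len(x_marks))))       # local neighborhood size
    out = []
    for xq in np.atleast_1d(x_query):
        d = np.abs(x_marks - xq)
        idx = np.argsort(d)[:k]                         # k nearest marker spots
        w = (1 - (d[idx] / (d[idx].max() + 1e-12)) ** 3) ** 3   # tricube weights
        A = np.vstack([np.ones(k), x_marks[idx]]).T     # local linear model
        AtW = A.T * w
        beta = np.linalg.solve(AtW @ A + 1e-9 * np.eye(2), AtW @ dtc_marks[idx])
        out.append(beta[0] + beta[1] * xq)
    return np.array(out)
```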
Intelligent vision-based surveillance systems are designed to deal with the gigantic volume of video captured in a given environment and to interpret scenes through detection, tracking, monitoring, behavioral analysis, and retrieval. Another evolving form of surveillance is human gait-based surveillance. In existing research, several methodological frameworks use deep learning or traditional methods; nevertheless, the accuracy of these methods drops substantially when they are subjected to covariate conditions. These covariate variables disrupt the gait features, making subject recognition difficult. To handle these issues, a region-based triplet-branch convolutional neural network (CNN) is proposed in this research. It focuses on different parts of the human Gait Energy Image (GEI), namely the head, legs, and body, separately to classify the subjects, and the final identification is decided by probability-based majority voting. Moreover, to enhance feature extraction and draw out discriminative features, soft attention layers are added on each branch to generate soft attention maps. The proposed model is validated on the CASIA-B database, and the findings indicate that part-based learning through the triplet-branch CNN achieves good performance of 72.98% under covariate conditions and also outperforms single-branch CNN models.
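One plausible reading of the probability-based majority voting stage is sketched below: the three branch softmax outputs vote by top-1 class, with summed probabilities breaking a three-way disagreement. This is an illustrative interpretation, not the paper's exact rule.

```python
import numpy as np

def vote_identity(p_head, p_legs, p_body):
    """Fuse the three GEI-part branches (head, legs, body) into one
    subject identity. Each input is a softmax vector over subjects."""
    votes = [int(np.argmax(p)) for p in (p_head, p_legs, p_body)]
    ids, counts = np.unique(votes, return_counts=True)
    if counts.max() >= 2:                    # at least two branches agree
        return int(ids[np.argmax(counts)])
    return int(np.argmax(p_head + p_legs + p_body))   # fall back to summed scores
```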
This paper introduces a new algorithm for estimating the relative pose of a moving camera using consecutive frames of a video sequence. State-of-the-art algorithms for calculating the relative pose between two images use matching features to estimate the essential matrix, which is then decomposed into the relative rotation and normalized translation between frames. To be robust to noise and feature-match outliers, these methods generate a large number of essential matrix hypotheses from randomly selected minimal subsets of feature pairs and then score these hypotheses on all feature pairs. Alternatively, the algorithm introduced in this paper calculates relative pose hypotheses by directly optimizing the rotation and normalized translation between frames, rather than calculating the essential matrix and then performing the decomposition. The resulting algorithm improves computation time by an order of magnitude. If an inertial measurement unit (IMU) is available, it is used to seed the optimizer; in addition, the best hypothesis at each iteration is reused to seed the optimizer, reducing the number of relative pose hypotheses that must be generated and scored. These advantages greatly speed up performance and enable the algorithm to run in real time on low-cost embedded hardware. We show the application of our algorithm to visual multi-target tracking (MTT) in the presence of parallax and demonstrate its real-time performance on a 640 × 480 video sequence captured on a UAV. Video results are available at https://youtu.be/HhK-p2hXNnU.
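The hypothesize-and-score loop described above needs a cheap way to score an (R, t) hypothesis on all feature pairs; a common choice is the Sampson epipolar error, sketched below with the essential matrix formed only for scoring. The inlier threshold is an assumed value for normalized image coordinates.

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix so that skew(t) @ x == cross(t, x)."""
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

def score_hypothesis(R, t, pts1, pts2, thresh=1e-4):
    """Count inliers of a (rotation, normalized translation) hypothesis
    via the Sampson epipolar error.

    pts1, pts2 : N x 3 normalized homogeneous image points (K removed)
    """
    E = skew(t / np.linalg.norm(t)) @ R     # essential matrix of the hypothesis
    Ex1 = pts1 @ E.T                        # rows are E @ x1_i
    Etx2 = pts2 @ E                         # rows are E^T @ x2_i
    num = np.sum(pts2 * Ex1, axis=1) ** 2   # (x2_i^T E x1_i)^2
    den = Ex1[:, 0]**2 + Ex1[:, 1]**2 + Etx2[:, 0]**2 + Etx2[:, 1]**2
    sampson = num / den
    return int(np.sum(sampson < thresh))
```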