Funding: the National Natural Science Foundation of China (No. 62063006); the Natural Science Foundation of Guangxi Province (No. 2023GXNSFAA026025); the Innovation Fund of Chinese Universities Industry-University-Research (ID: 2021RYC06005); the Research Project for Young and Middle-Aged Teachers in Guangxi Universities (ID: 2020KY15013); the Special Research Project of Hechi University (ID: 2021GCC028); financially supported by the Project of Outstanding Thousand Young Teachers' Training in Higher Education Institutions of Guangxi, Guangxi Colleges and Universities Key Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region.
Abstract: Dynamic Simultaneous Localization and Mapping (SLAM) in visual scenes is currently a major research area in fields such as robot navigation and autonomous driving. However, in the face of complex real-world environments, current dynamic SLAM systems struggle to achieve precise localization and map construction. With the advancement of deep learning, there has been increasing interest in deep-learning-based dynamic SLAM visual odometry in recent years, and more researchers are turning to deep learning techniques to address the challenges of dynamic SLAM. Compared to dynamic SLAM systems based on deep learning methods such as object detection and semantic segmentation, dynamic SLAM systems based on instance segmentation can not only detect dynamic objects in the scene but also distinguish different instances of the same type of object, thereby reducing the impact of dynamic objects on the SLAM system's positioning. This article not only introduces traditional dynamic SLAM systems based on mathematical models but also provides a comprehensive analysis of existing instance segmentation algorithms and dynamic SLAM systems based on instance segmentation, comparing and summarizing their advantages and disadvantages. Through comparisons on datasets, it is found that instance segmentation-based methods have significant advantages in accuracy and robustness in dynamic environments. However, the real-time performance of instance segmentation algorithms hinders the widespread application of dynamic SLAM systems. In recent years, the rapid development of single-stage instance segmentation methods has brought hope for the widespread application of dynamic SLAM systems based on instance segmentation. Finally, possible future research directions and improvement measures are discussed for reference by relevant professionals.
Funding: Supported by the National Natural Science Foundation of China (61501034).
Abstract: In this paper, a semi-direct visual odometry and mapping system with an RGB-D camera is proposed, which combines the merits of both feature-based and direct methods. The presented system directly estimates the camera motion between two consecutive RGB-D frames by minimizing the photometric error. To tolerate outliers and noise, a robust sensor model built upon the t-distribution and an error function mixing depth and photometric errors are used to enhance accuracy and robustness. Local graph optimization based on keyframes is used to reduce the accumulated error and refine the local map. The loop closure detection method, which combines an appearance similarity method with spatial location constraints, increases the speed of detection. Experimental results demonstrate that the proposed approach achieves higher accuracy in motion estimation and environment reconstruction than other state-of-the-art methods. Moreover, the proposed approach works in real time on a laptop without a GPU, which makes it attractive for robots equipped with limited computational resources.
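The t-distribution sensor model mentioned above can be sketched as an iteratively reweighted least-squares weight: residuals that are large relative to a robust scale estimate receive small weights, so outlier pixels barely influence the photometric error. The following is a minimal sketch of that idea, not the authors' implementation; the degrees-of-freedom value and the MAD-based scale estimate are illustrative assumptions.

```python
import numpy as np

def t_distribution_weights(residuals, dof=5.0):
    """IRLS weights from a Student-t sensor model.

    Residuals that are large relative to the robust scale estimate get
    small weights, so outliers and dynamic pixels contribute little to
    the photometric error term.
    """
    # Robust scale via the median absolute deviation (1.4826 * MAD
    # approximates the standard deviation for Gaussian inliers).
    sigma = 1.4826 * np.median(np.abs(residuals - np.median(residuals)))
    sigma = max(sigma, 1e-6)
    u = (residuals / sigma) ** 2
    return (dof + 1.0) / (dof + u)

def weighted_photometric_error(residuals, weights):
    """Total error actually minimized: sum of weighted squared residuals."""
    return float(np.sum(weights * residuals ** 2))
```

In an alignment loop, the weights are recomputed from the current residuals at each iteration, so a pixel flagged as an outlier early on can recover weight as the pose estimate improves.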
Funding: Supported in part by the National Natural Science Foundation of China (Grant Nos. U1913201, U22B2041) and the Natural Science Foundation of Liaoning Province (Grant No. 2019-ZD-0169).
Abstract: Simultaneous localisation and mapping (SLAM) is the basis for many robotic applications. As the front end of SLAM, visual odometry is mainly used to estimate camera pose. In dynamic scenes, classical methods are degraded by dynamic objects and cannot achieve satisfactory results. To improve the robustness of visual odometry in dynamic scenes, this paper proposes a dynamic region detection method based on RGB-D images. First, all feature points on the RGB image are classified as dynamic or static using a triangle constraint and the epipolar geometric constraint in succession. Meanwhile, the depth image is clustered using the K-Means method. The classified feature points are mapped onto the clustered depth image, and a dynamic or static label is assigned to each cluster according to the number of dynamic feature points it contains. Subsequently, a dynamic region mask for the RGB image is generated from the dynamic clusters in the depth image, and all feature points covered by the mask are removed. The remaining static feature points are used to estimate the camera pose. Finally, experimental results are provided to demonstrate the feasibility and performance of the method.
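The epipolar geometric constraint used for this kind of dynamic/static classification can be sketched as follows: given the fundamental matrix F between two frames, a static point in the second frame should lie close to the epipolar line of its match, so a large point-to-line distance flags a candidate dynamic point. A minimal sketch (the threshold is an illustrative assumption, and the triangle constraint and clustering stages are omitted):

```python
import numpy as np

def epipolar_residuals(pts1, pts2, F):
    """Distance from each point in frame 2 to its epipolar line l = F @ p1.

    Static points should lie near their epipolar lines; a large residual
    flags a candidate dynamic point.
    """
    ones = np.ones((pts1.shape[0], 1))
    p1 = np.hstack([pts1, ones])          # homogeneous coordinates
    p2 = np.hstack([pts2, ones])
    lines = (F @ p1.T).T                  # epipolar lines in image 2
    num = np.abs(np.sum(lines * p2, axis=1))
    den = np.hypot(lines[:, 0], lines[:, 1])
    return num / np.maximum(den, 1e-12)

def label_dynamic(pts1, pts2, F, threshold=1.0):
    """Boolean mask: True where a feature point violates the constraint."""
    return epipolar_residuals(pts1, pts2, F) > threshold
```

In practice F itself is estimated robustly (e.g. with RANSAC) from the tentatively static matches before this test is applied.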
Funding: Supported by the National Key R&D Plan (2017YFB1301104), the NSFC (61877040, 61772351), the Sci-Tech Innovation Fundamental Scientific Research Funds (025195305000, 19210010005), and the Academy for Multidisciplinary Studies of Capital Normal University.
Abstract: Error or drift is frequently produced in pose estimation by geometric "feature detection and tracking" monocular visual odometry (VO) when the camera moves faster than 1.5 m/s. Meanwhile, in most VO methods based on deep learning, the weight factors take fixed values, which easily leads to overfitting. A new measurement system for monocular visual odometry, named Deep Learning Visual Odometry (DLVO), is proposed based on neural networks. In this system, a Convolutional Neural Network (CNN) is used to extract features and perform feature matching, and a Recurrent Neural Network (RNN) is used for sequence modeling to estimate the camera's 6-DoF poses. Instead of fixed CNN weight values, Bayesian distributions over the weight factors are introduced to effectively mitigate network overfitting. The 18,726 frames of the KITTI dataset are used to train the network. This design increases the generalization ability of the network model during prediction. Compared with the original Recurrent Convolutional Neural Network (RCNN), the method reduces the test loss by 5.33%, and it is more effective than traditional VO methods at improving the robustness of translation and rotation estimates.
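The "Bayesian distribution of weight factors" idea can be sketched with a layer whose weights are sampled from a Gaussian posterior at every forward pass; averaging several stochastic passes yields a prediction plus an uncertainty estimate instead of committing to one weight setting. This is a minimal NumPy sketch under a mean-field Gaussian assumption, not the DLVO architecture (which couples a CNN and an RNN); the layer sizes and sample count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

class BayesianLinear:
    """A linear layer whose weights are distributions, not point values.

    Every forward pass samples w ~ N(mu, sigma^2); averaging several
    stochastic passes at test time gives a prediction plus an
    uncertainty estimate, which discourages overfitting to any single
    weight configuration.
    """

    def __init__(self, n_in, n_out):
        self.mu = rng.normal(0.0, 0.1, size=(n_in, n_out))
        self.log_sigma = np.full((n_in, n_out), -3.0)

    def forward(self, x):
        sigma = np.exp(self.log_sigma)
        w = self.mu + sigma * rng.normal(size=self.mu.shape)  # one weight sample
        return x @ w

    def predict(self, x, n_samples=32):
        """Monte Carlo prediction: mean and spread over weight samples."""
        outs = np.stack([self.forward(x) for _ in range(n_samples)])
        return outs.mean(axis=0), outs.std(axis=0)
```

In training, `mu` and `log_sigma` would be learned by maximizing a variational objective; here they are simply initialized for illustration.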
Funding: Supported by the NIBIB and the NEI of the National Institutes of Health (R01EB018117).
Abstract: There are about 253 million people with visual impairment worldwide. Many of them use a white cane and/or a guide dog as the mobility tool for daily travel. Despite decades of effort, an electronic navigation aid that can replace the white cane is still a work in progress. In this paper, we propose an RGB-D camera based visual positioning system (VPS) for real-time localization of a robotic navigation aid (RNA) in an architectural floor plan for assistive navigation. The core of the system is the combination of a new 6-DOF depth-enhanced visual-inertial odometry (DVIO) method and a particle filter localization (PFL) method. DVIO estimates the RNA's pose using data from an RGB-D camera and an inertial measurement unit (IMU). It extracts the floor plane from the camera's depth data and tightly couples the floor plane, the visual features (with and without depth data), and the IMU's inertial data in a graph optimization framework to estimate the device's 6-DOF pose. Owing to the use of the floor plane and depth data from the RGB-D camera, DVIO achieves better pose estimation accuracy than the conventional VIO method. To reduce the accumulated pose error of DVIO for navigation in a large indoor space, we developed the PFL method to locate the RNA in the floor plan. PFL leverages geometric information from the architectural CAD drawing of an indoor space to further reduce the error of the DVIO-estimated pose. Based on the VPS, an assistive navigation system is developed for the RNA prototype to assist a visually impaired person in navigating a large indoor space. Experimental results demonstrate that: 1) the DVIO method achieves better pose estimation accuracy than the state-of-the-art VIO method and performs real-time pose estimation (18 Hz pose update rate) on a UP Board computer; 2) PFL reduces the DVIO-accrued pose error by 82.5% on average and allows for accurate wayfinding (endpoint position error ≤ 45 cm) in large indoor spaces.
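A particle filter localization step of the kind described above follows the usual predict-update-resample cycle: propagate particles with the odometry estimate plus motion noise, reweight them by agreement with a geometric measurement against the floor plan, and resample. A minimal 1-D sketch, assuming a single range measurement to a known wall as a stand-in for the CAD-drawing geometry (the noise parameters are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)

def pf_step(particles, weights, odom, meas, wall_x,
            motion_noise=0.05, meas_sigma=0.1):
    """One predict-update-resample cycle of a particle filter.

    particles: (N,) hypothesized x-positions; odom: displacement reported
    by the odometry front end; meas: measured range to a wall at wall_x,
    standing in for the floor-plan geometry the update exploits.
    """
    # Predict: propagate every particle with odometry plus motion noise.
    particles = particles + odom + rng.normal(0.0, motion_noise, particles.shape)
    # Update: reweight by agreement with the range measurement.
    expected = wall_x - particles
    weights = weights * np.exp(-0.5 * ((meas - expected) / meas_sigma) ** 2)
    weights = weights / weights.sum()
    # Resample: draw particles in proportion to their weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```

Starting from a uniform prior, a few cycles concentrate the particle cloud around the position consistent with both odometry and the map.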
Abstract: Estimating the global position of a road vehicle without using GPS is a challenge that many scientists look forward to solving in the near future. Normally, inertial and odometry sensors are used to complement GPS measures in an attempt to provide a means for maintaining vehicle odometry during GPS outage. Nonetheless, recent experiments have demonstrated that computer vision can also be used as a valuable source to provide what can be denoted as visual odometry. For this purpose, vehicle motion can be estimated using a non-linear, photogrammetric approach based on RAndom SAmple Consensus (RANSAC). The results prove that the detection and selection of relevant feature points is a crucial factor in the global performance of the visual odometry algorithm. The key issues for further improvement are discussed in this letter.
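The RANSAC scheme mentioned above can be sketched for the simplest motion model, a 2-D translation between matched feature points contaminated by outliers: hypothesize from a minimal sample, count the consensus set, and refine on the inliers. A minimal sketch (the tolerance and iteration count are illustrative assumptions, and a real VO pipeline estimates a full egomotion model rather than a translation):

```python
import numpy as np

rng = np.random.default_rng(2)

def ransac_translation(src, dst, iters=100, tol=0.1):
    """Estimate a 2-D translation src -> dst while ignoring outlier matches.

    Minimal RANSAC loop: hypothesize from one random correspondence,
    count the consensus set, keep the best-supported hypothesis, then
    refine it by least squares over the inliers.
    """
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        i = rng.integers(len(src))
        t = dst[i] - src[i]                      # minimal-sample hypothesis
        residual = np.linalg.norm(dst - (src + t), axis=1)
        inliers = residual < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refinement: for a translation, least squares is just the mean offset.
    t = (dst[best_inliers] - src[best_inliers]).mean(axis=0)
    return t, best_inliers
```

The same hypothesize-and-verify loop carries over to richer models (essential matrix, homography); only the minimal sample size and the residual change.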
Funding: Supported by the Tianjin Municipal Natural Science Foundation of China (Grant No. 19JCJQJC61600) and the Hebei Provincial Natural Science Foundation of China (Grant Nos. F2020202051, F2020202053).
Abstract: Visual odometry is critical in visual simultaneous localization and mapping for robot navigation. However, the pose estimation performance of most current visual odometry algorithms degrades in scenes with unevenly distributed features because dense features occupy excessive weight. Herein, a new human visual attention mechanism for point-and-line stereo visual odometry, called point-line-weight-mechanism visual odometry (PLWM-VO), is proposed to describe scene features in a global and balanced manner. A weight-adaptive model based on region partition and region growth is generated for the human visual attention mechanism, in which sufficient attention is assigned to position-distinctive objects (sparse features in the environment). Furthermore, the sum of absolute differences algorithm is used to improve the accuracy of initialization for line features. Compared with the state-of-the-art method (ORB-VO), PLWM-VO shows a 36.79% reduction in absolute trajectory error on the KITTI and EuRoC datasets. Although the time consumption of PLWM-VO is higher than that of ORB-VO, online test results indicate that PLWM-VO satisfies real-time demands. The proposed algorithm not only significantly improves the environmental adaptability of visual odometry, but also quantitatively demonstrates the superiority of the human visual attention mechanism.
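The sum of absolute differences (SAD) step can be illustrated with its classic patch-matching use: slide a window along a scanline and keep the candidate with the minimal absolute-difference cost. A minimal stereo-style sketch assuming rectified images; this shows the SAD principle, not PLWM-VO's exact line-initialization routine:

```python
import numpy as np

def best_disparity(left, right, x, y, half=2, max_d=20):
    """Stereo correspondence for pixel (x, y) by minimizing SAD.

    Slides a (2*half+1)^2 window along the scanline of the right image
    and returns the disparity with the smallest sum of absolute
    differences against the reference window in the left image.
    """
    ref = left[y - half:y + half + 1, x - half:x + half + 1]
    costs = []
    for d in range(0, min(max_d, x - half) + 1):
        cand = right[y - half:y + half + 1, x - d - half:x - d + half + 1]
        costs.append(np.abs(ref - cand).sum())
    return int(np.argmin(costs))
```

SAD is attractive here because it needs only additions and absolute values, which keeps the initialization step cheap compared with normalized correlation measures.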
Abstract: Robust and efficient vision systems are essential to support the various autonomous robotic behaviors tied to the capability to interact with the surrounding environment without relying on any a priori knowledge. Within space missions, above all those involving rovers that have to explore planetary surfaces, vision can play a key role in improving autonomous navigation: besides obstacle avoidance and hazard detection along the traverse, vision can provide accurate motion estimation in order to constantly monitor all paths executed by the rover. The present work concerns the development of an effective visual odometry system, focusing as much as possible on issues such as continuous operation, system speed, and reliability.
Funding: the National Natural Science Foundation of China (No. 62063006); the Natural Science Foundation of Guangxi Province (No. 2023GXNSFAA026025); the Innovation Fund of Chinese Universities Industry-University-Research (ID: 2021RYC06005); the Research Project for Young and Middle-aged Teachers in Guangxi Universities (ID: 2020KY15013); the Special Research Project of Hechi University (ID: 2021GCC028); supported by the Project of Outstanding Thousand Young Teachers' Training in Higher Education Institutions of Guangxi, Guangxi Colleges and Universities Key Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region.
Abstract: Visual simultaneous localization and mapping (SLAM) is crucial in robotics and autonomous driving. However, traditional visual SLAM faces challenges in dynamic environments. To address this issue, researchers have proposed semantic SLAM, which combines object detection, semantic segmentation, instance segmentation, and visual SLAM. Despite the growing body of literature on semantic SLAM, there is currently a lack of comprehensive research on the integration of object detection and visual SLAM. Therefore, this study gathers information from multiple databases and reviews the relevant literature using specific keywords. It focuses on visual SLAM based on object detection, covering several aspects. First, it discusses the current research status and challenges in this field, highlighting methods for incorporating semantic information from object detection networks into odometry, loop-closure detection, and map construction. It also compares the characteristics and performance of various object detection algorithms for visual SLAM. Finally, it provides an outlook on future research directions and emerging trends in visual SLAM. Research has shown that visual SLAM based on object detection offers significant improvements over traditional SLAM in dynamic point removal, data association, point cloud segmentation, and other techniques. It can improve the robustness and accuracy of the entire SLAM system and can run in real time. With the continuous optimization of algorithms and improvements in hardware, object-aware visual SLAM has great potential for development.
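One of the techniques surveyed above, dynamic point removal, is commonly implemented by discarding feature points that fall inside detection boxes of movable classes before pose estimation. A minimal sketch (the box format, class names, and keep/drop policy are illustrative assumptions, not a specific system's design):

```python
def remove_dynamic_points(keypoints, boxes, labels, dynamic_labels):
    """Drop keypoints lying inside a detection box of a movable class.

    keypoints: list of (x, y); boxes: list of (x1, y1, x2, y2) aligned
    with labels; dynamic_labels: class names treated as dynamic.
    """
    def inside(pt, box):
        x, y = pt
        x1, y1, x2, y2 = box
        return x1 <= x <= x2 and y1 <= y <= y2

    kept = []
    for pt in keypoints:
        dynamic = any(inside(pt, box)
                      for box, label in zip(boxes, labels)
                      if label in dynamic_labels)
        if not dynamic:
            kept.append(pt)
    return kept
```

Because a rectangular box also covers static background around the object, more refined systems combine boxes with geometric checks or segmentation masks rather than discarding every point inside the box.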
Abstract: To address the degeneracy that arises when the RGB-D camera pose is computed from planar features alone, a plane-line-based RGB-D visual odometry (PLVO) is proposed. First, a multi-feature association method based on a plane-line hybrid association graph (PLHAG) is proposed; it fully exploits the geometric relationships between planes and between planes and lines to associate the two types of geometric features in a unified manner. Then, an RGB-D camera pose estimation method is proposed that adaptively fuses planes and lines, with planes playing the primary role and lines a complementary one. Specifically, since planar features are usually more accurate and stable than line features, adaptive weighting ensures the dominant role of planar features in the pose computation, while line features supplement the degrees of freedom (DoF) that the planes cannot constrain, yielding a full 6-DoF pose estimate. The two types of features are thereby fused, solving the degeneracy problem of plane-only pose estimation. Finally, quantitative experiments on public datasets and robot experiments in a real indoor environment validate the effectiveness of the proposed method.
Abstract: Detection and recognition of a stairway as upstairs, downstairs, or negative (e.g., ladder, level ground) are fundamental to assisting the visually impaired to travel independently in unfamiliar environments. Previous studies have focused on using massive amounts of RGB-D scene data to train traditional machine learning (ML) based models to detect and recognize stationary stairways and escalator stairways separately. Nevertheless, none of them consider jointly training on these two similar but different datasets to achieve better performance. This paper applies an adversarial learning algorithm to the indicated unsupervised domain adaptation scenario to transfer knowledge learned from a labeled RGB-D escalator stairway dataset to an unlabeled RGB-D stationary stairway dataset. With the developed method, a feedforward convolutional neural network (CNN) based feature extractor with five convolution layers achieves 100% classification accuracy on the labeled escalator stairway data distribution and 80.6% classification accuracy on the unlabeled stationary data distribution. The success of the developed approach is demonstrated for classifying stairways in these two domains with a limited amount of data. To further demonstrate the effectiveness of the proposed method, the same CNN model is evaluated without domain adaptation and the results are compared with those of the presented architecture.
Abstract: With the continuous development of mobile robot technology, odometry has become a key technology for environmental perception, and its maturity is important for improving the autonomy and intelligence of robots. First, this survey reviews recent progress in LiDAR SLAM and visual SLAM within simultaneous localization and mapping (SLAM), describes the classical SLAM framework and its mathematical formulation, and briefly introduces the camera models of three common camera types together with the mathematical formulation of their visual odometry. Second, research progress on traditional visual odometry and deep learning based odometry is reviewed systematically, and the strengths and weaknesses of odometry algorithms from the past decade are compared, along with the characteristics of seven commonly used datasets. Finally, open problems in odometry are summarized in terms of accuracy, robustness, datasets, and multimodality, and future trends of visual odometry are outlined from the perspectives of real-time performance and robustness: the development of more intelligent and miniaturized novel sensors; fusion with unsupervised learning; improved semantic representation; and the development of collaborative techniques for robot swarms.