Funding: supported by the National Natural Science Foundation of China (No. 62063006), the Natural Science Foundation of Guangxi Province (No. 2023GXNS-FAA026025), the Innovation Fund of Chinese Universities Industry-University-Research (ID: 2021RYC06005), the Research Project for Young and Middle-aged Teachers in Guangxi Universities (ID: 2020KY15013), the Special Research Project of Hechi University (ID: 2021GCC028), the Project of Outstanding Thousand Young Teachers' Training in Higher Education Institutions of Guangxi, and the Guangxi Colleges and Universities Key Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region.
Abstract: Visual simultaneous localization and mapping (SLAM) is crucial in robotics and autonomous driving. However, traditional visual SLAM faces challenges in dynamic environments. To address this issue, researchers have proposed semantic SLAM, which combines object detection, semantic segmentation, instance segmentation, and visual SLAM. Despite the growing body of literature on semantic SLAM, there is currently a lack of comprehensive research on the integration of object detection and visual SLAM. This study therefore gathers information from multiple databases and reviews the relevant literature using specific keywords, focusing on visual SLAM based on object detection. First, it discusses the current research status and challenges in this field, highlighting methods for incorporating semantic information from object detection networks into odometry, loop closure detection, and map construction. It also compares the characteristics and performance of various object-detection-based visual SLAM algorithms. Finally, it offers an outlook on future research directions and emerging trends in visual SLAM. Research has shown that, compared with traditional SLAM, visual SLAM based on object detection brings significant improvements in dynamic point removal, data association, point cloud segmentation, and other techniques. It can improve the robustness and accuracy of the entire SLAM system and can run in real time. With continuing algorithmic optimization and improving hardware, object-based visual SLAM has great potential for development.
Funding: the National Natural Science Foundation of China (No. 61671470).
Abstract: A great number of visual simultaneous localization and mapping (VSLAM) systems need to assume static features in the environment. However, moving objects can severely impair the performance of a VSLAM system that relies on the static-world assumption. To cope with this challenge, a real-time, robust VSLAM system for dynamic environments, based on ORB-SLAM2, was proposed. To reduce the influence of dynamic content, we incorporate a deep-learning-based object detection method into the visual odometry and add a dynamic object probability model to raise the efficiency of the object detection network and enhance the real-time performance of the system. Experiments on the TUM and KITTI benchmark datasets, as well as in a real-world environment, show that our method can significantly reduce tracking error and drift and enhance the robustness, accuracy, and stability of the VSLAM system in dynamic scenes.
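A minimal sketch of the detection-assisted masking step described above: keypoints falling inside the bounding boxes of a-priori dynamic classes are discarded before pose estimation. The class list and the plain-tuple interfaces are illustrative assumptions, not the authors' code.

```python
# Sketch: drop ORB keypoints that fall inside detection boxes of
# a-priori dynamic classes before they enter the visual odometry.
DYNAMIC_CLASSES = {"person", "car", "bicycle"}  # assumed class list

def filter_dynamic_keypoints(keypoints, detections):
    """keypoints: list of (x, y); detections: list of (label, x1, y1, x2, y2)."""
    dynamic_boxes = [(x1, y1, x2, y2)
                     for label, x1, y1, x2, y2 in detections
                     if label in DYNAMIC_CLASSES]

    def is_static(pt):
        x, y = pt
        return not any(x1 <= x <= x2 and y1 <= y <= y2
                       for x1, y1, x2, y2 in dynamic_boxes)

    return [pt for pt in keypoints if is_static(pt)]
```

In a full system the surviving static keypoints would then be passed to the tracking thread; the dynamic object probability model mentioned in the abstract would refine this binary keep/drop decision.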
Funding: Project Grant JZX7Y2-0190258055601 and the National Natural Science Foundation of China (61402283).
Abstract: This article presents a brief survey of visual simultaneous localization and mapping (SLAM) systems applied to multiple independently moving agents, such as a team of ground or aerial vehicles or a group of users holding augmented- or virtual-reality devices. Such a visual SLAM system, known as collaborative visual SLAM, differs from a typical visual SLAM system deployed on a single agent in that information is exchanged or shared among agents to achieve better robustness, efficiency, and accuracy. We review the representative works on this topic from the past ten years and describe the key components involved in designing such a system, including collaborative pose estimation and mapping, as well as the emerging topic of decentralized architectures. We believe this brief survey will be helpful to those working on this topic or developing multi-agent applications, particularly micro-aerial-vehicle swarms or collaborative augmented/virtual reality.
Abstract: Feature selection is a longstanding issue in the visual SLAM (simultaneous localization and mapping) literature. Considering that location estimation can be improved by tracking features with longer visible time, a new feature selection method based on motion estimation is proposed. First, a k-step iteration algorithm is presented for visible-time estimation using an affine motion model; then a delayed feature detection method is introduced for efficiently detecting the features with the maximum visible time. To validate the proposed method, both simulation and real-data experiments are carried out. Results show that the proposed method improves both estimation performance and computational performance compared with the existing random feature selection method.
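The visible-time idea above can be sketched as follows: iterate an assumed affine motion model to predict how many future frames each feature stays inside the image, then keep the features with the longest predicted visible time. The model parameters and the fixed iteration cap are illustrative assumptions, not the paper's k-step algorithm.

```python
def visible_time(pt, A, b, width, height, k_max=50):
    """Predict how many future frames a feature stays in view by iterating
    an assumed affine motion model (x, y) -> A @ (x, y) + b."""
    x, y = pt
    for k in range(k_max):
        if not (0 <= x < width and 0 <= y < height):
            return k  # left the image after k steps
        x, y = (A[0][0] * x + A[0][1] * y + b[0],
                A[1][0] * x + A[1][1] * y + b[1])
    return k_max

def select_features(points, A, b, width, height, n):
    """Keep the n features with the largest predicted visible time."""
    return sorted(points,
                  key=lambda p: visible_time(p, A, b, width, height),
                  reverse=True)[:n]
```

Selecting by predicted visible time rather than at random is what lets longer-lived features dominate the state estimate.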
Abstract: Most current simultaneous localization and mapping (SLAM) research is based on the assumption of a static scene, yet dynamic objects are unavoidable in real life. Incorporating deep learning into a visual SLAM system makes it possible to remove dynamic objects from the scene, effectively improving the robustness of visual SLAM in dynamic environments. This article first introduces a classification of deep-learning-based visual SLAM for dynamic environments, then describes in detail visual SLAM based on object detection, on semantic segmentation, and on instance segmentation, and analyzes and compares them. Finally, in light of recent trends in visual SLAM, it analyzes the main open problems of deep-learning-based visual SLAM in dynamic environments and summarizes likely future directions.
Abstract: Objective: When a mobile agent performs the complex task of simultaneous localization and mapping (SLAM), interference from dynamic objects weakens the association between feature points and degrades localization accuracy. To address this, a visual SLAM algorithm for indoor dynamic scenes based on YOLOv5 and geometric constraints is proposed. Methods: First, starting from YOLOv5s, the original CSPDarknet backbone is replaced with the lightweight MobileNetV3 network, reducing parameters and speeding up inference; combined with the ORB-SLAM2 system, semantic information is obtained while ORB feature points are extracted, and a-priori dynamic feature points are removed. Then, optical flow and epipolar-geometry constraints are combined to further remove any remaining dynamic feature points. Finally, only static feature points are used to estimate the camera pose. Results: Experiments on the TUM dataset show that, compared with ORB-SLAM2, both ATE and RPE on highly dynamic sequences are reduced by more than 90%; compared with systems of the same type such as DS-SLAM and Dyna-SLAM, the tracking thread processes a frame in only 28.26 ms on average while maintaining localization accuracy and robustness. Conclusion: The algorithm effectively reduces the interference of dynamic objects in real-time SLAM and opens the way to more intelligent, automated packaging processes.
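A hedged sketch of the epipolar-constraint check mentioned above: a matched feature whose position in the second image lies far from the epipolar line induced by the fundamental matrix violates the static-scene assumption and can be flagged as dynamic. The pixel threshold is an illustrative assumption.

```python
import numpy as np

def epipolar_distance(F, p1, p2):
    """Distance (pixels) of p2 from the epipolar line F @ p1h in image 2."""
    p1h = np.array([p1[0], p1[1], 1.0])
    p2h = np.array([p2[0], p2[1], 1.0])
    line = F @ p1h                       # epipolar line: a*x + b*y + c = 0
    a, b, _ = line
    return abs(p2h @ line) / np.hypot(a, b)

def is_dynamic(F, p1, p2, thresh=1.0):
    # A static point should lie (nearly) on its epipolar line; a large
    # residual suggests the point moved between the two frames.
    return epipolar_distance(F, p1, p2) > thresh
```

In practice F would come from robustly estimated camera motion (e.g. RANSAC over points already judged static), so that dynamic outliers do not contaminate the constraint itself.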
Abstract: To address the problem that traditional SLAM algorithms are affected by dynamic feature points in dynamic environments, which degrades localization accuracy, a visual-inertial SLAM algorithm that fuses semantic information, SF-VINS (visual inertial navigation system based on semantics fusion), is proposed. First, based on the VINS-Mono framework, the semantic segmentation network PP-LiteSeg is integrated into the front end, and dynamic feature points are removed according to the segmentation results. Second, in the back end, pixel-wise semantic probabilities are used to construct a semantic-probability error constraint term, adaptive feature-point weights are applied, and a new bundle adjustment (BA) cost function and camera-extrinsics optimization strategy are proposed, improving the accuracy of state estimation. Finally, to verify the effectiveness of the algorithm, experiments are conducted on the VIODE and NTU VIRAL datasets. The results show that, compared with current state-of-the-art visual-inertial SLAM algorithms, the proposed algorithm has advantages in localization accuracy and robustness in both dynamic and static scenes.
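The adaptive-weight idea in the BA cost can be sketched as follows: each reprojection residual is scaled by a weight derived from the feature's semantic static probability, so likely-dynamic observations contribute less to the cost. The square-root weighting form is a hypothetical choice; the paper's exact function is not given in the abstract.

```python
import numpy as np

def ba_residual(observed_px, projected_px, static_prob):
    """One weighted reprojection residual; static_prob in [0, 1] comes from
    the semantic segmentation (hypothetical weighting form)."""
    w = np.sqrt(static_prob)  # down-weight likely-dynamic points
    return w * (np.asarray(observed_px, float) - np.asarray(projected_px, float))

def ba_cost(terms):
    """Sum of squared weighted residuals over all (observed, projected, prob)."""
    return sum(float(r @ r)
               for r in (ba_residual(o, p, s) for o, p, s in terms))
```

With `static_prob = 1` this reduces to the standard reprojection cost; as it approaches 0, the observation is effectively excluded from the optimization.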
Abstract: Because visual SLAM (simultaneous localization and mapping) research has mostly been conducted under static-environment assumptions, applying such algorithms in dynamic environments causes large localization drift and greatly reduces system stability. To address this problem, this paper combines a deep learning method with an existing visual SLAM algorithm, removing feature points on potentially dynamic targets to improve the system's robustness in dynamic environments. The visual SLAM system used is ORB-SLAM3, and the deep learning method is the YOLOv5 instance segmentation algorithm; feature points inside the mask contour of a detected target are removed, supplemented by a multi-view-geometry method. First, using parallel communication, frames acquired by the SLAM system are passed to the YOLOv5 system to segment potentially dynamic targets, and the segmentation results are returned to the SLAM system for tracking and mapping. The bag-of-words loading model is also improved to speed up loading, and a dense map of the dynamic environment is finally constructed with reliable real-time performance. Evaluation on the TUM dataset shows that the method improves on both the original SLAM framework and classic dynamic-environment systems: without lowering the average frame rate, its accuracy improves on ORB-SLAM3's RMSE by nearly 89% on average. The experimental results show an effective improvement to visual SLAM algorithms in dynamic environments, greatly enhancing the system's robustness and stability.
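A minimal sketch of the mask-based feature culling described above: keypoints that land on pixels covered by any dynamic instance mask are dropped before tracking. The boolean-mask interface is an assumption for illustration; a real pipeline would build the mask from the instance-segmentation output.

```python
import numpy as np

def filter_keypoints_by_mask(keypoints, dynamic_mask):
    """keypoints: list of (x, y) pixel coordinates.
    dynamic_mask: H x W boolean array, True where any dynamic instance
    mask covers the pixel. Keeps only keypoints on static pixels."""
    h, w = dynamic_mask.shape
    kept = []
    for x, y in keypoints:
        xi, yi = int(round(x)), int(round(y))
        if 0 <= xi < w and 0 <= yi < h and not dynamic_mask[yi, xi]:
            kept.append((x, y))
    return kept
```

Compared with bounding-box filtering, the per-pixel mask avoids discarding the static background inside a box, which matters when dynamic objects occupy much of the frame.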