Abstract: Real-time indoor camera localization is a significant problem in indoor robot navigation and surveillance systems. The scene can change during an image sequence, and such changes play a vital role in the localization performance of robotic applications in terms of accuracy and speed. This research proposes a real-time indoor camera localization system based on a recurrent neural network that detects scene changes during the image sequence. The system is trained on an annotated image dataset and predicts the camera pose in real time. It improves the localization performance of indoor cameras mainly by predicting the camera pose more accurately; it also recognizes scene changes during the sequence and evaluates their effects. The system achieved high accuracy and real-time performance. Scene change detection was performed using visual rhythm together with the proposed recurrent deep architecture, which carried out camera pose prediction and evaluated the impact of scene changes. Overall, this study proposes a novel real-time localization system for indoor cameras that detects scene changes and shows how they affect localization performance.
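The recurrent pose-prediction idea above can be illustrated with a minimal sketch: a single GRU layer consumes one image-feature vector per frame and regresses a 7-D pose (translation plus unit quaternion). This is an assumption-laden toy, not the paper's architecture; the weights here are random, whereas the actual system is trained on the annotated dataset.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUPoseRegressor:
    """Toy sketch: one GRU layer mapping a sequence of per-frame image
    feature vectors to a 7-D camera pose (x, y, z + unit quaternion).
    Weights are random for illustration; a real system trains them."""

    def __init__(self, feat_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        self.Wz = rng.normal(0, s, (hidden_dim, feat_dim + hidden_dim))
        self.Wr = rng.normal(0, s, (hidden_dim, feat_dim + hidden_dim))
        self.Wh = rng.normal(0, s, (hidden_dim, feat_dim + hidden_dim))
        self.Wo = rng.normal(0, s, (7, hidden_dim))  # pose output head
        self.hidden_dim = hidden_dim

    def forward(self, feats):
        h = np.zeros(self.hidden_dim)
        for x in feats:                       # one feature vector per frame
            xh = np.concatenate([x, h])
            z = sigmoid(self.Wz @ xh)         # update gate
            r = sigmoid(self.Wr @ xh)         # reset gate
            h_tilde = np.tanh(self.Wh @ np.concatenate([x, r * h]))
            h = (1 - z) * h + z * h_tilde
        pose = self.Wo @ h
        t, q = pose[:3], pose[3:]
        q = q / np.linalg.norm(q)             # normalize the quaternion
        return t, q
```

Because the hidden state carries information across frames, the same mechanism can in principle react to a scene change mid-sequence, which is the property the paper exploits.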
Funding: Supported by the NIBIB and the NEI of the National Institutes of Health (R01EB018117).
Abstract: There are about 253 million people with visual impairment worldwide. Many of them use a white cane and/or a guide dog as the mobility tool for daily travel. Despite decades of effort, an electronic navigation aid that can replace the white cane is still a work in progress. In this paper, we propose an RGB-D camera based visual positioning system (VPS) for real-time localization of a robotic navigation aid (RNA) in an architectural floor plan for assistive navigation. The core of the system is the combination of a new 6-DOF depth-enhanced visual-inertial odometry (DVIO) method and a particle filter localization (PFL) method. DVIO estimates the RNA's pose using data from an RGB-D camera and an inertial measurement unit (IMU). It extracts the floor plane from the camera's depth data and tightly couples the floor plane, the visual features (with and without depth data), and the IMU's inertial data in a graph optimization framework to estimate the device's 6-DOF pose. Due to the use of the floor plane and depth data from the RGB-D camera, DVIO achieves better pose estimation accuracy than the conventional VIO method. To reduce the accumulated pose error of DVIO for navigation in a large indoor space, we developed the PFL method to locate the RNA in the floor plan. PFL leverages geometric information from the architectural CAD drawing of an indoor space to further reduce the error of the DVIO-estimated pose. Based on the VPS, an assistive navigation system is developed for the RNA prototype to assist a visually impaired person in navigating a large indoor space. Experimental results demonstrate that: 1) the DVIO method achieves better pose estimation accuracy than the state-of-the-art VIO method and performs real-time pose estimation (18 Hz pose update rate) on a UP Board computer; 2) PFL reduces the DVIO-accrued pose error by 82.5% on average and allows for accurate wayfinding (endpoint position error ≤ 45 cm) in large indoor spaces.
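The particle-filter correction step described above can be sketched on a toy map. The sketch below is an illustration, not the paper's PFL: it assumes a straight corridor whose left wall lies along y = 0, so a particle's predicted wall distance is simply its y coordinate; odometry increments stand in for DVIO pose deltas, and the function names and noise levels are invented for the example.

```python
import numpy as np

def pf_localize(odometry, wall_dists, n=500, seed=1):
    """Toy particle filter: fuse noisy odometry increments with lateral
    wall-distance measurements against a known floor-plan wall (y = 0).
    State per particle is (x, y); returns the particle mean as the pose."""
    rng = np.random.default_rng(seed)
    # initialize particles: x near the start, y spread across the corridor
    p = np.column_stack([rng.normal(0.0, 0.2, n),
                         rng.uniform(0.0, 3.0, n)])
    for (dx, dy), d_meas in zip(odometry, wall_dists):
        # motion update: apply the odometry increment plus drift noise
        p += np.array([dx, dy]) + rng.normal(0.0, 0.05, (n, 2))
        # measurement update: weight by agreement with the wall distance
        w = np.exp(-0.5 * ((p[:, 1] - d_meas) / 0.1) ** 2)
        w /= w.sum()
        p = p[rng.choice(n, size=n, p=w)]    # resample
    return p.mean(axis=0)
```

The wall measurement bounds the drift in the constrained direction each step, which is the same mechanism PFL uses with the full CAD geometry to cap DVIO's accumulated error.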
Abstract: To improve the pose estimation accuracy of the ORB-SLAM2 (oriented FAST and rotated BRIEF simultaneous localization and mapping) system and to address its limitation of producing only sparse maps, a pose estimation strategy combining the iterative closest point (ICP) algorithm with the Manhattan-world assumption is proposed, and a dense mapping thread is added to the system. First, point, line, and plane features are extracted using the ORB (oriented FAST and rotated BRIEF) feature method, the line segment detector (LSD) algorithm, and agglomerative hierarchical clustering (AHC); point and line features are matched against the previous frame, while plane features are matched in the global map. A surfel-based dense mapping strategy then divides the image into planar and non-planar regions. For non-planar regions, the pose is computed with the ICP algorithm; for planar regions, a Manhattan world is identified from the orthogonality between planes and different estimation strategies are applied: in Manhattan-world scenes, pose decoupling enables drift-free rotation estimation based on Manhattan-frame observations, while the translation in such scenes, and the full pose in non-Manhattan-world scenes, are estimated and optimized from the tracked point, line, and plane features. Finally, dense mapping is performed from the keyframes and their corresponding poses. The proposed mapping method is validated on the Technische Universität München (TUM) dataset: compared with ORB-SLAM2, the root-mean-square error is reduced by 0.24 cm on average and the average localization accuracy is improved by 7.17%, verifying the feasibility and effectiveness of the proposed dense mapping method.
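The drift-free rotation step above can be illustrated with a small sketch, not the paper's exact estimator: given three plane normals observed in the camera frame and assumed (row by row) to correspond to the world x, y, z axes under the Manhattan-world assumption, the rotation is recovered as an orthogonal Procrustes problem.

```python
import numpy as np

def manhattan_rotation(normals):
    """Illustrative Manhattan-frame rotation: find the proper rotation R
    with R @ e_i ≈ normals[i] (world axis i mapped to its observed plane
    normal in the camera frame), solved via SVD (Kabsch/Procrustes)."""
    axes = np.eye(3)                          # canonical Manhattan axes
    H = axes.T @ normals                      # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # enforce det(R) = +1
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
```

Because this rotation is anchored to the scene's fixed orthogonal structure rather than to the previous frame, it does not accumulate drift, which is what makes the decoupled rotation/translation estimation in Manhattan-world scenes attractive.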
Abstract: To address the degradation of SLAM (simultaneous localization and mapping) accuracy caused by inaccurate LiDAR-odometry pose computation when mobile robots map large outdoor scenes, a semantic bag-of-words optimization algorithm based on multi-sensor information fusion, MSW-SLAM (multi-sensor information fusion SLAM based on semantic word bags), is proposed. Raw LiDAR observations are introduced into a visual-inertial system, and a sliding window is used to perform joint nonlinear optimization over multi-source data: IMU (inertial measurement unit) measurements, visual features, and LiDAR point-cloud features. Finally, the algorithm exploits the complementary semantic bags of words of vision and LiDAR for loop-closure optimization, further improving the global localization and mapping accuracy of the multi-sensor fusion SLAM system. Experimental results show that, compared with conventional tightly coupled stereo visual-inertial odometry and LiDAR odometry, MSW-SLAM can effectively detect loop closures along the trajectory and achieve high-accuracy global pose-graph optimization; the point-cloud map after loop-closure detection exhibits good resolution and global consistency.
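The bag-of-words loop-closure idea above can be sketched in a few lines. This is a generic illustration rather than MSW-SLAM's detector: each keyframe is reduced to a normalized histogram over word IDs (visual or semantic), and a loop closure is proposed when two temporally distant keyframes have sufficiently similar histograms; the vocabulary, thresholds, and frame-gap parameter are invented for the example.

```python
import numpy as np

def bow_histogram(word_ids, vocab_size):
    """L2-normalized bag-of-words descriptor for one keyframe."""
    h = np.bincount(word_ids, minlength=vocab_size).astype(float)
    n = np.linalg.norm(h)
    return h / n if n > 0 else h

def detect_loop(histograms, min_gap=3, threshold=0.8):
    """Propose (i, j) loop-closure pairs whose descriptors have cosine
    similarity >= threshold, skipping temporally adjacent keyframes."""
    loops = []
    for j in range(len(histograms)):
        for i in range(j - min_gap):          # only frames well in the past
            if histograms[i] @ histograms[j] >= threshold:
                loops.append((i, j))
    return loops
```

In a fused system, vision and LiDAR each produce such descriptors; requiring agreement between the two complementary vocabularies suppresses false loop candidates before pose-graph optimization.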
Abstract: To improve the low localization accuracy of simultaneous localization and mapping (SLAM) algorithms in dynamic scenes, a dynamic visual SLAM algorithm based on a lightweight YOLOv8n (You Only Look Once version 8 nano) is proposed. The YOLOv8n model is made lightweight using a weighted bidirectional feature pyramid network (BiFPN), reducing its parameter count. The lightweight YOLOv8n model is introduced into the SLAM algorithm and combined with a sparse optical-flow method to form an object detection thread that removes dynamic feature points; the remaining filtered feature points are used for feature matching and pose estimation. Experimental results show that the lightweight YOLOv8n model's parameter count drops by 36.7% and its weight file shrinks by 33.3%, achieving the intended lightweighting; compared with the ORB-SLAM3 algorithm, the proposed algorithm improves localization accuracy in dynamic scenes by 83.38%, effectively improving the accuracy of SLAM in dynamic scenes.
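The dynamic-point removal step above reduces, at its core, to discarding feature points that fall inside detected dynamic-object boxes. A minimal sketch, assuming axis-aligned boxes in the form the paper's detector would supply (the function and box format are illustrative, not the authors' code):

```python
def filter_dynamic_points(points, dynamic_boxes):
    """Keep only feature points lying outside every dynamic-object
    bounding box. Points are (x, y) pixel tuples; boxes are
    (x1, y1, x2, y2) as produced by an object detector such as the
    lightweight YOLOv8n described above."""
    def inside(pt, box):
        x, y = pt
        x1, y1, x2, y2 = box
        return x1 <= x <= x2 and y1 <= y <= y2
    return [p for p in points
            if not any(inside(p, b) for b in dynamic_boxes)]
```

Only the surviving (static) points then enter feature matching and pose estimation, which is why the box filter directly improves pose accuracy in dynamic scenes; in the paper, sparse optical flow additionally catches moving points the detector misses.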
Abstract: In visual simultaneous localization and mapping, ORB (Oriented FAST and Rotated BRIEF) features have attracted wide attention for their efficiency and stability. To address the low measurement accuracy of image points and the pronounced feature clustering that arise during ORB feature extraction, a balanced sub-pixel ORB feature extraction method suitable for high-accuracy SLAM is proposed. The principle of precise feature localization is analyzed, the error equations are reasonably simplified, and a weight-function computation based on template-window distance is adopted, greatly reducing the computational burden. A quadtree-based feature balancing scheme is designed that iteratively partitions the image-plane space containing features a bounded number of times and then selects the feature with the best response. Experiments show that the extra computational cost of the proposed feature extraction is under 2.5 ms; on the TUM and KITTI datasets, the measurement accuracy of ORB features reaches 0.84 and 0.62 pixels respectively, i.e., sub-pixel level. This lowers the initial error values, improves the efficiency of bundle adjustment, and, while preserving the overall feature distribution, significantly mitigates feature clustering, benefiting robust and accurate solutions in subsequent stages.
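The quadtree balancing scheme above can be sketched as follows. This is a minimal illustration of the idea, not the authors' implementation: a cell is split into four while it holds more than one keypoint (up to a depth bound standing in for the "bounded number of iterations"), and each leaf keeps only its strongest-response keypoint.

```python
def quadtree_select(keypoints, x0, y0, x1, y1, max_depth=4):
    """Quadtree feature balancing sketch: recursively split the image
    cell [x0, x1) x [y0, y1) while it holds more than one keypoint,
    then keep the best-response keypoint per leaf. Keypoints are
    (x, y, response) tuples; response stands in for the FAST score."""
    pts = [p for p in keypoints if x0 <= p[0] < x1 and y0 <= p[1] < y1]
    if not pts:
        return []
    if len(pts) == 1 or max_depth == 0:
        return [max(pts, key=lambda p: p[2])]   # strongest response wins
    mx, my = (x0 + x1) / 2.0, (y0 + y1) / 2.0   # split the cell in four
    out = []
    for qx0, qy0, qx1, qy1 in [(x0, y0, mx, my), (mx, y0, x1, my),
                               (x0, my, mx, y1), (mx, my, x1, y1)]:
        out += quadtree_select(pts, qx0, qy0, qx1, qy1, max_depth - 1)
    return out
```

Dense clusters are thinned because deeply nested cells each surrender all but one point, while isolated points in sparse regions survive untouched, yielding the more even spatial distribution that benefits bundle adjustment.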