
Visual Localization with a Fully Convolutional Encoder-Decoder Network
Abstract: To address the limitations of hand-crafted features in current visual localization methods, this paper proposes a visual localization method based on a fully convolutional encoder-decoder network. Unlike previous approaches to scene construction, the method maps 3D scene coordinates into the BGR (blue-green-red) color cube, directly establishing a correspondence between images and scene structure, and learns this correspondence with the fully convolutional encoder-decoder network. Given an image, the network predicts, for each pixel, the coordinates of the corresponding 3D point in the current scene's world coordinate system. The camera pose is then solved and refined with the RANSAC (random sample consensus) and PnP (perspective-n-point) algorithms to obtain the final pose. Experimental results on the 7-Scenes dataset show that the method achieves centimeter-level localization accuracy and, compared with other deep-learning-based methods, has a smaller model size while maintaining accuracy.
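The abstract describes two key steps: mapping 3D scene coordinates into the BGR cube so the network can regress them as image channels, and recovering the camera pose from the per-pixel predictions with RANSAC and PnP. The paper's code is not included here; the following is a minimal sketch of those two steps. The scene bounding box (`SCENE_MIN`/`SCENE_MAX`), the helper names, and the use of OpenCV's `cv2.solvePnPRansac` are assumptions for illustration, not the authors' actual implementation.

```python
import numpy as np
import cv2

# Hypothetical axis-aligned bounding box of the scene (metres); the paper's
# actual normalization bounds are not given in the abstract.
SCENE_MIN = np.array([-2.0, -2.0, -2.0])
SCENE_MAX = np.array([2.0, 2.0, 2.0])

def coords_to_bgr(xyz):
    """Map per-pixel 3D scene coordinates (H, W, 3) into the BGR cube [0, 255]."""
    t = (xyz - SCENE_MIN) / (SCENE_MAX - SCENE_MIN)  # normalize to [0, 1]
    return np.clip(t * 255.0, 0, 255).astype(np.uint8)

def bgr_to_coords(bgr):
    """Invert the mapping: a BGR image back to 3D scene coordinates in metres."""
    t = bgr.astype(np.float32) / 255.0
    return t * (SCENE_MAX - SCENE_MIN) + SCENE_MIN

def solve_pose(pred_coords, K):
    """Recover the camera pose from predicted per-pixel scene coordinates.

    pred_coords: (H, W, 3) network output decoded back to metres.
    K: 3x3 camera intrinsic matrix.
    """
    h, w = pred_coords.shape[:2]
    # Pair every 2D pixel location with its predicted 3D scene point.
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    img_pts = np.stack([us, vs], axis=-1).reshape(-1, 2).astype(np.float32)
    obj_pts = pred_coords.reshape(-1, 3).astype(np.float32)
    # RANSAC rejects outlier predictions; PnP solves for the pose from inliers.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        obj_pts, img_pts, K, distCoeffs=None,
        reprojectionError=8.0, flags=cv2.SOLVEPNP_ITERATIVE)
    return ok, rvec, tvec, inliers
```

The abstract also mentions an optimization step after solving the pose; one plausible stand-in would be a Levenberg-Marquardt refinement over the RANSAC inliers, e.g. `cv2.solvePnPRefineLM(obj_pts[inliers], img_pts[inliers], K, None, rvec, tvec)`, though the paper's exact refinement is not specified in the abstract.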
Authors: LI Chenmin, YAO Jian, GONG Ye, LIU Xinyi (School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China)
Source: Journal of Geomatics (《测绘地理信息》, CSCD), 2022, No. 6, pp. 46-49.
Funding: National Natural Science Foundation of China (41571436).
Keywords: visual localization; scene construction; pose estimation; deep learning