Cross-view Geo-visual Localization

Abstract: With the explosive growth of smart terminal devices and the rapid rise of the mobile Internet, the demand for location-based services is becoming increasingly prominent in many scenarios, such as indoor environments and sparsely populated remote mountainous areas. However, because GPS signals in these areas are blocked or base-station coverage is hard to provide, GPS positioning cannot work properly. Image geo-localization refers to determining where an image was taken from visual information alone. Without any prior knowledge, predicting the geographic location of a photo is a very difficult task, because images taken under different conditions (for example, different weather, objects, or camera settings) vary enormously. This paper explores a cross-view geo-visual localization method. First, an inverse polar coordinate transformation converts the street-view image into an aerial-view image to reduce the domain gap between the two views. Then deep learning is used to encode images from the different views into more robust global descriptor vectors. Finally, image matching and localization of the street-view query image are performed on this basis. For image feature extraction, the VGG16 model is adopted, which stacks small convolution kernels in a deeper network to enlarge the receptive field while saving parameters. For feature encoding, a multi-scale attention mechanism is integrated into the NetVLAD model, which encodes the features extracted by the backbone into a more robust global feature descriptor vector. Experimental results show that the method matches and localizes street-view images with high accuracy, and its matching accuracy is higher than that of existing methods. Moreover, it does not require high-definition street-view images captured with professional equipment: street-view images taken with an ordinary smartphone also achieve good matching and localization accuracy.
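The inverse polar coordinate transformation mentioned in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the paper's exact mapping, so everything here is an assumption made for illustration: the function name `inverse_polar_transform`, the convention that panorama columns sweep azimuth clockwise from north, the linear mapping from panorama row to distance from the camera, and nearest-neighbour sampling.

```python
import numpy as np


def inverse_polar_transform(pano, out_size):
    """Resample a ground panorama (H x W or H x W x C) onto a square,
    aerial-like grid of shape (out_size, out_size[, C]).

    Assumed conventions (illustrative, not taken from the paper):
      * panorama column 0 faces north and azimuth grows clockwise;
      * the bottom panorama row maps to the aerial centre (ground next to
        the camera) and the top row maps to the largest inscribed circle;
      * out_size >= 2.
    """
    h, w = pano.shape[:2]
    half = (out_size - 1) / 2.0

    # Pixel coordinates of the output grid, relative to its centre.
    ys, xs = np.mgrid[0:out_size, 0:out_size].astype(np.float64)
    dx = xs - half   # positive toward east
    dy = half - ys   # positive toward north (image rows grow downward)

    # Polar coordinates of every aerial pixel.
    radius = np.hypot(dx, dy) / half                # 0 at centre, 1 at edge
    angle = np.mod(np.arctan2(dx, dy), 2 * np.pi)   # 0 = north, clockwise

    # Nearest-neighbour lookup into the panorama.
    col = np.floor(angle / (2 * np.pi) * w).astype(int) % w
    row = np.clip(np.round((1.0 - radius) * (h - 1)).astype(int), 0, h - 1)

    # Corners outside the inscribed circle have no panorama source pixel
    # and are left as zeros.
    valid = radius <= 1.0
    out = np.zeros((out_size, out_size) + pano.shape[2:], dtype=pano.dtype)
    out[valid] = pano[row[valid], col[valid]]
    return out


if __name__ == "__main__":
    # Tiny synthetic panorama: 4 rows, 8 azimuth bins.
    demo = np.arange(1, 33).reshape(4, 8)
    print(inverse_polar_transform(demo, 9))
```

Resampling the street-view panorama into a roughly top-down layout puts both views in a similar geometric arrangement before feature extraction, which is the domain-gap reduction the abstract describes. A practical version would likely use bilinear interpolation and a calibrated row-to-distance mapping instead of the linear one assumed here.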

Authors: 刘旭东 (LIU Xudong); 余平 (YU Ping) (Wudong Colliery, CHN ENERGY, Urumqi 830000, China; Chn Energy Network Infomation Technology, Co., Ltd., Beijing 100011, China)

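Affiliations: Wudong Colliery, CHN Energy Xinjiang Energy Co., Ltd. (国家能源投资集团新疆能源有限责任公司乌东煤矿); Chn Energy Network Infomation Technology (Beijing) Co., Ltd. (国能网信科技(北京)有限公司)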

Source: Computer Science (《计算机科学》), indexed in CSCD and the PKU Core Journals list, 2023, Issue S02, pp. 395-401 (7 pages)

Keywords: Cross-view geo-localization; Inverse polar transform; NetVLAD; Multi-scale attention

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
