Visual place recognition method based on parallel omni-dimensional dynamic attention mechanism

Abstract: To address the low robustness of visual place recognition under environmental changes such as weather, season, and lighting, we propose parallel omni-dimensional dynamic attention (POD-Attention), a multi-dimensional attention mechanism that makes place recognition feature descriptors more robust to such changes. To explore the convolutional kernels dynamically and at fine granularity across all of their dimensions, and to strengthen the feature extraction network's ability to extract and learn invariant features such as buildings, an omni-dimensional dynamic convolutional block adds complementary attention along every dimension of the kernel space: input channels, output channels, the spatial extent of the kernel, and the number of kernels. The 1×1 convolution, the Skip Squeeze-and-Excitation (SSE) module, and the omni-dimensional dynamic convolutional block are fused in parallel, which both increases feature extraction speed and enlarges the receptive field of the visual place recognition network, further improving recognition accuracy. Experiments on public datasets show that a visual place recognition method based on VGG16 and Patch-NetVLAD feature aggregation, once improved with the POD attention mechanism, achieves a 9.7% increase in Recall@1 on the Nordland dataset and a 1.8% increase on the Mapillary Street-Level Sequences dataset. These results demonstrate that the proposed POD attention mechanism improves the robustness of visual place recognition under varying environmental conditions, laying a foundation for more accurate visual localization and map construction in visual SLAM.
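The abstract reports its gains in Recall@1 without defining the metric. As commonly used in place recognition evaluation, Recall@N is the fraction of query images for which at least one of the top N retrieved reference images is a correct match. The sketch below illustrates that definition only; it is not the authors' evaluation code, and the function name `recall_at_n` and the toy rankings are hypothetical.

```python
from typing import Sequence, Set


def recall_at_n(
    ranked_refs: Sequence[Sequence[int]],
    ground_truth: Sequence[Set[int]],
    n: int = 1,
) -> float:
    """Fraction of queries with at least one correct reference in the top n."""
    if len(ranked_refs) != len(ground_truth):
        raise ValueError("need one ranking and one ground-truth set per query")
    if not ranked_refs:
        raise ValueError("no queries given")
    hits = 0
    for ranking, positives in zip(ranked_refs, ground_truth):
        # A query counts as a hit if any of its top-n retrievals is a true match.
        if any(ref in positives for ref in ranking[:n]):
            hits += 1
    return hits / len(ranked_refs)


# Hypothetical toy data: 4 queries, each with reference image ids
# ranked by descriptor similarity (best match first).
rankings = [[3, 7, 1], [5, 2, 9], [4, 4, 0], [8, 6, 2]]
truth = [{3}, {2}, {0}, {1}]

print(recall_at_n(rankings, truth, n=1))  # 0.25: only the first query hits at rank 1
print(recall_at_n(rankings, truth, n=3))  # 0.75: the last query has no match in its top 3
```

In this definition a query can have several acceptable reference images, which is why the ground truth is a set per query rather than a single id.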

Authors: LIU Peijin; LIU Shujie; HE Lin; PENG Lijun; FU Xuefeng (School of Mechanical and Electrical Engineering, Xi'an University of Architecture & Technology, Xi'an 710055, China; Faculty of Science, Xi'an University of Architecture & Technology, Xi'an 710055, China)

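Affiliations: School of Mechanical and Electrical Engineering, Xi'an University of Architecture & Technology; Faculty of Science, Xi'an University of Architecture & Technology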

Source: Chinese Journal of Liquid Crystals and Displays (《液晶与显示》), 2024, No. 9, pp. 1233-1242 (10 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.

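Funding: Key Research and Development Program of Shaanxi Province (No. 2022GY-134); Special Scientific Research Project of the Shaanxi Provincial Department of Education (No. 21JK0732); Natural Science Special Project of Xi'an University of Architecture & Technology (No. ZR19058).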

Keywords: visual place recognition; environmental robustness; deep learning; parallel omni-dimensional dynamic attention; parallel strategy

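Classification: TP391.9 [Automation and Computer Technology: Computer Application Technology]; TP391.41 [Automation and Computer Technology: Computer Application Technology]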
