Abstract
To fuse the two modalities effectively on top of a conventional tracking algorithm in visible and thermal infrared (RGB-T) tracking, this study proposes an RGB-T tracking algorithm based on attention interaction. The algorithm uses the attention mechanism to enhance and fuse the image features of the visible and infrared modalities: a self-feature enhancement encoder strengthens the features of each single modality, and a cross-feature interaction decoder interactively fuses the enhanced features of the two modalities. Both the encoder and the decoder use two layers of attention modules. To reduce model complexity, the conventional attention module is simplified by replacing its fully connected layers with 1×1 convolutions. In addition, features from multiple convolutional layers are fused hierarchically to fully exploit the detail and semantic information in each layer. Comparative experiments were conducted on three datasets: GTOT, RGBT234, and LasHeR. The results show that the proposed algorithm performs well and, in particular, achieves the best tracking results on the two large-scale datasets, RGBT234 and LasHeR, which supports the effectiveness of the attention mechanism in RGB-T tracking.
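The encoder/decoder structure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give layer sizes, the attention formula, or the final fusion rule, so the scaled dot-product attention, the residual connections, the summation of the two modalities, the random weights, and all function names (`conv1x1`, `attention`, `fuse_level`, `init_params`) are assumptions made for illustration only.

```python
import numpy as np

def conv1x1(x, weight):
    """1x1 convolution as a per-pixel linear map.
    x: (C_in, H, W), weight: (C_out, C_in) -> (C_out, H, W)."""
    return np.einsum("oc,chw->ohw", weight, x)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(query_map, context_map, params):
    """Attention whose Q/K/V projections are 1x1 convolutions
    (standing in for the fully connected layers of a standard module).
    Assumed form: scaled dot-product attention plus a residual."""
    wq, wk, wv = params
    c, h, w = query_map.shape
    q = conv1x1(query_map, wq).reshape(c, -1).T    # (HW, C)
    k = conv1x1(context_map, wk).reshape(c, -1).T  # (HW, C)
    v = conv1x1(context_map, wv).reshape(c, -1).T  # (HW, C)
    weights = softmax(q @ k.T / np.sqrt(c))        # (HW, HW)
    out = (weights @ v).T.reshape(c, h, w)
    return query_map + out

def init_params(c, rng):
    """Random stand-ins for learned Q/K/V weights (hypothetical)."""
    return tuple(rng.standard_normal((c, c)) / np.sqrt(c) for _ in range(3))

def fuse_level(rgb, tir, enc_rgb, enc_tir, dec_rgb, dec_tir):
    """Fuse one convolutional level of RGB and thermal features."""
    # Self-feature enhancement encoder: each modality attends to itself.
    for p_rgb, p_tir in zip(enc_rgb, enc_tir):
        rgb = attention(rgb, rgb, p_rgb)
        tir = attention(tir, tir, p_tir)
    # Cross-feature interaction decoder: each modality attends to the other.
    for p_rgb, p_tir in zip(dec_rgb, dec_tir):
        rgb, tir = attention(rgb, tir, p_rgb), attention(tir, rgb, p_tir)
    return rgb + tir  # assumed fusion rule: element-wise sum

# Hierarchical fusion: the same procedure applied at several feature levels.
rng = np.random.default_rng(0)
level_shapes = [(8, 16, 16), (16, 8, 8)]  # hypothetical (C, H, W) per level
fused = []
for c, h, w in level_shapes:
    rgb = rng.standard_normal((c, h, w))
    tir = rng.standard_normal((c, h, w))
    layers = [[init_params(c, rng) for _ in range(2)] for _ in range(4)]
    fused.append(fuse_level(rgb, tir, *layers))
```

Each level uses two attention layers in the encoder and two in the decoder, matching the abstract. Because a 1×1 convolution applies the same small weight matrix at every spatial position, it can project the feature map without flattening it into a fully connected layer.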
Authors
王暐
付飞亚
雷灏
唐自力
WANG Wei; FU Feiya; LEI Hao; TANG Zili (The 63870 Unit of PLA, Weinan 714299, China)
Source
《光学精密工程》
EI
CAS
CSCD
Peking University Core Journal
2024, Issue 3, pp. 435-444 (10 pages)
Optics and Precision Engineering
Funding
Research on XX recognition performance evaluation technology.
Keywords
RGB-T tracking
attention mechanism
multimodal feature fusion
feature enhancement