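Abstract
We propose a visible and thermal infrared (RGBT) tracking method based on cross-modal optical information interaction and dynamic template updating. A Siamese tracker, which balances tracking speed and accuracy, is chosen as the basic framework, and a feature interaction module is designed to reconstruct the information proportions of the different modalities and strengthen information exchange between them. On this basis, a prediction network is built on the anchor-free idea to improve the tracker's flexibility and generality. A dynamic template update strategy is also proposed, which strengthens the model's ability to adapt to a changing target by dynamically updating the tracking template. Comparative experiments on three benchmark datasets, including GTOT, show that the proposed method significantly improves tracking performance in complex environments.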
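The anchor-free prediction idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the head's output format, so everything below is an assumption: the per-position `(left, top, right, bottom)` distance encoding, the `stride` parameter, and the function name `decode_anchor_free` are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in search-region pixels


def decode_anchor_free(
    scores: List[List[float]],
    offsets: List[List[Tuple[float, float, float, float]]],
    stride: int = 8,
) -> Tuple[Box, float]:
    """Pick the highest-scoring position and decode its box.

    scores[i][j]  : foreground score at grid cell (row i, column j)
    offsets[i][j] : assumed (left, top, right, bottom) distances, in pixels,
                    from the cell's point to the four box edges
    stride        : assumed spacing between grid points in the search region
    """
    best_i, best_j, best_score = 0, 0, float("-inf")
    for i, row in enumerate(scores):
        for j, s in enumerate(row):
            if s > best_score:
                best_i, best_j, best_score = i, j, s

    # Map the winning grid cell back to a point in the search region.
    cx = best_j * stride + stride / 2
    cy = best_i * stride + stride / 2

    # No predefined anchor boxes: the box is built directly from the
    # four regressed distances around that point.
    left, top, right, bottom = offsets[best_i][best_j]
    return (cx - left, cy - top, cx + right, cy + bottom), best_score
```

Because each position regresses its own box, no anchor sizes or aspect ratios have to be chosen in advance, which is one plausible reading of the flexibility the abstract attributes to the anchor-free design.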
Objective: RGB and thermal infrared (RGBT) tracking exploits the complementary advantages of different optical modalities and offers an effective solution to target tracking in complex environments. However, many tracking algorithms neglect information exchange between modalities, which limits their performance. In addition, because the tracking template remains fixed, existing Siamese-network trackers adapt poorly to changes in target appearance, which leads to tracking drift. Improving tracker performance in complex environments therefore remains challenging.
Methods: The proposed algorithm adopts a Siamese network tracker as its foundational framework and introduces a feature interaction module that enhances inter-modal information exchange by reconstructing the information proportions of the different modalities. Based on the anchor-free concept, a prediction network is constructed that directly performs classification and bounding-box regression at each position point in the search region. To address the mismatch between target and template that arises during Siamese tracking, we propose a template update strategy that dynamically updates the tracking template using the predicted result from the previous frame.
Results and Discussions: Qualitative and quantitative experiments compare SiamCTU with advanced RGBT tracking models, and ablation experiments are analyzed. The proposed tracker is evaluated against state-of-the-art trackers on three benchmark datasets (GTOT, RGBT234, and LasHeR). Figs. 6, 7, and 9 show the quantitative comparisons between SiamCTU and advanced RGBT tracking algorithms on the three datasets. The results on all three benchmark datasets demonstrate the strong tracking performance of SiamCTU and the effectiveness of the proposed method. Specifically, on the GTOT and LasHeR datasets, the proposed algorithm ranks at the top in both PR and SR. Fig. 8 and Table 1 present the attribute-based results on the GTOT and RGBT234 datasets, respectively. SiamCTU performs well under a variety of challenge attributes, which suggests that the tracker handles complex tracking scenarios effectively. To illustrate the tracker's performance more intuitively, the tracking results are visualized in Fig. 10. In the LightOcc sequence [Fig. 10(a)], the proposed algorithm with the template update strategy tracks the target continuously and stably under occlusion and low illumination. Under significant scale variation [Fig. 10(b)], the proposed tracker outperforms the comparison trackers, which demonstrates the advantage of the anchor-free prediction network. The visual results in Figs. 10(c) and 10(d) show that the proposed tracker leverages the complementary advantages of the RGB and T modalities and reduces interference from similar objects. The tracking efficiency comparison on the GTOT dataset (Table 2) indicates that SiamCTU significantly improves tracking accuracy with minimal loss of tracking speed. The proposed tracker is also both faster and more precise than the advanced MDNet-based tracker. In the ablation experiments (Table 3), the proposed tracker surpasses the baseline tracker, which shows that each designed module contributes substantially and that together they improve the tracker's ability to handle complex tracking scenarios. Specifically, when the feature interaction module is removed, the overall performance of SiamCTU decreases by 3.1% on the more complex RGBT234 dataset. Varying the template update parameter to study its influence on tracking performance (Table 4) shows that, with an appropriate value of λ as the update parameter, the feature-level template update method significantly enhances the tracker's performance.
Conclusions: To address target tracking in complex environments, we propose a cross-modal optical information interaction method for RGBT target tracking. The tracking model adopts a Siamese network as its foundational framework and incorporates a feature interaction module. This module enhances inter-modal information exchange by reconstructing the information proportions of the different optical modalities, which mitigates the effect of complex backgrounds on tracking performance. Building on the relationship between the tracker's initial template and the online template, we introduce a dynamic template update strategy. This strategy updates the tracking template using predicted results, capturing the real-time state of the target and improving the algorithm's robustness. Evaluation on three benchmark datasets (GTOT, RGBT234, and LasHeR) demonstrates that the proposed method surpasses current advanced RGBT tracking methods in tracking accuracy. It also meets real-time tracking requirements and has potential for broad application in optical information detection, perception, and recognition of targets in complex environments.
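The feature-level template update controlled by λ can be sketched as follows. The abstract does not state the update rule, so the linear blend of the initial template with the online template (features taken from the previous frame's prediction) is an assumption for illustration. The function name `update_template` and the flat-list feature representation are hypothetical.

```python
from typing import List


def update_template(
    initial: List[float],
    online: List[float],
    lam: float,
) -> List[float]:
    """Blend the initial template with the online template.

    initial : features of the first-frame template (flattened)
    online  : features extracted from the previous frame's predicted box
    lam     : update parameter; 0 keeps the initial template unchanged,
              1 replaces it entirely with the online template
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(initial) != len(online):
        raise ValueError("templates must have the same length")
    # Assumed rule: element-wise convex combination of the two templates.
    return [(1.0 - lam) * a + lam * b for a, b in zip(initial, online)]


# Usage: a small lam keeps the template close to the reliable first frame.
blended = update_template([1.0, 0.0], [0.0, 1.0], lam=0.25)
```

Under this assumed rule, a small λ limits the influence of a drifted prediction, while a larger λ follows appearance changes more quickly. That trade-off would be consistent with the abstract's remark that an appropriate value of λ is needed.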
Authors
陈建明
李定鲣
曾祥津
任振波
邸江磊
秦玉文
Chen Jianming; Li Dingjian; Zeng Xiangjin; Ren Zhenbo; Di Jianglei; Qin Yuwen (Key Laboratory of Photonic Technology for Integrated Sensing and Communication, Ministry of Education, Guangdong Provincial Key Laboratory of Information Photonics Technology, School of Information Engineering of Guangdong University of Technology, Institute of Advanced Photonics Technology, Guangzhou 510006, Guangdong, China; Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai 519082, Guangdong, China; Key Laboratory of Light-Field Manipulation and Information Acquisition, Ministry of Industry and Information Technology, Shaanxi Key Laboratory of Photonics Technology for Information, School of Physical Science and Technology, Northwestern Polytechnical University, Xi’an 710129, Shaanxi, China)
Source
《光学学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2024, No. 7, pp. 101-115 (15 pages)
Acta Optica Sinica
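Funding
National Natural Science Foundation of China (62075183, 62275218)
Guangdong Introducing Innovative and Entrepreneurial Teams of "The Pearl River Talent Recruitment Program" (2021ZT09X044, 2019ZT08X340)
Fundamental Research Funds for the Central Universities (D5000230117).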
Keywords
machine vision
computer vision
object tracking
Siamese network
template update