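Abstract: To address the problem that the RGB (Red Green Blue) modality and the thermal modality represent information in inconsistent forms, so that feature information cannot be effectively mined and fused, a new joint attention-enhanced network, FCNet (Feature Sharpening and Cross-modal Feature Fusion Net), is proposed. First, a dual-dimension attention mechanism improves the feature-mapping capability of the image. Next, a cross-modal feature fusion mechanism captures the target region. Finally, a layer-by-layer decoding structure removes background interference and refines the detected target. Experimental results show that the improved algorithm uses fewer parameters and less computation time, and that its overall detection performance exceeds that of existing multi-modal detection models.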
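The first two stages named in the abstract above (attention-based feature sharpening, then cross-modal fusion) can be illustrated with a minimal pure-Python sketch. This is not the FCNet implementation: the abstract does not say which two dimensions the attention covers or how the modalities are combined, so the sketch assumes channel and spatial attention with parameter-free sigmoid gates, and a sum-plus-product fusion. All function names are hypothetical, and the layer-by-layer decoder is not covered.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale each channel by a gate from its global average (assumed form)."""
    out = []
    for ch in feat:  # feat is C x H x W, stored as nested lists
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        gate = sigmoid(mean)
        out.append([[v * gate for v in row] for row in ch])
    return out


def spatial_attention(feat):
    """Scale each pixel by a gate from the mean over channels (assumed form)."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    gates = [[sigmoid(sum(feat[k][i][j] for k in range(c)) / c)
              for j in range(w)] for i in range(h)]
    return [[[feat[k][i][j] * gates[i][j] for j in range(w)]
             for i in range(h)] for k in range(c)]


def sharpen(feat):
    """Apply the two attention steps in sequence: channel, then spatial."""
    return spatial_attention(channel_attention(feat))


def fuse(rgb, thermal):
    """Hypothetical cross-modal fusion: element-wise sum plus product."""
    a, b = sharpen(rgb), sharpen(thermal)
    return [[[x + y + x * y for x, y in zip(ra, rb)]
             for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]
```

Calling `fuse(rgb, thermal)` on two feature maps of identical shape returns a fused map of the same shape. The product term emphasizes locations where both modalities respond, while the sum keeps responses that appear in only one of them.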
Funding: Supported in part by the National Natural Science Foundation of China (Nos. U23A20384 and 62022021), in part by the Joint Fund of Ministry of Education for Equipment Pre-research (No. 8091B032155), in part by the National Defense Basic Scientific Research Program (No. WDZC20215250205), and in part by the Central Guidance on Local Science and Technology Development Fund of Liaoning Province (No. 2022JH6/100100026).
Abstract: Visual object tracking, a fundamental task in computer vision, has drawn increasing attention in recent years. To extend the range of tracking applications, researchers have been introducing information from multiple modalities to handle specific scenes, and emerging methods and benchmarks offer promising research prospects. To provide a thorough review of multi-modal tracking, different aspects of multi-modal tracking algorithms are summarized under a unified taxonomy, with a specific focus on visible-depth (RGB-D) and visible-thermal (RGB-T) tracking. Subsequently, a detailed description of the related benchmarks and challenges is provided. Extensive experiments were conducted to analyze the effectiveness of trackers on five datasets: PTB, VOT19-RGBD, GTOT, RGBT234, and VOT19-RGBT. Finally, various future directions, including model design and dataset construction, are discussed from different perspectives for further research.