Abstract
Cross-modal person re-identification between the infrared and visible domains is extremely important for nighttime surveillance applications. On the one hand, beyond the cross-modal discrepancy caused by different camera spectra, visible-infrared person re-identification (VI-ReID) also suffers from large cross-modal and intra-modal variations caused by different camera views and person poses; on the other hand, existing VI-ReID methods tend to learn global representations, which have limited discriminative power and weak robustness to noisy images. This paper proposes a novel three-attention aggregation network (TAANet) that mines intra-modal hierarchical and cross-modal graph-level contextual cues for VI-ReID. An intra-modal local attention weighting module is proposed to extract discriminative local aggregation features by imposing domain knowledge on channel and local relation mining. To enhance robustness to noisy samples, an improved triplet loss combined with a center loss is introduced; by taking the distance to the nearest different class into account, it keeps a margin between classes and improves the discriminability of the learned features. Extensive experiments show that TAANet outperforms state-of-the-art methods in a variety of settings.
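The abstract only names the intra-modal local attention weighting module without giving its design. As an illustration, the minimal PyTorch sketch below assumes one common realization: horizontal part pooling with an SE-style channel gate (the "channel relation mining") and learned part weights (the "local relation mining"). The class name, num_parts, and reduction are hypothetical, not the paper's.

```python
import torch
import torch.nn as nn

class LocalAttentionWeighting(nn.Module):
    """Sketch of an intra-modal local attention weighting module.

    Assumed layout (not from the paper): the backbone feature map is split
    into `num_parts` horizontal stripes; each stripe is pooled into a local
    feature, re-weighted by SE-style channel attention, and the stripes are
    aggregated with softmax-normalized part weights.
    """

    def __init__(self, channels: int, num_parts: int = 6, reduction: int = 16):
        super().__init__()
        self.num_parts = num_parts
        # Channel relation mining: squeeze-and-excitation style gating.
        self.channel_gate = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )
        # Local relation mining: one learned attention logit per body part.
        self.part_logits = nn.Parameter(torch.zeros(num_parts))

    def forward(self, fmap: torch.Tensor) -> torch.Tensor:
        # fmap: (B, C, H, W) backbone feature map.
        stripes = fmap.chunk(self.num_parts, dim=2)
        # Average-pool each stripe to a (B, C) local feature -> (B, P, C).
        locals_ = torch.stack([s.mean(dim=(2, 3)) for s in stripes], dim=1)
        # Re-weight the channels of every local feature.
        locals_ = locals_ * self.channel_gate(locals_)
        # Aggregate parts with softmax-normalized attention weights.
        part_w = torch.softmax(self.part_logits, dim=0)  # (P,)
        return (locals_ * part_w.view(1, -1, 1)).sum(dim=1)  # (B, C)
```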
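Likewise, the exact formulation of the improved triplet loss is not given. A plausible reading of "the distance to the nearest different class" is batch-hard mining: for each anchor, the farthest same-identity sample and the nearest different-identity sample form the triplet, with a center loss pulling features toward class centers. The function name, margin, and center_w below are illustrative; PK sampling (each identity appears more than once per batch) is assumed.

```python
import torch
import torch.nn.functional as F

def triplet_center_loss(feats, labels, centers, margin=0.3, center_w=0.01):
    """Sketch of a batch-hard triplet loss combined with a center loss.

    feats:   (B, D) embeddings; labels: (B,) identity ids;
    centers: (num_classes, D) learnable class centers trained jointly.
    """
    dist = torch.cdist(feats, feats)                    # (B, B) pairwise distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)   # (B, B) same-identity mask
    eye = torch.eye(len(labels), dtype=torch.bool, device=feats.device)

    # Hardest positive: farthest sample with the same identity (self excluded).
    d_ap = dist.masked_fill(~same | eye, float('-inf')).max(dim=1).values
    # Hardest negative: nearest sample with a different identity.
    d_an = dist.masked_fill(same, float('inf')).min(dim=1).values
    triplet = F.relu(d_ap - d_an + margin).mean()

    # Center loss: pull each feature toward its own class center.
    center = ((feats - centers[labels]) ** 2).sum(dim=1).mean()
    return triplet + center_w * center
```

In such a setup, centers would typically be an nn.Parameter of shape (num_classes, D) optimized with a small learning rate alongside the network, and the margin keeps the nearest negative class at least `margin` farther away than the hardest positive.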
Authors
HUANG Pan; ZHU Songhao; LIANG Zhiwei (College of Automation & College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)
Source
Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition)
Peking University Core Journal (北大核心)
2021, No. 5, pp. 101-112 (12 pages)
Funding
Supported by the Natural Science Foundation of Nanjing University of Posts and Telecommunications (NY219107).
Keywords
person re-identification
local feature
graph-structured attention