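Multi-network mutual learning for pedestrian re-identification combined with spatiotemporal distance

Original title: 结合时空距离的多网络互学习行人重识别 (published English title: Spatiotemporal distance and multiple networks mutual learning-relevant pedestrian re-identification)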

Abstract

Objective: In real-world pedestrian recognition scenarios, obtaining accurate annotations is labor-intensive, so unsupervised domain adaptation has become a promising research direction for pedestrian re-identification. Such methods usually generate pseudo labels by clustering, and these labels are often noisy. In addition, a good ranking algorithm is key to recognition performance during pedestrian search, but conventional re-ranking is computationally expensive, which limits its use in real scenarios. To address both problems, this paper proposes a framework that combines multiple networks with camera-split training and uses spatiotemporal information to optimize the ranking.

Method: The networks are pre-trained with supervision on source-domain data and then trained without supervision on unlabeled target-domain samples through deep mutual learning among multiple network models, which improves generalization. During training, the data are processed per camera to reduce cross-camera effects and improve pseudo-label quality. In the ranking and matching stage, spatiotemporal information is used to optimize the ranking and further improve matching performance.

Result: The method is tested and compared on two cross-domain settings. With DukeMTMC-ReID (Duke multi-tracking multi-camera re-identification) as the source domain and Market-1501 as the target domain, the mean average precision (mAP) and Rank1 are 82.5% and 95.3%, respectively. With Market-1501 as the source domain and DukeMTMC-ReID as the target domain, mAP and Rank1 are 75.3% and 90.2%, respectively.

Conclusion: The proposed camera-split network mutual learning model with spatiotemporal distance ranking improves pseudo-label accuracy and optimizes the matching ranking. Compared with other optimization algorithms, it greatly reduces computation and further improves pedestrian re-identification performance.

Extended abstract

Objective: Pedestrian re-identification focuses on detecting and matching target pedestrians in real time. Because annotating accurate labels is labor-intensive, unsupervised domain adaptation has become a promising solution. Such methods rely on clustering to generate pseudo labels, which introduces noise. Experimental analysis shows that cross-camera variation is one of the main sources of this noise. Existing feature-vector methods aim to reduce the cross-domain gap but struggle to exploit camera ID information effectively, so we design a camera module to address the cross-camera problem. In addition, features are often extracted by a single network. Experimental analysis indicates that the limited feature extraction ability of a single backbone network also affects final performance, so mutual learning is used to improve on the single-network setting. For pedestrian search, a good ranking algorithm also improves recognition performance. Conventional re-ranking is costly, which limits its use in real scenarios, so we optimize it using the spatiotemporal information in the dataset. The resulting method achieves the effect of conventional re-ranking with time and memory costs close to those of the original ranking. Overall, we develop a training framework that combines multiple networks with camera splitting and uses spatiotemporal information to optimize the ranking.

Method: First, to improve initial recognition performance, the network is pre-trained on the source-domain dataset with two loss functions: label-smoothed cross-entropy loss and triplet loss. Second, because a single backbone network extracts only one kind of feature, a single-network model cannot maintain good generalization in ever-changing real scenarios, so we design a mutual learning model to enhance robustness. A camera-split strategy is adopted to handle the recognition interference caused by cross-camera variation. For pseudo-label generation, the dataset is split by camera ID, each sample is fed to the different networks, and their output vectors are averaged. Additionally, we make full use of spatiotemporal information to optimize re-identification along another dimension, since the distribution of pedestrians provides prior information for recognition. For example, because images from the same camera with the same timestamp tend to be close to each other, we use the timestamp information of each image: the timestamps are one-hot encoded and concatenated with the feature vectors, which are then clustered to obtain pseudo labels. During training, to transfer knowledge between network models, the class predictions of each model serve as soft labels for training the other models. A temporally averaged model, updated iteratively during training, is added to the mutual learning module. It preserves a large amount of earlier information and thereby suppresses error amplification. We also design the loss functions for mutual learning: the conventional classification loss and triplet loss are modified so that the loss combines the pseudo labels with features from multiple backbone networks, and the feature distribution learned by each model is constrained by several models at once. For feature ranking, the conventional ranking algorithm is optimized using the characteristics of pedestrian re-identification and the spatiotemporal information of the dataset. Timestamp statistics are collected over the cameras in which samples with the same pseudo label appear, yielding a distribution of time differences between cameras, and a camera spatiotemporal score is defined to fine-tune the distances between features. This spatiotemporal re-ranking is efficient: it achieves an effect similar to conventional re-ranking at a time and memory cost close to that of the original ranking.

Result: The method is compared with 10 popular methods on two cross-domain settings. With Duke multi-tracking multi-camera re-identification (DukeMTMC-ReID) as the source domain and Market-1501 as the target domain, the mean average precision (mAP) reaches 82.5%, and Rank1 increases by 3.1% to 95.3%. With Market-1501 as the source domain and DukeMTMC-ReID as the target domain, mAP and Rank1 reach 75.3% and 90.2%, respectively.

Conclusion: The proposed camera-split mutual learning model with spatiotemporal distance ranking improves the accuracy of pseudo labels and optimizes the matching ranking. It requires less computation and further improves pedestrian re-identification performance.
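The spatiotemporal ranking step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the score formula, so the histogram binning, the function names `build_time_histogram`, `spatiotemporal_score` and `rank_gallery`, the `bin_width` and `weight` parameters, and the multiplicative fusion rule are all assumptions made for the example.

```python
import math
from collections import defaultdict
from itertools import combinations


def build_time_histogram(samples, bin_width):
    """Estimate the distribution of time gaps between camera pairs.

    samples: list of (pseudo_label, camera_id, timestamp) tuples.
    Returns {(cam_a, cam_b): {bin_index: probability}} with cam_a < cam_b.
    Only pairs that share a pseudo label and come from different cameras count.
    """
    by_label = defaultdict(list)
    for label, cam, t in samples:
        by_label[label].append((cam, t))

    counts = defaultdict(lambda: defaultdict(int))
    for members in by_label.values():
        for (cam_a, t_a), (cam_b, t_b) in combinations(members, 2):
            if cam_a == cam_b:
                continue
            pair = tuple(sorted((cam_a, cam_b)))
            counts[pair][int(abs(t_a - t_b) // bin_width)] += 1

    # Normalize the counts of each camera pair into probabilities.
    return {
        pair: {b: c / sum(bins.values()) for b, c in bins.items()}
        for pair, bins in counts.items()
    }


def spatiotemporal_score(hist, cam_q, t_q, cam_g, t_g, bin_width):
    """Probability of the observed time gap for this camera pair (0.0 if unseen)."""
    pair = tuple(sorted((cam_q, cam_g)))
    gap_bin = int(abs(t_q - t_g) // bin_width)
    return hist.get(pair, {}).get(gap_bin, 0.0)


def rank_gallery(query, gallery, hist, bin_width, weight=0.5):
    """Rank gallery ids by feature distance adjusted with the score.

    query: (feature, camera_id, timestamp)
    gallery: list of (item_id, feature, camera_id, timestamp)
    Assumed fusion rule: distance * (1 - weight * score), so a plausible
    time gap shrinks the distance and an implausible one leaves it unchanged.
    """
    q_feat, q_cam, q_t = query
    scored = []
    for item_id, feat, cam, t in gallery:
        dist = math.dist(q_feat, feat)  # Euclidean feature distance
        score = spatiotemporal_score(hist, q_cam, q_t, cam, t, bin_width)
        scored.append((dist * (1.0 - weight * score), item_id))
    return [item_id for _, item_id in sorted(scored)]
```

Because the adjustment only needs one histogram lookup per query-gallery pair, its cost stays close to that of plain distance ranking, which is the property the abstract emphasizes. Setting `weight=0.0` in this sketch recovers the unadjusted ranking.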

Authors: Li Kuan (李宽); Gong Xun (龚勋); Fan Jianfeng (樊剑锋)

Affiliations: Graduate School of Tangshan, Southwest Jiaotong University, Tangshan 063000, China; School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China; Engineering Research Center of Sustainable Urban Intelligent Transportation, Ministry of Education, Chengdu 611756, China; Manufacturing Industry Chains Collaboration and Information Support Technology Key Laboratory of Sichuan Province, Chengdu 610031, China

Source: Journal of Image and Graphics (《中国图象图形学报》), indexed in CSCD and the Peking University Core Journals list, 2023, No. 5, pp. 1409-1421 (13 pages)

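Funding: National Natural Science Foundation of China (61876158); Sichuan Province Key Research and Development Program (2023YFG0267); Fundamental Research Funds for the Central Universities, Science and Technology Innovation Project (2682021ZTPY030, 2682022KJ045).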

Keywords: pedestrian re-identification (re-ID); mutual learning; multiple cameras; cross domain; spatiotemporal distance

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

