Pedestrian attribute recognition method based on progressive iterative optimization (渐进式迭代优化的行人属性识别)

Abstract

Objective: Pedestrian attribute recognition currently suffers from severely unbalanced sample distributions in some attribute categories. To address this problem, a pedestrian attribute recognition method based on progressive iterative optimization is proposed.

Method: First, for the unbalanced categories, a masked autoencoder (MAE) is used for data augmentation, yielding the balanced attributes-data generation model (BA-DGM), which realizes transfer learning and knowledge enhancement from a general large model to a specialized small task. BA-DGM masks the original pedestrian images at a random masking ratio and reconstructs them to generate new images for the categories with few samples. This mines latent information such as the topological relationships within the visible area, so the pedestrian images generated from the latent features are more robust. It also shows that the autoencoder can effectively learn a universal feature representation of the target pedestrian, including shared features such as the relationships among the pedestrian's key body parts.

Second, a discrimination model filters the newly generated samples for consistency, and a heuristic attention mechanism is realized through the adversarial interplay between the discrimination model and the generation model, in the manner of generative adversarial networks (GANs). The resulting attention features-data discrimination model (AF-DDM) allows diverse samples to be generated while the key features of each attribute are preserved, which improves the interpretability of the recognition model. The filtered data are then used for training, so that the model learns effective attribute-related features. For training, the discrimination model uses a 50-layer residual network as its backbone and is trained on the original attribute recognition dataset within a multi-label classification framework. For inference, the attribute labels are divided into two groups: key attribute labels and other related attribute labels. A newly generated sample is retained only if, for the key attribute labels, the predictions of the discrimination model agree with the original labels with high confidence; otherwise it is filtered out.

Finally, data generation and data discrimination alternate in a cyclic iteration, which progressively optimizes both the pedestrian attribute recognition model and the data. On the balanced sample data, a knowledge distillation framework fuses the discrimination models from different iteration rounds, giving the progressive iterations-distillation fusion model (PI-DFM) and further improving generalization. After multiple iterations, the attribute discrimination models from the individual rounds serve as teacher models, and the category-balanced attribute recognition dataset serves as the training data. Because these teacher models are trained on datasets with different sample proportions, they complement one another. The student network has the same structure as the teacher network, and the Kullback-Leibler (KL) divergence between the student output and the teacher output is used as the distillation loss. In large-scale practical applications, the sample proportions of test data and training data may differ; distilling from teachers trained on data with different sample proportions improves the generalization of the model in open, uncertain scenarios.

Result: Experiments show that the proposed progressive optimization method effectively improves model accuracy on four mainstream evaluation datasets. Two metrics are reported: 1) the mean accuracy (mA) over all attributes and 2) the F1 score over all samples, that is, the harmonic mean of mean precision and mean recall. On the richly annotated pedestrian v2 (RAPv2) dataset, with model complexity unchanged, mA and the average F1 score increase by about 5.0% and about 1.7%, respectively, compared with the best published pedestrian attribute recognition model. After several rounds of cyclic iteration, the number of unbalanced categories in the original data is reduced to zero, so the dataset itself is progressively optimized. In the ablation studies, new samples are randomly generated for each positive sample image, and the discrimination model then filters out inconsistent samples. The spatial distribution probability of the preserved details is analyzed experimentally from the masked regions of the filtered samples. With the heuristic attention mechanism, the data discrimination model better retains the features relevant to the key attributes of the target pedestrian, which indicates that the interpretability of the discrimination model can be further improved by mining the distribution of the features related to each attribute.

Conclusion: The proposed progressive iterative optimization strategy is complementary to existing improvement methods and helps to further improve the accuracy of the recognition model. To better model the relationships among multiple pedestrian attributes and further improve interpretability, future work can combine the universal feature representation of the MAE model with prior knowledge such as human skeleton structure.
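The consistency filtering and the KL-based distillation loss described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the attribute names, the confidence threshold of 0.9, the per-attribute Bernoulli form of the KL divergence, the direction KL(teacher || student), and the plain averaging over teachers are all assumptions made for illustration.

```python
import math

def is_consistent(original_labels, predicted_probs, key_attrs, threshold=0.9):
    """Keep a generated sample only if every key attribute is predicted with
    the same label as the original, with confidence >= threshold (assumed)."""
    for attr in key_attrs:
        p = predicted_probs[attr]        # predicted probability that attr is present
        label = original_labels[attr]    # original binary label (0 or 1)
        confidence = p if label == 1 else 1.0 - p
        if confidence < threshold:
            return False
    return True

def bernoulli_kl(t, s, eps=1e-7):
    """KL(teacher || student) for one attribute treated as a Bernoulli variable."""
    t = min(max(t, eps), 1.0 - eps)      # clamp to avoid log(0)
    s = min(max(s, eps), 1.0 - eps)
    return t * math.log(t / s) + (1.0 - t) * math.log((1.0 - t) / (1.0 - s))

def distillation_loss(student_probs, teacher_probs_list):
    """Mean KL over attributes, then averaged over all teachers (assumed fusion rule)."""
    total = 0.0
    for teacher_probs in teacher_probs_list:
        per_attr = [bernoulli_kl(t, s) for t, s in zip(teacher_probs, student_probs)]
        total += sum(per_attr) / len(per_attr)
    return total / len(teacher_probs_list)

# Hypothetical example: "hat" is the key attribute of a minority class.
original = {"hat": 1, "backpack": 0}
kept = is_consistent(original, {"hat": 0.95, "backpack": 0.40}, key_attrs=["hat"])
dropped = is_consistent(original, {"hat": 0.60, "backpack": 0.05}, key_attrs=["hat"])
loss = distillation_loss([0.5, 0.5], [[0.9, 0.1], [0.8, 0.3]])
```

Only the key attributes decide whether a generated sample is kept, which mirrors the split into key and other related attribute labels; how the other labels are handled is not specified in the abstract and is left out here.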

Authors: Ding Zhengyan (丁正彦); Shang Yanfeng (尚岩峰); Zhang Chongyang (张重阳) (Research Center on Internet of Things, the Third Research Institute of the Ministry of Public Security, Shanghai 201204, China; School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China)

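Affiliations: Research Center on Internet of Things, the Third Research Institute of the Ministry of Public Security; School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University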

Source: Journal of Image and Graphics (《中国图象图形学报》), CSCD, Peking University Core Journal, 2023, No. 5, pp. 1487-1498 (12 pages)

Funding: Shanghai Sailing Program for young science and technology talents (20YF1409300).

Keywords: pedestrian attribute recognition; unbalanced sample; progressive iteration; masked autoencoder; transfer learning; consistency filtering; knowledge distillation

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

