Abstract
Text-to-Image Person Re-identification (TIPR) aims to retrieve a target person from a pedestrian gallery given a text description; its main challenge is learning features that are robust to free-view images (pose, illumination, and camera viewpoint) and free-form texts. However, because pedestrian attributes are insufficiently mined from both text descriptions and pedestrian images, fine-grained differences in detail degrade retrieval performance from text descriptions to pedestrian images. This study therefore proposes TIPR based on Attribute Dependency Augmentation (ADA). First, dependency relations are parsed from the text description and converted into a dependency matrix. Second, a self-attention-based attribute intervention module is designed to fuse the text features with the dependency matrix, yielding attribute-augmented text features that, after intervention, attend more closely to attribute information. Finally, the text features and image features are trained jointly, making the whole network more sensitive to attribute mining. Experiments on two datasets, CUHK-PEDES and ICFG-PEDES, demonstrate the effectiveness of the proposed model.
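The abstract's two core steps (converting parsed dependency relations into a matrix, then using it to bias self-attention over text features) can be illustrated with a minimal sketch. The function names, the symmetric 0/1 matrix encoding, and the additive bias weight `alpha` are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def dependency_matrix(num_tokens, arcs):
    """Build a symmetric 0/1 dependency matrix from (head, dependent)
    index pairs produced by a dependency parser (assumed encoding)."""
    D = np.zeros((num_tokens, num_tokens))
    for head, dep in arcs:
        D[head, dep] = 1.0
        D[dep, head] = 1.0
    np.fill_diagonal(D, 1.0)  # each token attends to itself
    return D

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attribute_intervention(text_feats, dep, alpha=1.0):
    """Self-attention over token features where dependency arcs add a
    bias to the attention logits, so syntactically linked tokens
    (e.g. attribute words and their heads) attend to each other more."""
    d = text_feats.shape[-1]
    scores = text_feats @ text_feats.T / np.sqrt(d)
    attn = softmax(scores + alpha * dep)
    return attn @ text_feats  # attribute-augmented text features
```

For the phrase "red shirt", the arc (shirt → red) would place a 1 in the matrix, raising the mutual attention between the attribute word and its noun during fusion.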
Authors
XIA Wei; YUAN Xinpan (Hunan University of Technology, Zhuzhou, Hunan Province, 412000, China)
Source
Science & Technology Information (《科技资讯》), 2024, No. 8, pp. 12-15 (4 pages)
Funding
Hunan Provincial Natural Science Foundation (2022): "Research on Unsupervised Image Similarity Metric Learning with Text Fusion" (Grant No. 2022JJ30231)
Scientific Research Project of the Hunan Provincial Department of Education (2022): "Research and Application of Disambiguation and Fusion Algorithms for Complex Semantic Modalities Oriented to Visual Metrics" (Grant No. 22B0559)
Keywords
Text-to-Image Person Re-identification
Self-attention mechanism
Syntactic dependency
Free view