Cross-modal pedestrian retrieval based on multi-scale feature enhancement and alignment (基于多尺度特征增强与对齐的跨模态行人检索)
Abstract: Cross-modal pedestrian retrieval faces two problems: extracting effective fine-grained features from images and text, and aligning images with natural-language descriptions. To address them, this paper proposes a cross-modal pedestrian retrieval model based on multi-scale feature enhancement and alignment. The model builds on a multimodal pre-trained model and adds a text-guided masked image modeling auxiliary task that drives full cross-modal interaction, which strengthens the model's ability to learn local image details without explicit annotations. To address identity confusion among pedestrian images, a global image feature matching auxiliary task guides the model toward identity-relevant visual features. Experiments on the public CUHK-PEDES, ICFG-PEDES, and RSTPReid datasets show that the proposed model outperforms existing mainstream models, reaching Rank-1 accuracy (first hit rate) of 72.47%, 62.71%, and 59.25%, respectively.
Authors: XU Ling (徐领); MIAO Yi (缪翌); ZHANG Weifeng (张卫锋) (School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China; School of Information Science and Engineering, Jiaxing University, Jiaxing 314001, China)
Source: Modern Electronics Technique (《现代电子技术》), PKU Core journal, 2024, No. 22, pp. 44-50 (7 pages)
Keywords: cross-modal pedestrian retrieval; multi-scale feature enhancement; multimodal alignment; CLIP; image mask; cross-modal interaction; cross attention
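
The abstract reports results as a first hit rate (Rank-1 accuracy) for text-to-image retrieval. This record does not include the evaluation code, so the following is only a minimal sketch of how such a metric is commonly computed. It assumes cosine similarity between text and image embeddings. The function names and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank1_accuracy(text_feats, text_ids, image_feats, image_ids):
    """Fraction of text queries whose most similar gallery image has the same identity.

    Assumption: similarity is cosine similarity over precomputed embeddings;
    the paper may use a different scoring function.
    """
    hits = 0
    for query, query_id in zip(text_feats, text_ids):
        sims = [cosine(query, gallery) for gallery in image_feats]
        best = max(range(len(sims)), key=sims.__getitem__)  # index of top-ranked image
        hits += image_ids[best] == query_id
    return hits / len(text_feats)


# Hypothetical toy embeddings: three text queries, three gallery images.
text_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_ids = [0, 1, 2]
image_feats = [[0.9, 0.1], [0.2, 0.8], [1.0, -1.0]]
image_ids = [0, 1, 2]

score = rank1_accuracy(text_feats, text_ids, image_feats, image_ids)
print(f"Rank-1 accuracy: {score:.4f}")
```

In this toy example the first two queries retrieve an image of the correct identity and the third does not, so the score is 2/3. A real evaluation would use embeddings produced by the trained model on the test split of each dataset.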