
Text Pedestrian Cross-Modal Retrieval Based on Two-Way Convolutional Local Alignment (Cited by: 1)
Abstract: Text pedestrian cross-modal retrieval refers to retrieving images of the corresponding identity from a pedestrian image set given a natural-language description of the pedestrian. To address the insufficient discriminativeness of text features, this paper proposes combining the BERT model with a Text-CNN network in the text branch: BERT serves as the word-embedding tool, and the Text-CNN performs further feature extraction on the embedded text. In addition to global matching, the method also considers the influence of local features on retrieval. The proposed method is experimentally verified on the CUHK-PEDES dataset, and the results demonstrate its effectiveness and superiority.
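The text branch described in the abstract (BERT embeddings refined by a Text-CNN with max-over-time pooling) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the token embeddings are random stand-ins for BERT output, and the filter counts and window sizes are assumed for illustration only.

```python
import numpy as np

def text_cnn_features(embeddings, filters):
    """Kim-style Text-CNN over a sequence of word embeddings.

    embeddings: (seq_len, emb_dim) array, e.g. BERT token embeddings.
    filters: dict mapping window size -> weights of shape
             (n_filters, window, emb_dim).
    Returns a fixed-length text feature: ReLU conv responses,
    max-pooled over time and concatenated across window sizes.
    """
    seq_len, emb_dim = embeddings.shape
    pooled = []
    for window, w in filters.items():
        n_filters = w.shape[0]
        # Slide each filter over the token sequence (valid convolution).
        responses = np.empty((seq_len - window + 1, n_filters))
        for t in range(seq_len - window + 1):
            patch = embeddings[t:t + window]           # (window, emb_dim)
            responses[t] = np.tensordot(w, patch, axes=([1, 2], [0, 1]))
        responses = np.maximum(responses, 0.0)         # ReLU
        pooled.append(responses.max(axis=0))           # max-over-time pooling
    return np.concatenate(pooled)                      # (total n_filters,)

# Stand-in for BERT output: 20 tokens, 768-dim embeddings.
rng = np.random.default_rng(0)
emb = rng.standard_normal((20, 768))
# Assumed configuration: 4 filters each for window sizes 2, 3, 4.
filters = {w: rng.standard_normal((4, w, 768)) * 0.01 for w in (2, 3, 4)}
feat = text_cnn_features(emb, filters)
print(feat.shape)  # (12,)
```

The resulting fixed-length vector would play the role of the discriminative text feature matched against the image feature; in the paper's setting the random embeddings would be replaced by actual BERT outputs.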
Author: MO Chengjian (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)
Source: Video Engineering (《电视技术》), 2022, No. 4, pp. 25-30 (6 pages)
Keywords: cross-modality; feature extraction; local features