
Cross-Modal Hash Retrieval Based on Transformer Generative Adversarial Networks
Abstract: Considering the advantage of Generative Adversarial Networks in preserving the manifold structure of cross-modal data, and combining the Transformer's self-attention mechanism with its freedom from convolution, a cross-modal hash retrieval algorithm based on a Transformer Generative Adversarial Network is proposed. First, a Vision Transformer is pre-trained on the ImageNet dataset and used as the backbone network for image feature extraction. Then, the features of each modality are split into shared features and private features. Next, an adversarial learning module is constructed to reduce the distribution distance and preserve the semantic consistency of the shared features across modalities, while increasing the distribution distance and the semantic inconsistency of the private features. Finally, the common feature representation is mapped into compact hash codes to achieve cross-modal hash retrieval. Experimental results show that the proposed algorithm outperforms the comparison algorithms on public datasets.
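The final step described in the abstract, mapping a common feature representation to compact hash codes and ranking retrieval results by code distance, can be illustrated with a minimal sketch. This is not the paper's implementation; the sign-based binarization and Hamming-distance ranking shown here are standard choices in hashing-based retrieval, and all function names are hypothetical:

```python
# Minimal illustration of the hashing/retrieval step: a real-valued joint
# feature vector is binarized into a compact hash code, and retrieval ranks
# database items by Hamming distance to the query's code.

def to_hash_code(features):
    """Binarize a real-valued feature vector with the sign function."""
    return [1 if f >= 0 else 0 for f in features]

def hamming_distance(code_a, code_b):
    """Count the differing bits between two hash codes."""
    return sum(a != b for a, b in zip(code_a, code_b))

def retrieve(query_features, database_features):
    """Return database indices sorted by Hamming distance to the query."""
    query_code = to_hash_code(query_features)
    db_codes = [to_hash_code(f) for f in database_features]
    return sorted(range(len(db_codes)),
                  key=lambda i: hamming_distance(query_code, db_codes[i]))

# Toy example: an "image" query against a small "text" database.
query = [0.8, -0.3, 0.5, -0.9]          # continuous joint features
database = [[0.7, -0.1, 0.4, -0.6],     # same sign pattern -> distance 0
            [-0.5, 0.2, -0.3, 0.8],     # opposite pattern  -> distance 4
            [0.1, -0.4, -0.2, -0.7]]    # partial match     -> distance 1
print(retrieve(query, database))        # -> [0, 2, 1]
```

Because the codes are binary, Hamming distance can be computed with cheap bitwise operations at scale, which is the practical motivation for mapping features to compact hash codes in cross-modal retrieval.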
Authors: LEI Lei; XU Li-ming (School of Computer and Software, Nanyang Institute of Technology, Nanyang 473004, China; School of Computer Science, China West Normal University, Nanchong 637002, China)
Source: Journal of Nanyang Institute of Technology, 2024, No. 4, pp. 38-44 (7 pages)
Funding: Henan Province Science and Technology Research Project (212102210492); Nanyang City Science and Technology Research Project (KJGG102).
Keywords: Transformer; generative adversarial network; cross-modal retrieval; hash coding; semantic preservation