
Cross-Modal Hash Retrieval Based on Transformer Generative Adversarial Networks
Abstract: Considering the advantage of Generative Adversarial Networks in preserving the manifold structure of cross-modal data, and combining the Transformer's self-attention mechanism with its freedom from convolution, a cross-modal hash retrieval algorithm based on a Transformer Generative Adversarial Network is proposed. First, a Vision Transformer is pre-trained on the ImageNet dataset and used as the backbone network for image feature extraction. Then, the features of each modality are split into shared features and private features. Next, an adversarial learning module is constructed to reduce the distribution distance and preserve the semantic consistency of the shared features across modalities, while increasing the distribution distance and the semantic inconsistency of the private features. Finally, the common feature representation is mapped into compact hash codes to achieve cross-modal hash retrieval. Experimental results show that the proposed algorithm outperforms the comparison algorithms on public datasets.
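The final step described in the abstract, mapping a common feature representation to compact hash codes and ranking retrieval results by code distance, can be illustrated with a minimal sketch. This is not the paper's implementation; the sign-based binarization and Hamming-distance ranking shown here are standard choices in hashing-based retrieval, and all function names are hypothetical:

```python
# Minimal illustration of the hashing/retrieval step: a real-valued joint
# feature vector is binarized into a compact hash code, and retrieval ranks
# database items by Hamming distance to the query's code.

def to_hash_code(features):
    """Binarize a real-valued feature vector with the sign function."""
    return [1 if f >= 0 else 0 for f in features]

def hamming_distance(code_a, code_b):
    """Count the differing bits between two hash codes."""
    return sum(a != b for a, b in zip(code_a, code_b))

def retrieve(query_features, database_features):
    """Return database indices sorted by Hamming distance to the query."""
    query_code = to_hash_code(query_features)
    db_codes = [to_hash_code(f) for f in database_features]
    return sorted(range(len(db_codes)),
                  key=lambda i: hamming_distance(query_code, db_codes[i]))

# Toy example: an "image" query against a small "text" database.
query = [0.8, -0.3, 0.5, -0.9]          # continuous joint features
database = [[0.7, -0.1, 0.4, -0.6],     # same sign pattern -> distance 0
            [-0.5, 0.2, -0.3, 0.8],     # opposite pattern  -> distance 4
            [0.1, -0.4, -0.2, -0.7]]    # partial match     -> distance 1
print(retrieve(query, database))        # -> [0, 2, 1]
```

Because the codes are binary, Hamming distance can be computed with cheap bitwise operations at scale, which is the practical motivation for mapping features to compact hash codes in cross-modal retrieval.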
Authors: LEI Lei; XU Li-ming (School of Computer and Software, Nanyang Institute of Technology, Nanyang 473004, China; School of Computer Science, China West Normal University, Nanchong 637002, China)
Source: Journal of Nanyang Institute of Technology, 2024, No. 4, pp. 38-44 (7 pages)
Funding: Henan Province Science and Technology Research Project (212102210492); Nanyang City Science and Technology Research Project (KJGG102).
Keywords: Transformer; generative adversarial network; cross-modal retrieval; hash coding; semantic preservation