
Cross-Modal Hash Retrieval Algorithm Based on CLIP and Attention Mechanism
Abstract: To address the low retrieval accuracy caused by traditional unsupervised cross-modal retrieval algorithms failing to fully extract the correlated semantics within and between samples, an unsupervised cross-modal hash retrieval algorithm based on CLIP and an attention fusion mechanism, CAFM_Net, is proposed. The multimodal pre-trained model CLIP is applied at the sample feature extraction stage to mine similarity information from different dimensions of the data; an attention fusion mechanism then processes the extracted features to strengthen the weights of salient regions; and the idea of adversarial learning is introduced to design a modal classifier, generating cross-modal hash codes that tend toward semantic consistency. Compared with existing representative hashing methods, CAFM_Net improves retrieval accuracy by at least 11% and 9% on multimodal retrieval tasks.
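The three stages summarized in the abstract (CLIP feature extraction, attention fusion over salient regions, and sign-binarized hash codes checked by an adversarial modal classifier) can be sketched roughly as follows. This is a minimal illustration, not the authors' CAFM_Net implementation: random vectors stand in for CLIP image/text features, and the layer shapes, the simple softmax attention, and the linear hash and classifier heads are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Stand-ins for CLIP features (in the paper these would come from the
# pretrained CLIP image/text encoders; 512 matches CLIP ViT-B/32's
# embedding size, chosen here only for concreteness).
batch, dim, bits = 4, 512, 64
img_feat = rng.standard_normal((batch, dim))
txt_feat = rng.standard_normal((batch, dim))

# Attention fusion (illustrative): score each feature dimension per
# sample and reweight it, strengthening salient components.
W_att = rng.standard_normal((dim, dim)) * 0.01
def attention_fuse(feat):
    weights = softmax(feat @ W_att)   # per-dimension attention weights
    return feat * weights             # reweighted ("salient") features

# Hash head: project to `bits` dimensions, squash with tanh (as a
# relaxation during training), binarize with sign for retrieval.
W_hash = rng.standard_normal((dim, bits)) * 0.01
def hash_codes(feat):
    h = np.tanh(attention_fuse(feat) @ W_hash)
    return np.sign(h)                 # entries in {-1, +1}

# Modal classifier (adversarial branch, illustrative): tries to tell
# image codes from text codes; training the encoders to fool it pushes
# both modalities toward one semantically consistent code space.
W_cls = rng.standard_normal((bits, 2)) * 0.01
def modality_logits(codes):
    return codes @ W_cls              # 2-way image-vs-text logits

img_codes = hash_codes(img_feat)
txt_codes = hash_codes(txt_feat)

# Retrieval ranks candidates by Hamming distance between binary codes:
# for ±1 codes, dot = bits - 2 * hamming.
hamming = (bits - img_codes @ txt_codes.T) / 2
print(img_codes.shape, hamming.shape)
```

In an actual training loop the `sign` step would be replaced by the `tanh` relaxation so gradients can flow, and the encoder and classifier losses would be optimized adversarially; the sketch only shows the data path.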
Authors: DANG Zhang-min, YU Chong-ren, YIN Shuang-fei, ZHANG Hong-juan, SHAN Zhen, MA Lian-zhi (Institute 706, Second Research Academy of China Aerospace Science and Industry Corporation, Beijing 100854, China; Military Representative Office, Second Research Academy of China Aerospace Science and Industry Corporation, Beijing 100854, China)
Source: Computer Engineering and Design (《计算机工程与设计》, Peking University Core Journal), 2024, No. 3, pp. 852-858 (7 pages)
Keywords: unsupervised hashing; cross-modal retrieval; CLIP; attention fusion; adversarial learning; deep learning; Transformer