
基于知识蒸馏的深度无监督离散跨模态哈希 (Cited by: 2)

Deep unsupervised discrete cross-modal hashing based on knowledge distillation
Abstract: Cross-modal hashing has attracted wide attention due to its low storage cost and high retrieval efficiency. Most existing cross-modal hashing methods require additional manual labels to provide inter-instance association information; however, the deep features learned by pre-trained deep unsupervised cross-modal hashing methods can provide the same kind of similarity information. Moreover, the discrete constraints are relaxed during hash-code learning, which causes a large quantization loss. To address these two problems, a Deep Unsupervised Discrete Cross-modal Hashing (DUDCH) method based on knowledge distillation was proposed. First, drawing on the idea of knowledge transfer in knowledge distillation, the latent association information of a pre-trained unsupervised teacher model was used to reconstruct a symmetric similarity matrix, replacing manual labels in training the supervised student model. Second, Discrete Cyclic Coordinate descent (DCC) was adopted to update the discrete hash codes iteratively, reducing the quantization loss between the real-valued hash codes learned by the neural network and the discrete hash codes. Finally, an end-to-end neural network was adopted as the teacher model and an asymmetric neural network was constructed as the student model, lowering the time complexity of the combined model. Experimental results on two commonly used benchmark datasets, MIRFLICKR-25K and NUS-WIDE, show that compared with Deep Joint-Semantics Reconstructing Hashing (DJSRH), the proposed method improves the mean Average Precision (mAP) on the image-to-text/text-to-image retrieval tasks by 2.83/0.70 percentage points and 6.53/3.95 percentage points on average, respectively, demonstrating its effectiveness in large-scale cross-modal data retrieval.
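The first step described in the abstract, reconstructing a symmetric similarity matrix from a pre-trained teacher model's deep features in place of manual labels, can be sketched as follows. This is a minimal illustration only: the random features, the 128-dimensional size, and the use of plain cosine similarity are assumptions, not the paper's exact construction.

```python
import numpy as np

# Hypothetical sketch: build a symmetric instance-similarity matrix from
# deep features extracted by a pre-trained (teacher) network. Random
# values stand in for real teacher features.
rng = np.random.default_rng(0)
features = rng.standard_normal((5, 128))   # 5 instances, 128-d deep features

# L2-normalize rows so the inner product equals cosine similarity.
normed = features / np.linalg.norm(features, axis=1, keepdims=True)
S = normed @ normed.T                      # symmetric similarity matrix

assert np.allclose(S, S.T)                 # symmetric by construction
assert np.allclose(np.diag(S), 1.0)        # self-similarity is 1
```

Because S is built from one shared feature matrix, symmetry holds by construction, which is what lets it substitute for a label-derived affinity matrix during student training.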
Authors: 张成 (ZHANG Cheng), 万源 (WAN Yuan), 强浩鹏 (QIANG Haopeng) — School of Science, Wuhan University of Technology, Wuhan, Hubei 430070, China
Source: Journal of Computer Applications (《计算机应用》, CSCD, Peking University Core), 2021, Issue 9, pp. 2523-2531 (9 pages)
Funding: Fundamental Research Funds for the Central Universities (2019IB010)
Keywords: cross-modal hashing; knowledge distillation; similarity matrix reconstruction; Discrete Cyclic Coordinate descent (DCC); asymmetric
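The Discrete Cyclic Coordinate descent (DCC) update named in the keywords and abstract keeps hash codes strictly in {-1, +1} by solving for one bit column at a time in closed form. The sketch below is an assumed, generic DCC instance for an illustrative objective ||Y - BW||_F², not the paper's exact loss; all matrix sizes and the random data are placeholders.

```python
import numpy as np

# Hypothetical DCC sketch: discrete codes B in {-1,+1}^(n x r) minimize
# ||Y - B W||_F^2 with W fixed, updating one bit column per step.
rng = np.random.default_rng(1)
n, r, c = 20, 8, 4                         # samples, code length, target dim
Y = rng.standard_normal((n, c))            # illustrative regression target
W = rng.standard_normal((r, c))            # fixed projection matrix
B = np.sign(rng.standard_normal((n, r)))   # initial discrete codes

for _ in range(10):                        # cyclic sweeps over the r bits
    for k in range(r):
        # Residual of Y with bit k's contribution removed; the optimal
        # bit column is then sign(residual @ W[k]) in closed form.
        residual = Y - B @ W + np.outer(B[:, k], W[k])
        B[:, k] = np.where(residual @ W[k] >= 0, 1.0, -1.0)

assert set(np.unique(B)) <= {-1.0, 1.0}    # no relaxation: codes stay discrete
```

Because each bit column is re-solved exactly while the others are held fixed, the objective never increases across sweeps, and no continuous relaxation (hence no extra quantization loss) is introduced.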