
Progress of Cross-modal Retrieval Methods Based on Representation Learning (cited by: 2)
Abstract: The rapid growth of multi-modal data has created application demand for cross-modal retrieval and has driven research on cross-modal retrieval methods. This paper traces recent progress in the field and surveys cross-modal retrieval methods based on representation learning, both in China and abroad. It defines the cross-modal retrieval problem and reviews the field's common techniques, mainstream models, common datasets, evaluation methods, and main challenges. Representation-learning-based cross-modal retrieval methods are introduced from three perspectives (statistical correlation analysis, graph regularization, and metric learning), and their advantages and disadvantages are analyzed. To compare these methods, 14 of them are reproduced on four datasets for comparative evaluation. The experimental results show that methods based on statistical correlation analysis train efficiently and are easy to implement; methods based on graph regularization establish semantic associations by mining intra-modal and inter-modal similarity; and methods based on metric learning preserve, as far as possible, the semantic similarity and dissimilarity information of the data in a common subspace. In summary, this paper presents the current state of research on cross-modal retrieval methods based on representation learning and provides a reference for further research on cross-modal retrieval.
Authors: DU Jinfeng (杜锦丰), WANG Hairong (王海荣), LIANG Huan (梁焕), WANG Dong (王栋). Department of Computer Science and Engineering, North Minzu University, Yinchuan, Ningxia 750021, China.
Source: Journal of Guangxi Normal University: Natural Science Edition (《广西师范大学学报(自然科学版)》), indexed in CAS and the Peking University core journal list (北大核心), 2022, No. 3, pp. 1-12 (12 pages).
Funding: Ningxia Natural Science Foundation (2020AAC03218); Ningxia provincial-level cultivation project (PY1906); Ningxia talent project (KJT2019002).
Keywords: multi-modal data; cross-modal retrieval; statistical correlation analysis; graph regularization; metric learning
