An End-to-End Cross-Media Retrieval Method Based on Pre-Trained Representation Learning

Abstract: Cross-media retrieval combines representation learning for each media type with alignment of information across media. Existing cross-media representation learning methods do not integrate state-of-the-art techniques for single-media representation learning, and they lack effective semantic alignment among high-level cross-media representations. This paper proposes an end-to-end cross-media retrieval method based on pre-trained representation learning. The method uses a residual network (ResNet) to extract high-level image features and a Bidirectional Encoder Representations from Transformers (BERT) model to extract high-level text features, then applies a self-attention mechanism to mine semantic associations in the cross-media data and so achieve semantic alignment of cross-media information. Mean average precision (MAP) is used as the evaluation metric, and the model is validated on three widely used cross-media datasets. Experimental results show that the proposed method achieves higher MAP than the compared methods on all three datasets.
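The abstract names mean average precision (MAP) as the evaluation metric but does not spell out how it is computed. The following is a minimal sketch of one common formulation, not the authors' evaluation code. It assumes that a retrieved item counts as relevant when it shares the query's class label, and that each query comes with a full ranked list of retrieved labels; the function names `average_precision` and `mean_average_precision` are illustrative only.

```python
from typing import Sequence


def average_precision(query_label: int, ranked_labels: Sequence[int]) -> float:
    """Average precision for one query.

    Assumption: an item is relevant when its label equals the query label.
    AP is the mean of precision@k taken at every rank k that holds a
    relevant item.
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank
    # A query with no relevant item in the list contributes 0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    query_labels: Sequence[int],
    ranked_label_lists: Sequence[Sequence[int]],
) -> float:
    """Mean of the per-query average precision values."""
    aps = [
        average_precision(q, ranked)
        for q, ranked in zip(query_labels, ranked_label_lists)
    ]
    return sum(aps) / len(aps) if aps else 0.0


# Toy example: two queries, each with four retrieved items (labels only).
toy_query_labels = [0, 1]
toy_rankings = [
    [0, 1, 0, 1],  # relevant at ranks 1 and 3
    [0, 1, 1, 0],  # relevant at ranks 2 and 3
]
toy_map = mean_average_precision(toy_query_labels, toy_rankings)
```

In a cross-media setting this would typically be computed once per retrieval direction (image query against text, and text query against image), with each ranking obtained by sorting the other modality's items by similarity to the query embedding.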

Authors: LIU Tongtong; QU Dan (Information Engineering University, Zhengzhou 450001, China)

Affiliation: Information Engineering University

Source: Journal of Information Engineering University, 2022, No. 5, pp. 563-569 (7 pages)

Funding: National Natural Science Foundation of China (62171470, 61673395).

Keywords: cross-media retrieval; representation learning; ResNet; BERT; self-attention

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

