
VLP: A Survey on Vision-language Pre-training (Cited by: 3)

Abstract: In the past few years, the emergence of pre-training models has brought uni-modal fields such as computer vision (CV) and natural language processing (NLP) into a new era. Substantial work has shown that they are beneficial for downstream uni-modal tasks and avoid training a new model from scratch. So can such pre-trained models be applied to multi-modal tasks? Researchers have explored this problem and made significant progress. This paper surveys recent advances and new frontiers in vision-language pre-training (VLP), including image-text and video-text pre-training. To give readers a better overall grasp of VLP, we first review its recent advances in five aspects: feature extraction, model architecture, pre-training objectives, pre-training datasets, and downstream tasks. Then, we summarize the specific VLP models in detail. Finally, we discuss the new frontiers in VLP. To the best of our knowledge, this is the first survey focused on VLP. We hope that this survey can shed light on future research in the VLP field.
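The abstract lists pre-training objectives as one of the five aspects the survey reviews. One widely used VLP objective is image-text contrastive learning (the CLIP-style symmetric InfoNCE loss, where matched image-text pairs in a batch are pulled together and mismatched pairs pushed apart). Below is a minimal NumPy sketch of that idea; the function name, embedding shapes, and temperature value are illustrative assumptions, not code or settings from the paper:

```python
import numpy as np

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of matched image/text embeddings.

    image_emb, text_emb: arrays of shape (batch, dim), where row i of each
    array comes from the same image-text pair.
    """
    # L2-normalize so the dot product is cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # logits[i, j] = similarity between image i and text j.
    logits = image_emb @ text_emb.T / temperature
    n = logits.shape[0]
    diag = np.arange(n)  # matched pairs lie on the diagonal

    def cross_entropy(l):
        # Numerically stable log-softmax over each row, then pick the
        # log-probability of the correct (diagonal) match.
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[diag, diag].mean()

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

When the two embedding batches are perfectly aligned, the diagonal similarities dominate and the loss approaches zero; with unrelated embeddings it stays near log(batch size).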
Source: Machine Intelligence Research (EI, CSCD), 2023, No. 1, pp. 38-56 (19 pages).
Funding: Supported by the Key Research Program of the Chinese Academy of Sciences (No. ZDBSSSW-JSC006) and the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDA27030300).