A Survey of Vision and Language Related Multi-Modal Task
Authors: Lanxiao Wang, Wenzhe Hu, Heqian Qiu, Chao Shang, Taijin Zhao, Benliu Qiu, King Ngi Ngan, Hongliang Li. CAAI Artificial Intelligence Research, 2022, Issue 2, pp. 111-136 (26 pages)
Following significant breakthroughs in single-modal deep learning tasks, a growing body of work has turned to multi-modal tasks. Multi-modal tasks involve more than one modality, where a modality represents a type of behavior or state. Common multi-modal information includes vision, hearing, language, touch, and smell. Vision and language are two of the most common modalities in daily human life, and many typical multi-modal tasks focus on these two modalities, such as visual captioning and visual grounding. In this paper, we conduct in-depth research on typical vision and language tasks from the perspectives of generation, analysis, and reasoning. First, we analyze and summarize the typical tasks and some classical methods, grouped by their algorithmic concerns, and discuss frequently used datasets and metrics. Then, we briefly summarize other variant tasks and cutting-edge tasks to build a more comprehensive framework of vision and language related multi-modal tasks. Finally, we discuss the development of pre-training related research and give an outlook for future research. We hope this survey helps researchers understand the latest progress, existing problems, and exploration directions of vision and language multi-modal tasks, and provides guidance for future research.
Keywords: deep learning; vision and language; multi-modal generation; multi-modal analysis; multi-modal reasoning; pre-training