
Survey on Multimodal Information Processing and Fusion Based on Modal Categories
Abstract With the continuing advance of artificial intelligence and deep learning, research on multimodal information processing and fusion has attracted wide attention. This paper summarizes the development history and milestone works of multimodal information processing, along with the strategies and models used for multimodal fusion. Mainstream datasets for multimodal information processing and fusion are classified and summarized by modality. Using modality type as the classification criterion, the paper systematically reviews research progress in the field, emphasizing the distinctions between modalities. Multimodal information processing and fusion is divided into four categories: audio-visual processing and fusion, audio-text processing and fusion, visual-text processing and fusion, and visual-audio-text processing and fusion. The processing and fusion methods and models for each combination of input modalities are examined in detail. Finally, the paper summarizes the development of multimodal processing and fusion and gives an outlook on future work.
Authors HUANG Wendong (黄文栋); WANG Yifan (王怡凡) (College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, China)
Source Computer and Modernization (《计算机与现代化》), 2024, No. 7, pp. 47-62 (16 pages)
Funding Supported by the Natural Science Foundation of Shandong Province (ZR202211180156).
Keywords multimodal processing; multimodal information processing; multimodal fusion; deep learning
