Abstract
This paper proposes a text-guided hierarchical adaptive fusion method for multimodal sentiment analysis, which uses text modality information as guidance to achieve hierarchical adaptive screening and fusion of multimodal information. First, the importance information between each pair of modalities is represented through a cross-modal attention mechanism; then, hierarchical adaptive fusion based on the multimodal importance information is realized through a multimodal adaptive gating mechanism; finally, the multimodal features and the modality importance information are combined to perform multimodal sentiment analysis. Experimental results on the public datasets MOSI and MOSEI show that, compared with baseline models, the proposed method improves accuracy and F1 score by 0.76% and 0.7%, respectively.
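The pipeline summarized above (text-guided cross-modal attention over each non-text modality, followed by an adaptive gate that weights the modality features before classification) can be illustrated with a minimal sketch. The code below assumes PyTorch and pre-extracted unimodal sequence features; the class name TextGuidedFusion, the head counts, dimensions, and single gating layer are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): text queries attend to audio/visual
# sequences, and a gating network weights the attended features before fusion.
import torch
import torch.nn as nn

class TextGuidedFusion(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        # Cross-modal attention: text as query, audio/visual as key and value.
        self.attn_ta = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.attn_tv = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Gating network: estimates how much audio/visual information to admit.
        self.gate = nn.Sequential(nn.Linear(3 * dim, 2), nn.Sigmoid())
        self.classifier = nn.Linear(3 * dim, 1)  # sentiment score head

    def forward(self, text, audio, visual):
        # text/audio/visual: (batch, seq_len, dim) unimodal representations
        ta, _ = self.attn_ta(text, audio, audio)      # text-guided audio features
        tv, _ = self.attn_tv(text, visual, visual)    # text-guided visual features
        t, a, v = text.mean(1), ta.mean(1), tv.mean(1)  # temporal pooling
        g = self.gate(torch.cat([t, a, v], dim=-1))   # per-modality importance gates
        fused = torch.cat([t, g[:, :1] * a, g[:, 1:] * v], dim=-1)
        return self.classifier(fused)

# Toy usage: identical random features stand in for the three modalities.
feats = torch.randn(8, 20, 128)
model = TextGuidedFusion(128)
score = model(feats, feats, feats)  # (8, 1) sentiment predictions
```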
Authors
卢婵
郭军军
谭凯文
相艳
余正涛
Chan LU; Junjun GUO; Kaiwen TAN; Yan XIANG; Zhengtao YU (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, Yunnan, China)
Source
《山东大学学报(理学版)》
CAS
CSCD
Peking University Core Journals (PKU Core)
2023, No. 12, pp. 31-40, 51 (11 pages)
Journal of Shandong University (Natural Science)
Funding
National Key Research and Development Program of China (2020AAA0107904)
National Natural Science Foundation of China (62366025, 62241604)
Basic Research Program (General Project) of the Yunnan Provincial Department of Science and Technology (202301AT070444)
Keywords
multimodal sentiment analysis
multimodal fusion
attention mechanism
gating network