Abstract
In video-based multimodal sentiment analysis, a single attention mechanism is typically applied at a single semantic level to capture features, without accounting for how inter-modal interactive fusion affects sentiment classification differently across levels, which leads to insufficient extraction of inter-modal fusion features. To address this problem, this paper proposes a hierarchical interactive fusion network based on the attention mechanism (HFN-AM) for multimodal sentiment analysis. First, bidirectional gated recurrent units capture the temporal information within each modality. Then, an interactive fusion strategy combining a gating-based attention mechanism and an improved self-attention mechanism extracts features at the sentence level and the document level, respectively. Furthermore, an adaptive weight allocation module determines the sentiment contribution of each modality, and the final classification result is obtained through a fully connected layer and a Softmax layer. Experimental results on the public CMU-MOSI and CMU-MOSEI datasets show that the proposed model effectively improves sentiment classification accuracy and F1 score on both datasets.
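The abstract describes the model only at the architectural level. As a rough illustration of how such a pipeline could be wired together, the following is a minimal PyTorch-style sketch; all module names, dimensions, and fusion details (gating the audio and visual streams against text, mean pooling after self-attention, softmax-normalized modality weights) are assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical sketch of the HFN-AM pipeline described in the abstract.
# All module names, dimensions, and fusion details are assumptions;
# the paper's exact formulation is not reproduced here.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedAttentionFusion(nn.Module):
    """Sentence-level fusion: gate one modality's features against another's."""

    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, x, y):
        g = torch.sigmoid(self.gate(torch.cat([x, y], dim=-1)))
        return g * x + (1 - g) * y


class HFNAM(nn.Module):
    """Assumed structure: BiGRU encoders -> gated attention (sentence level)
    -> self-attention (document level) -> adaptive modality weights -> FC + Softmax."""

    def __init__(self, in_dims, hidden=64, num_classes=2):
        super().__init__()
        # One BiGRU per modality to capture intra-modal temporal information.
        self.encoders = nn.ModuleList(
            [nn.GRU(d, hidden, batch_first=True, bidirectional=True) for d in in_dims]
        )
        self.sent_fusion = GatedAttentionFusion(2 * hidden)
        self.doc_attn = nn.MultiheadAttention(2 * hidden, num_heads=4, batch_first=True)
        self.modal_weights = nn.Linear(2 * hidden, 1)   # adaptive contribution weights
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, text, audio, vision):
        # Assumes aligned sequences of equal length across the three modalities.
        enc = [gru(x)[0] for gru, x in zip(self.encoders, (text, audio, vision))]
        # Sentence-level interactive fusion of audio/vision with text via gating.
        fused = torch.stack(
            [self.sent_fusion(enc[0], enc[1]),
             self.sent_fusion(enc[0], enc[2]),
             enc[0]],
            dim=1,
        )                                                # (batch, 3, seq, dim)
        b, m, t, d = fused.shape
        flat = fused.view(b * m, t, d)
        # Document-level refinement with self-attention over each fused stream.
        refined, _ = self.doc_attn(flat, flat, flat)
        pooled = refined.mean(dim=1).view(b, m, d)       # (batch, 3, dim)
        # Adaptive weighting of each stream's sentiment contribution.
        w = F.softmax(self.modal_weights(pooled), dim=1) # (batch, 3, 1)
        rep = (w * pooled).sum(dim=1)
        return F.log_softmax(self.classifier(rep), dim=-1)
```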
Authors
LI Wenxue (李文雪), GAN Chenquan (甘臣权)
School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
Source
Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition) (《重庆邮电大学学报(自然科学版)》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2023, Issue 1, pp. 176-184 (9 pages)
Funding
National Natural Science Foundation of China (61702066, 61903056)
Key Science and Technology Research Project of Chongqing Municipal Education Commission (KJZD-M201900601)
Keywords
multimodal sentiment analysis
attention mechanism
hierarchical interactive fusion