Abstract
Multimodal sentiment analysis currently suffers from insufficient feature extraction within individual modalities and from a lack of stability in data-fusion methods. This study proposes a method that uses interpolation to optimize modal features and thereby address these problems. First, interpolation is used to optimize the way the BERT and GRU models extract features, and the two models are applied to mine information from text, audio, and video. Second, an improved attention mechanism fuses the text, audio, and video information, achieving more stable modal fusion. The method is evaluated on the MOSI and MOSEI datasets. The experimental results show that, by optimizing modal features, interpolation improves the accuracy of multimodal sentiment analysis, which verifies its effectiveness.
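The two ideas named in the abstract can be illustrated in code. The sketch below is not the paper's implementation: it assumes simple linear interpolation between feature vectors as the "interpolation" step, and a norm-based softmax weighting as a stand-in for the improved attention mechanism; all function names, shapes, and parameters are hypothetical.

```python
import numpy as np

def interpolate_features(a, b, alpha=0.5):
    """Linearly interpolate between two modality feature vectors.

    A stand-in for the interpolation-based feature optimization
    described in the abstract; alpha is an assumed mixing weight."""
    return alpha * a + (1.0 - alpha) * b

def attention_fuse(modalities):
    """Fuse modality feature vectors with softmax attention weights.

    Here each vector's norm serves as its attention score, a
    placeholder for a learned scoring network."""
    scores = np.array([np.linalg.norm(m) for m in modalities])
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()
    return sum(w * m for w, m in zip(weights, modalities))

# Toy example: 4-dimensional text/audio/video features.
text = np.ones(4)
audio = np.zeros(4)
video = np.full(4, 0.5)
text_smoothed = interpolate_features(text, audio, alpha=0.8)
fused = attention_fuse([text_smoothed, audio, video])
```

In practice the text features would come from BERT and the audio/video features from GRU encoders, and the attention weights would be learned rather than derived from vector norms.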
Authors
TANG Ye-Kai; FENG Guang; YANG Fang-Jie; LIN Hao-Ze
(School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China; School of Automation, Guangdong University of Technology, Guangzhou 510006, China)
Source
Computer Systems & Applications (《计算机系统应用》)
2024, No. 10, pp. 255-262 (8 pages)
Funding
Key Program of the National Natural Science Foundation of China (62237001)
Youth Project of Guangdong Provincial Philosophy and Social Science Planning (GD23YJY08)
Keywords
interpolation
feature extraction
attention mechanism
modal fusion
sentiment analysis