Abstract
To address the problems that multi-modal fusion performs poorly and that key affective information from specific time periods and multiple views cannot be fully mined, this paper proposes a multi-view temporal multi-modal sentiment classification model for extracting key affective information from multiple views within specific time periods. First, the model performs low-dimensional word embedding and sequence representation on data from two textual views, the title and the body text, to extract multi-modal temporal features from the different views, and performs feature extraction on data from two image views, cropping and horizontal mirroring. Second, it uses a recurrent neural network to build temporal interaction features across the multi-modal sequences, increasing the mutual information between modalities. Finally, the model is jointly trained with a contrastive-learning objective to complete sentiment classification. Evaluated on two multi-modal sentiment classification benchmark datasets, Yelp and Multi-ZOL, the model achieves accuracies of 73.92% and 69.15%, respectively. Comprehensive experiments show that multi-view multi-modal sentence sequences over specific time periods improve model performance.
Authors
Tao Quanhui, An Junxiu, Dai Yurui, Chen Hongsong, Huang Ping
(School of Software Engineering, Chengdu University of Information Technology, Chengdu 610225, China; School of Management, Chengdu University of Information Technology, Chengdu 610225, China)
Source
《计算机应用研究》 (Application Research of Computers)
CSCD
Peking University Core Journal (北大核心)
2023, No. 1, pp. 102-106 (5 pages)
Funding
National Natural Science Foundation of China (71673032)
Sichuan Province Social Science High-level Team Fund (2015Z177)
Keywords
sentiment classification
multimodality
multi-view
temporal features
contrastive learning