Abstract
This paper proposes a static multimodal sentiment classification model based on the Pre-LN Transformer. The model first extracts semantic features from review text with the encoder of the Pre-LN Transformer, whose multi-head self-attention mechanism allows the model to learn relevant sentiment information in different subspaces. Image features of the review are then extracted with ResNet; on the basis of feature-level fusion, a visual aspect attention mechanism guides the sentiment classification of the text, realizing static multimodal sentiment analysis of online reviews. Experimental results on the Yelp dataset show that the proposed model improves classification accuracy by 1.34% and 1.10% over the BiGRU-mVGG and Trans-mVGG models, respectively, verifying the effectiveness and feasibility of the method.
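The pipeline described in the abstract (Pre-LN Transformer text encoder, ResNet image features, visual-aspect attention over tokens, feature-level fusion) can be sketched roughly as below. This is a minimal illustration, not the paper's implementation: the layer sizes, the single attention query, and the use of a precomputed 512-d ResNet feature vector are all assumptions for demonstration.

```python
import torch
import torch.nn as nn

class VisualAspectAttentionClassifier(nn.Module):
    """Hypothetical sketch of the described pipeline: a Pre-LN Transformer
    encoder for review text, fused with precomputed ResNet image features
    through a visual-aspect attention over the text tokens."""

    def __init__(self, vocab_size=1000, d_model=64, img_dim=512, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, dim_feedforward=128,
            batch_first=True, norm_first=True)  # norm_first=True -> Pre-LN variant
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.img_proj = nn.Linear(img_dim, d_model)      # project ResNet features
        self.classifier = nn.Linear(d_model * 2, n_classes)  # feature-level fusion

    def forward(self, token_ids, img_feat):
        h = self.encoder(self.embed(token_ids))          # (B, T, d) token states
        q = self.img_proj(img_feat)                      # (B, d) visual query
        scores = torch.softmax((q.unsqueeze(1) * h).sum(-1), dim=-1)  # (B, T)
        text_repr = (scores.unsqueeze(-1) * h).sum(1)    # image-guided text vector
        fused = torch.cat([text_repr, q], dim=-1)        # concatenate modalities
        return self.classifier(fused)                    # (B, n_classes) logits

model = VisualAspectAttentionClassifier()
logits = model(torch.randint(0, 1000, (2, 10)), torch.randn(2, 512))
print(logits.shape)  # torch.Size([2, 2])
```

In this sketch the projected image vector acts as the attention query, so tokens that align with the visual content receive higher weight before fusion, which is one common way to realize "visual aspect attention" at the feature level.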
Authors
WANG Kaixin; XU Xiujuan; LIU Yu; ZHAO Zhehuan; ZHAO Xiaowei (School of Software Technology, Dalian University of Technology, Dalian 116620, Liaoning, China; Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian University of Technology, Dalian 116620, Liaoning, China)
Source
《应用科学学报》
CAS
CSCD
Peking University Core Journal (北大核心)
2022, Issue 1, pp. 25-35 (11 pages)
Journal of Applied Sciences
Funding
Supported by the National Natural Science Foundation of China (No. 61672128)
Keywords
sentiment analysis
static multimodal
online reviews
visual aspect attention