Abstract
Confusion among similar emotion categories has long degraded performance in multimodal emotion recognition. To address this problem, a relational graph neural network model based on clustering group normalization is proposed. First, three different feature extractors are used to extract features for three modalities; speaker encodings are incorporated and the features are concatenated, which enriches the feature representation while preserving the original information. Second, contextual information is extracted with a Transformer. Finally, the feature nodes are fed into a relational graph convolutional network, after which the nodes are clustered into groups and each group is normalized independently, making similar nodes more similar and alleviating the confusion between similar emotions. Experiments show that the proposed model reaches an F1-score of 86.34% on four-class classification on the IEMOCAP dataset, which verifies the effectiveness of the method; the authors report that this is currently the best performance on this dataset.
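The final step of the abstract (cluster the nodes, then normalize each group independently) can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny k-means routine, the per-feature mean and variance statistics, the `eps` constant, the function names, and the toy node values are all assumptions made for illustration, and the paper's actual clustering method and normalization formula may differ.

```python
import math

def kmeans(points, k, iters=20):
    """Tiny deterministic k-means: centroids start at the first k points (assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each node to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_group_norm(points, labels, eps=1e-5):
    """Normalize each cluster independently, using per-feature mean and variance."""
    out = [None] * len(points)
    for c in set(labels):
        idx = [i for i, l in enumerate(labels) if l == c]
        cols = list(zip(*(points[i] for i in idx)))
        means = [sum(col) / len(col) for col in cols]
        variances = [sum((x - m) ** 2 for x in col) / len(col) for col, m in zip(cols, means)]
        for i in idx:
            out[i] = [(x - m) / math.sqrt(v + eps) for x, m, v in zip(points[i], means, variances)]
    return out

# Toy node features standing in for graph-network outputs (hypothetical values).
nodes = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [5.0, 5.2], [5.3, 4.9], [4.8, 5.1]]
labels = kmeans(nodes, k=2)
normalized = cluster_group_norm(nodes, labels)
```

In this sketch the statistics are computed only over the nodes that share a cluster, so one group's features do not influence how another group is rescaled. In a trained model the step would typically also carry learnable scale and shift parameters, which are omitted here.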
Authors
罗奇
苟刚
LUO Qi; GOU Gang (State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, Guizhou, China; College of Computer Science and Technology, Guizhou University, Guiyang 550025, Guizhou, China)
Source
《山东大学学报(理学版)》
CAS
CSCD
PKU Core (北大核心)
2024, No. 7, pp. 105-112 (8 pages)
Journal of Shandong University(Natural Science)
Funding
National Natural Science Foundation of China (62162010)
Guizhou Provincial Science and Technology Support Program (黔科合支撑[2022]一般267)
Keywords
graph neural network
feature fusion
group normalization
clustering
conversational emotion recognition