Multimodal Conversation Emotion Recognition Based on Clustering and Group Normalization


Abstract: Confusion among similar emotion categories, which degrades recognition performance, has long been a major challenge in multimodal emotion recognition. To address this problem, a relational graph neural network model based on clustering and group normalization is proposed. First, three different feature extractors extract features for three modalities; speaker encodings are incorporated and the results are concatenated, which enriches the feature representation while preserving the original information. Second, a Transformer extracts contextual information. Finally, the feature nodes are fed into a relational graph convolutional network, after which the nodes are clustered into groups and each group is normalized independently, making similar nodes more similar and alleviating the confusion between similar emotions. In experiments, the proposed model reaches an F1-score of 86.34% on the four-class IEMOCAP task, which supports the effectiveness of the method; the authors report that this is currently the best performance on the IEMOCAP dataset.
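The final step of the abstract (cluster the nodes, then normalize each group independently) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm or the exact normalization formula, so the plain k-means routine, the per-cluster standardization, and the names `kmeans`, `cluster_group_norm`, `k`, and `eps` are all assumptions chosen for illustration.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain k-means (assumed clustering method); returns a cluster label per row of x."""
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct rows of x.
    centers = x[rng.choice(len(x), size=k, replace=False)].astype(float)
    labels = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        # Squared Euclidean distance from every node to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = x[labels == c]
            if len(members) > 0:
                centers[c] = members.mean(axis=0)
    return labels


def cluster_group_norm(x, k, eps=1e-5):
    """Cluster node features, then standardize each cluster independently.

    x: array of shape (num_nodes, feature_dim), e.g. node outputs of a graph layer.
    Returns the normalized features and the cluster labels.
    """
    labels = kmeans(x, k)
    out = np.empty_like(x, dtype=float)
    for c in range(k):
        mask = labels == c
        if not mask.any():
            continue  # empty cluster: no rows to normalize
        group = x[mask]
        mean = group.mean(axis=0, keepdims=True)
        var = group.var(axis=0, keepdims=True)
        # Per-feature zero mean and unit variance within this cluster only.
        out[mask] = (group - mean) / np.sqrt(var + eps)
    return out, labels
```

For example, `cluster_group_norm(node_features, k=4)` would group the nodes into four clusters and standardize each cluster with its own statistics, so that the statistics of one group do not affect the normalization of another.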

Authors: LUO Qi; GOU Gang (State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, Guizhou, China; College of Computer Science and Technology, Guizhou University, Guiyang 550025, Guizhou, China)

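Affiliations: State Key Laboratory of Public Big Data, Guizhou University; College of Computer Science and Technology, Guizhou University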

Source: Journal of Shandong University (Natural Science) (《山东大学学报（理学版）》); indexed in CAS, CSCD, and the Peking University Core Journals list; 2024, No. 7, pp. 105-112 (8 pages)

Funding: National Natural Science Foundation of China (62162010); Guizhou Provincial Science and Technology Support Program (黔科合支撑[2022]一般267).

Keywords: graph neural network; feature fusion; group normalization; clustering; conversation emotion recognition

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

