Emotion Recognition in Conversations Based on Discourse Parsing and Graph Attention Network (基于语篇解析和图注意力网络的对话情绪识别)


Abstract: Research on emotion recognition in conversations (ERC) focuses mainly on how conversational context and speaker modeling interact. Current work usually ignores the dependency relationships within a conversation, so utterances are only loosely connected to their context and the relationships between speakers lack logical structure. To address this, an ERC model based on discourse parsing and graph attention network (DPGAT) is proposed. It integrates the dependency relationships within a conversation into context modeling, so that the contextual information is more dependency-aware and global. First, discourse parsing is used to obtain the dependency relationships between utterances, from which a discourse dependency graph and speaker relationship graphs are constructed. Next, the different types of speaker relationship graphs are fused internally by a multi-head attention mechanism. In addition, on top of the graph attention network, cyclic learning is combined with the dependency relationships to fuse contextual information and speaker information effectively, which realizes the external fusion of the conversational context. Finally, the results of internal and external fusion are analyzed to restore the complete conversational context, and the speaker's emotion is predicted. Evaluated on the English datasets MELD, EmoryNLP, and DailyDialog and the Chinese dataset M3ED, the model achieves F1 scores of 66.23%, 40.03%, 59.28%, and 52.77%, respectively. Compared with mainstream models, the proposed model at least matches the state of the art and shows good applicability across different language scenarios.
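The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names: building graphs from discourse dependency links, and aggregating utterance features with a graph attention layer. Everything here is an assumption made for illustration, not the DPGAT implementation. The function names (`build_graphs`, `gat_layer`), the hand-written dependency links (a real system would take them from a discourse parser), the split of speaker relationships into same-speaker and cross-speaker edges, and the toy weights are all hypothetical. The attention step follows the standard single-head graph attention formulation, which may differ from the layer used in the paper.

```python
import math

def build_graphs(speakers, links):
    """Build adjacency matrices from (head, child) utterance links.

    Assumption: a child utterance attends to its head, and speaker relations
    are split into same-speaker and cross-speaker edges.
    """
    n = len(speakers)
    dep = [[0.0] * n for _ in range(n)]
    same = [[0.0] * n for _ in range(n)]
    cross = [[0.0] * n for _ in range(n)]
    for head, child in links:
        dep[child][head] = 1.0
        target = same if speakers[head] == speakers[child] else cross
        target[child][head] = 1.0
    for i in range(n):
        dep[i][i] = 1.0  # self-loop so every utterance keeps its own features
    return dep, same, cross

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_layer(H, adj, W, a):
    """Single-head graph attention: softmax over LeakyReLU(a . [W h_i || W h_j])."""
    Z = [matvec(W, h) for h in H]
    out = []
    for i, zi in enumerate(Z):
        nbrs = [j for j in range(len(Z)) if adj[i][j] > 0]
        if not nbrs:  # isolated node: pass the projected features through
            out.append(list(zi))
            continue
        scores = [leaky_relu(sum(p * q for p, q in zip(a, zi + Z[j]))) for j in nbrs]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        out.append([sum(al * Z[j][k] for al, j in zip(alphas, nbrs))
                    for k in range(len(zi))])
    return out

# Toy dialogue: four utterances, two speakers, hypothetical dependency links.
speakers = ["A", "B", "A", "B"]
links = [(0, 1), (1, 2), (1, 3)]
dep, same, cross = build_graphs(speakers, links)

H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]  # toy utterance features
W = [[1.0, 0.0], [0.0, 1.0]]                          # identity projection
a = [0.1, 0.2, 0.3, 0.4]                              # toy attention vector
H_new = gat_layer(H, dep, W, a)
```

In this sketch each utterance's new representation is a weighted average of its own features and those of the utterance it depends on, so the context is propagated along dependency edges rather than by position in the dialogue alone.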

Authors: HAO Xiulan (郝秀兰); WEI Shaohua (魏少华); CAO Qian (曹乾); ZHANG Xiongtao (张雄涛) (Zhejiang Province Key Laboratory of Smart Management & Application of Modern Agricultural Resources, Huzhou University, Huzhou 313002, China; China Construction Bank Co., Ltd. Huzhou Branch, Huzhou 313001, China)

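Affiliations: Zhejiang Province Key Laboratory of Smart Management & Application of Modern Agricultural Resources, Huzhou University; China Construction Bank Co., Ltd. Huzhou Branch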

Source: Telecommunications Science (《电信科学》), Peking University Core Journal, 2024, No. 5, pp. 100-111 (12 pages)

Funding: National Natural Science Foundation of China (No. 62376094); Huzhou University Postgraduate Research and Innovation Project (No. 2024KYCX46, No. YJK24005).

Keywords: emotion recognition in conversations; discourse parsing; graph attention network

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]