Abstract
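Emotion distribution learning (EDL) uses an emotion distribution to record the degree to which a given sample expresses each emotion, and it has clear advantages for multi-label emotion analysis tasks that involve ambiguity. Emotion distribution label enhancement converts annotated single emotion labels into emotion distributions, which addresses the shortage of experimental datasets annotated with emotion distributions for EDL. However, existing emotion distribution label enhancement methods represent emotions with discrete-space emotion models, which lose correlation information between emotions and express emotions discontinuously. To address these problems, this paper introduces the continuous-dimensional valence-arousal-dominance (VAD) psychological emotion model and proposes a VAD emotion knowledge-based text emotion distribution label enhancement (VADLE) method. Based on emotion distances in the prior VAD emotion model, VADLE first generates prior emotion distributions separately for the ground-truth emotion label of an English sentence and for the emotion labels of the emotion words in that sentence, and then unifies the two kinds of prior emotion distributions through distribution superposition. Comparative experiments on single-label English text emotion datasets show that VADLE outperforms existing emotion distribution label enhancement methods on emotion prediction tasks.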
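As a rough illustration of the two steps described above (distance-based prior distributions, then superposition), the following minimal Python sketch may help. It is not the authors' implementation. The VAD coordinates are hypothetical placeholders rather than values from any lexicon, and the exact kernel form, the equal-weight superposition, and the function names `prior_distribution`, `superpose`, and `sentence_distribution` are all assumptions made for the example.

```python
import math

# Hypothetical VAD coordinates (valence, arousal, dominance) in [0, 1].
# Placeholder values for illustration only; NOT taken from any real lexicon.
VAD = {
    "joy":     (0.95, 0.70, 0.80),
    "anger":   (0.15, 0.85, 0.65),
    "sadness": (0.10, 0.30, 0.20),
    "fear":    (0.10, 0.80, 0.25),
}
EMOTIONS = list(VAD)


def prior_distribution(label, tau=0.6):
    """Weight every emotion by a Gaussian kernel of its VAD distance to `label`.

    `tau` plays the role of the bandwidth: a smaller value concentrates
    more mass on the primary emotion. The kernel form is an assumption.
    """
    center = VAD[label]
    weights = [
        math.exp(-math.dist(center, VAD[e]) ** 2 / (2 * tau ** 2))
        for e in EMOTIONS
    ]
    total = sum(weights)
    return [w / total for w in weights]


def superpose(distributions):
    """Add distributions element-wise, then renormalize so they sum to 1.

    Equal weighting of the inputs is an assumption of this sketch.
    """
    summed = [sum(column) for column in zip(*distributions)]
    total = sum(summed)
    return [s / total for s in summed]


def sentence_distribution(sentence_label, word_labels, tau=0.6):
    """Combine the prior of the sentence label with those of its emotion words."""
    dists = [prior_distribution(sentence_label, tau)]
    dists += [prior_distribution(w, tau) for w in word_labels]
    return superpose(dists)


if __name__ == "__main__":
    # A sentence labeled "anger" that contains one "fear" word.
    for emotion, p in zip(EMOTIONS, sentence_distribution("anger", ["fear"])):
        print(f"{emotion:8s} {p:.3f}")
```

In this sketch the primary emotion always receives the largest kernel weight because its distance to itself is zero, and emotions that lie close to it in VAD space receive more mass than distant ones.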
[Objective] Existing emotion distribution label enhancement (EDLE) methods construct the emotion distribution from a discrete-space emotion model, which makes it difficult to express the correlation between emotions in a fine-grained manner with continuous values. This study therefore proposes a valence-arousal-dominance (VAD) emotion knowledge-based text emotion distribution label enhancement (VADLE) method built on the VAD continuous-dimensional psychological emotion model. Unlike existing EDLE methods, VADLE uses VAD emotion knowledge in a three-dimensional continuous space to model emotion correlations and generate a more nuanced emotion distribution. The VADLE method comprises three steps: (1) Extraction of emotion word information by looking up a lexicon and extracting the emotion words from a given sentence. (2) Generation of prior emotion distributions for emotion labels using a local linear-weighting algorithm. The algorithm measures the effect of secondary emotions on the primary emotion based on the distance in VAD emotion space and assigns weights to nearby emotions using a Gaussian kernel. (3) Construction of the sentence-level emotion distribution by combining the prior emotion distribution of the sentence label with those of the emotion words in the text. Furthermore, this study uses a joint loss to train a multitask emotion distribution learning model based on the robustly optimized bidirectional encoder representations from transformers pretraining approach (RoBERTa) pretrained language model. This approach simultaneously optimizes emotion distribution prediction and emotion classification. The sentence features extracted by the RoBERTa pretrained model are passed through a fully connected layer to generate a probability distribution over all emotion labels. Based on this probability distribution, the model uses the Kullback-Leibler (KL) loss to measure the distance between the predicted and actual distributions, which optimizes the emotion distribution prediction task. Simultaneously, the cross-entropy loss is employed to optimize the emotion recognition task. To evaluate the performance of the proposed VADLE method, extensive comparative experiments were performed on several single-label English datasets against four baseline EDLE methods: emotion wheel and lexicon-based emotion distribution label enhancement (EWLLE), lexicon-based emotion distribution label enhancement (LLE), Mikels emotion wheel-based emotion distribution label enhancement (MWLE), and One-Hot. Moreover, this study explores how the bandwidth parameter (τ) of the local linear-weighting algorithm affects the balance between the primary and secondary emotions in the generated emotion distribution. The emotion prediction performance of the model was assessed using four classification metrics (Precision, Recall, F1-score, and Accuracy) and four emotion distribution prediction metrics (Canberra, Chebyshev, Cosine, and Intersection). The experimental results demonstrated that the VADLE method was superior to the baseline methods. Specifically, on the emotion classification task, the VADLE method outperformed the EWLLE, LLE, and MWLE methods on all four metrics. The VADLE method also performed well on the emotion distribution prediction task. For instance, on the Cosine metric, the VADLE method outperformed the second-best method, EWLLE, by 2.6% and exhibited considerable improvements over the LLE, MWLE, and One-Hot methods. The results also showed that the best balance was achieved by setting τ to 0.6, which yielded the highest performance in emotion distribution generation. Unlike existing EDLE methods, the VADLE method takes a fine-grained approach to modeling emotions. It combines the prior emotion knowledge in the VAD continuous space with the linguistic information carried by the emotion words to generate more reasonable emotion distributions. The experimental results show that the VADLE method outperforms existing methods at enhancing emotion distribution labels for emotion prediction tasks.
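The joint loss described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' training code. The abstract says only that a KL loss and a cross-entropy loss are optimized together, so the convex combination with a mixing weight `lam` and the function names are hypothetical. The `logits` argument stands in for the output of the fully connected layer on top of the RoBERTa features.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted); entries with zero target mass contribute 0."""
    return sum(
        t * math.log(t / max(p, eps))
        for t, p in zip(target, predicted)
        if t > 0
    )


def cross_entropy(label_index, predicted, eps=1e-12):
    """Negative log-probability assigned to the single ground-truth label."""
    return -math.log(max(predicted[label_index], eps))


def joint_loss(logits, target_distribution, label_index, lam=0.5):
    """Combine the distribution loss and the classification loss.

    `lam` is a hypothetical mixing weight: lam=1 uses only the KL term,
    lam=0 uses only the cross-entropy term.
    """
    predicted = softmax(logits)
    kl = kl_divergence(target_distribution, predicted)
    ce = cross_entropy(label_index, predicted)
    return lam * kl + (1 - lam) * ce


if __name__ == "__main__":
    target = [0.7, 0.2, 0.1]   # enhanced emotion distribution (illustrative)
    logits = [2.0, 0.5, -1.0]  # illustrative model scores
    print(joint_loss(logits, target, label_index=0))
```

In a real training setup the same two terms would normally come from a deep learning framework so that gradients flow back into the encoder; the pure-Python version here is only meant to make the arithmetic explicit.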
Authors
王耀琦
万中英
曾雪强
左家莉
WANG Yaoqi; WAN Zhongying; ZENG Xueqiang; ZUO Jiali (School of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, China)
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2024, Issue 5, pp. 789-800 (12 pages)
Journal of Tsinghua University(Science and Technology)
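Funding
National Natural Science Foundation of China, Regional Science Fund Project (62266021)
Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ2200330)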
Keywords
情感分布标签增强
情感分布学习
VAD情绪空间
情感词典
emotion distribution label enhancement
emotion distribution learning
VAD emotion space
affective lexicon