Abstract
In recent years, aspect-level fine-grained sentiment analysis of text has received increasing attention, and it plays a growing role in the analysis of medical text. Compared with coarse-grained sentiment analysis, fine-grained sentiment analysis can distinguish each specific aspect word in a medical text and obtain the sentiment expressed toward each aspect word. The aspect-level sentiment analysis task must consider the interaction between aspect words and sentiment words, and words in medical texts can act as both aspect words and sentiment words. This paper therefore proposes an aspect-level sentiment analysis model that incorporates contextual position latent information to perform sentiment analysis on medical texts. In medical texts, the context words relevant to judging the sentiment polarity of a specific aspect word are generally located near that aspect word. Moreover, because the number of context words around medical aspect words varies, the properties of the word embedding representation may change, so that the relative position of the aspect word differs. A new context position adjustment function is therefore proposed: by adjusting the weights of context words at different positions, it strengthens the focus on the sentiment polarity words related to the given aspect word and reduces the interference that differing word counts on the two sides of the aspect word cause in sentiment polarity judgment. In addition, a linear conditional random field model is introduced to help build a vector representation of aspect words that carries aspect-specific sentiment information. Finally, the focal loss function is used to train the model parameters, addressing the class imbalance problem of sentiment analysis in medical texts.
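As a rough illustration of how a focal loss counters class imbalance, the sketch below computes the standard multi-class focal loss, FL = -alpha_t (1 - p_t)^gamma log(p_t), for a single prediction. The three-class label layout (negative, neutral, positive), the function name `focal_loss`, and the default `gamma=2.0` are assumptions made for this example; the abstract does not state the hyperparameters or the exact variant the authors use.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Focal loss for one example (illustrative sketch, not the paper's code).

    probs  -- predicted class probabilities, e.g. [negative, neutral, positive]
    target -- index of the true class
    gamma  -- focusing parameter; gamma = 0 reduces to cross-entropy
    alpha  -- optional per-class weights (hypothetical values, chosen by the user)
    """
    # Clamp to avoid log(0) when the model assigns zero probability.
    p_t = min(max(probs[target], 1e-12), 1.0)
    weight = 1.0 if alpha is None else alpha[target]
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss([0.05, 0.05, 0.90], target=2)  # confident and correct
hard = focal_loss([0.60, 0.30, 0.10], target=2)  # true class poorly predicted
print(f"easy example: {easy:.5f}, hard example: {hard:.5f}")
```

With `gamma` above zero, examples the model already classifies confidently contribute very little to the loss, so training gradients are dominated by hard examples, which in an imbalanced corpus tend to come from the minority classes.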
Authors
王萍
李璋寅
郭茹燕
黄勃
王董祺
WANG Ping; LI Zhangyin; GUO Ruyan; HUANG Bo; WANG Dongqi (School of Continuing Education, Shanghai University of Engineering Science, Shanghai 201620, China; School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China)
Source
《武汉大学学报(理学版)》
CAS
CSCD
Peking University Core Journals (北大核心)
2023, No. 1, pp. 60-68 (9 pages)
Journal of Wuhan University:Natural Science Edition
Funding
Shanghai 2021 "Science and Technology Innovation Action Plan" Social Development Science and Technology Research Project (21DZ1204900).
Keywords
natural language processing
aspect-level sentiment analysis
context weight adjustment
focal loss function