Abstract
Identifying the sentiment orientation of words is important for text sentiment classification. Most existing methods assume that a set of paradigm words is available and determine the sentiment orientation of a target word from its degree of association with those paradigm words. In practice, especially in review corpora, paradigm words are often ambiguous in sentiment, which reduces the accuracy of the result. Based on this analysis, we propose a method for extracting and disambiguating paradigm words in a given corpus, and use it to identify word sentiment orientation across domains. We first extract candidate paradigm words automatically from any labeled corpus. We then use the co-occurrence matrix of paradigm words and target words in the target domain to evaluate the candidates and filter out those with ambiguous sentiment. Finally, we identify the sentiment orientation of each target word by computing its similarity to the remaining paradigm words. Experimental results demonstrate the effectiveness and feasibility of the method.
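The final step (scoring a target word by its association with positive and negative paradigm words) can be illustrated with a minimal sketch. The abstract lists PMI as a keyword but gives no formulas, so everything here is an assumption: the sketch uses the classic SO-PMI formulation (sum of PMI with positive seeds minus sum of PMI with negative seeds), document-level co-occurrence, and an additive smoothing constant `eps`. The function names and the toy corpus are hypothetical, and the paper's extraction and disambiguation steps are not reproduced.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, over tokenized documents, how many documents contain
    each word and each unordered pair of distinct words."""
    word_df = Counter()
    pair_df = Counter()
    for tokens in docs:
        vocab = sorted(set(tokens))  # sorted so pair keys are canonical
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))
    return word_df, pair_df


def pmi(w1, w2, word_df, pair_df, n_docs, eps=0.01):
    """Smoothed pointwise mutual information of two words.
    eps is an assumed smoothing constant that avoids log(0)."""
    pair = tuple(sorted((w1, w2)))
    p_joint = (pair_df[pair] + eps) / n_docs
    p1 = (word_df[w1] + eps) / n_docs
    p2 = (word_df[w2] + eps) / n_docs
    return math.log2(p_joint / (p1 * p2))


def so_pmi(target, pos_seeds, neg_seeds, word_df, pair_df, n_docs):
    """SO-PMI score: positive means the target leans positive,
    negative means it leans negative."""
    pos = sum(pmi(target, s, word_df, pair_df, n_docs) for s in pos_seeds)
    neg = sum(pmi(target, s, word_df, pair_df, n_docs) for s in neg_seeds)
    return pos - neg


if __name__ == "__main__":
    # Hypothetical toy corpus of tokenized reviews.
    docs = [
        ["great", "excellent", "sturdy"],
        ["great", "sturdy"],
        ["excellent", "sturdy"],
        ["poor", "terrible", "flimsy"],
        ["poor", "flimsy"],
        ["terrible", "flimsy"],
    ]
    word_df, pair_df = cooccurrence_counts(docs)
    for word in ("sturdy", "flimsy"):
        score = so_pmi(word, ["great", "excellent"], ["poor", "terrible"],
                       word_df, pair_df, len(docs))
        print(word, "positive" if score > 0 else "negative")
```

In this formulation, a paradigm word whose sentiment is ambiguous in the target domain would pull the score in the wrong direction, which is why filtering such words before scoring matters.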
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2015, No. 6, pp. 220-222, 238 (4 pages)
Funding
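National High Technology Research and Development Program of China (863 Program) (2012AA011005)
National Natural Science Foundation of China (61273292, 61305063)
Natural Science Foundation of Anhui Province (1208085QF122)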
Keywords
Paradigm word
PMI
Cross-domain
Sentiment orientation
Ambiguity of emotion