Abstract
The explosive growth of social media content has drawn increasing attention to sentiment analysis of public opinion. Unlike traditional text, Sina micro-blog posts carry multiple emotional sources, including sentiment words, emoticons, pictures, and videos. To address the poor timeliness of sentiment lexicons in Chinese social short-text sentiment analysis and to exploit the correlation between emotional sources, this paper proposes an emotional multi-source correlation model (EMCM). The model considers the sentiment words and emoticons in a micro-blog together with the correlation between them: building on the classic lexicon-and-rule voting method, it introduces multiple emotional sources and correlation probabilities, and uses probabilistic modeling to build a correlation model over the two emotional sources, sentiment words and emoticons, in order to classify micro-blog sentiment. Experiments on a dataset of 6,171 micro-blogs show that the model reaches a classification accuracy of 85.3%, higher than both the traditional voting model that uses sentiment words and emoticons (83.4%) and an SVM method based on similar multi-features (82.9%).
Source
《智能系统学报》
CSCD
Peking University Core Journal (北大核心)
2016, No. 4, pp. 546-553 (8 pages)
CAAI Transactions on Intelligent Systems
Funding
National Natural Science Foundation of China (61202143, 61305061, 61402386, 61572409)
Natural Science Foundation of Fujian Province (2013J05100)
Keywords
multi-modal sentiment analysis
emotional multi-sources
social media
correlation