Abstract
Chinese text sentiment classification is generally studied with one of two approaches, sentiment lexicons or feature-based classification, but combining the features obtained by the two approaches to improve classification has not been considered. Feature-based methods ignore both the polarity of a feature word in the sentiment lexicon and the strength of that polarity. This paper builds a naive Bayes model from the text features obtained by feature-based classification, then adjusts the weights of the positive and negative posterior probabilities of sentiment words according to each feature word's polarity in the sentiment lexicon and its polarity strength as computed by pointwise mutual information. This fuses the two kinds of features, improving classification performance and reducing the feature dimensionality.
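The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical lexicon that maps a word to a polarity label and a non-negative strength (which could be derived from pointwise mutual information with seed words, via the `pmi` helper), and it assumes the adjustment is an additive bonus `alpha * strength` on the log-score of the class that matches the word's polarity. The names `pmi`, `LexiconWeightedNB`, `alpha`, and the toy data are all illustrative.

```python
import math
from collections import Counter


def pmi(joint, count_a, count_b, total):
    """Pointwise mutual information log2(P(a,b) / (P(a) * P(b))) from raw counts."""
    if joint == 0:
        return 0.0  # assumption: unseen co-occurrence is treated as no association
    return math.log2(joint * total / (count_a * count_b))


class LexiconWeightedNB:
    """Multinomial naive Bayes with a lexicon-based bonus (illustrative only)."""

    def __init__(self, lexicon, alpha=1.0):
        # lexicon: word -> (polarity label, strength >= 0); hypothetical format
        self.lexicon = lexicon
        self.alpha = alpha

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def log_scores(self, tokens):
        n_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / n_docs)  # log prior
            for w in tokens:
                if w not in self.vocab:
                    continue  # skip words never seen in training
                # Laplace-smoothed log likelihood of the word given the class
                score += math.log(
                    (self.word_counts[c][w] + 1) / (total + len(self.vocab))
                )
                # Assumed fusion step: boost the class matching the word's polarity
                polarity, strength = self.lexicon.get(w, (None, 0.0))
                if polarity == c:
                    score += self.alpha * strength
            scores[c] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


# Toy data, invented purely for illustration
docs = [["good", "great"], ["bad", "awful"], ["good", "fine"], ["bad", "poor"]]
labels = ["pos", "neg", "pos", "neg"]
lexicon = {"good": ("pos", 1.2), "bad": ("neg", 1.5)}
model = LexiconWeightedNB(lexicon).fit(docs, labels)
print(model.predict(["good", "poor"]))
```

Setting `alpha=0` recovers plain naive Bayes, which makes it easy to compare the classifier with and without the lexicon signal. Other fusion schemes, such as multiplicative weights on the likelihood terms, would be equally consistent with the abstract.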
Source
《计算机应用研究》
CSCD
PKU Core Journal (北大核心)
2012, No. 1, pp. 98-100 (3 pages)
Application Research of Computers
Funding
National "211 Project" Phase III Construction Project (S-10218)
Keywords
text sentiment classification
sentiment lexicon
pointwise mutual information
feature selection
naive Bayes