Abstract
BBS forums carry a large volume of disorganized information. To address this, the paper proposes a sentiment classification method for BBS posts that helps users locate the information they need and identify the polarity (positive or negative) of a comment. Based on the probability that a word carries semantic orientation, a maximum entropy model identifies the orientation-bearing words in a text, and words whose orientation value reaches a certain level are selected as the document's feature representation. A support vector machine (SVM) classifier built on these features then classifies the sentiment and other subjective content expressed in BBS text as positive or negative. Experiments show that this feature representation yields good classification accuracy for BBS sentiment classification.
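The two-stage idea in the abstract (a maximum entropy model picks out orientation-bearing words, then an SVM classifies posts using only those words) can be illustrated with a minimal standard-library sketch. Everything specific here is an assumption for illustration and not taken from the paper: the toy English posts, the use of learned word weights as the orientation value, the `THRESHOLD` cutoff, and the plain gradient-based training loops.

```python
import math

# Hypothetical toy posts (not from the paper): +1 = positive, -1 = negative.
POSTS = [
    ("great phone love the screen", 1),
    ("love this great camera", 1),
    ("excellent battery great value", 1),
    ("terrible phone hate the screen", -1),
    ("hate this terrible camera", -1),
    ("awful battery terrible value", -1),
]

def tokenize(text):
    # Unique words in a fixed order, so runs are deterministic.
    return sorted(set(text.split()))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_maxent(docs, labels, vocab, epochs=200, lr=0.5):
    # Two-class maximum entropy model (equivalent to logistic regression),
    # fitted by full-batch gradient ascent on the log-likelihood.
    w = {t: 0.0 for t in vocab}
    for _ in range(epochs):
        grad = {t: 0.0 for t in vocab}
        for tokens, y in zip(docs, labels):
            p_pos = sigmoid(sum(w[t] for t in tokens))
            target = 1.0 if y == 1 else 0.0
            for t in tokens:
                grad[t] += target - p_pos
        for t in vocab:
            w[t] += lr * grad[t]
    return w

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_linear_svm(vectors, labels, epochs=200, lr=0.1, lam=0.01):
    # Linear SVM without a bias term: full-batch subgradient descent
    # on the L2-regularized hinge loss.
    w = [0.0] * len(vectors[0])
    n = len(vectors)
    for _ in range(epochs):
        grad = [lam * wj for wj in w]
        for x, y in zip(vectors, labels):
            if y * dot(w, x) < 1:  # margin violated: hinge loss is active
                for j, xj in enumerate(x):
                    grad[j] -= y * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

docs = [tokenize(text) for text, _ in POSTS]
labels = [label for _, label in POSTS]
vocab = sorted({t for tokens in docs for t in tokens})

# Stage 1: keep words whose maximum entropy weight is far from zero.
THRESHOLD = 0.5  # assumed cutoff for the orientation value
maxent_w = train_maxent(docs, labels, vocab)
selected = [t for t in vocab if abs(maxent_w[t]) >= THRESHOLD]

def vectorize(text):
    # Binary presence vector over the selected orientation words only.
    tokens = set(text.split())
    return [1.0 if t in tokens else 0.0 for t in selected]

# Stage 2: train the SVM on the reduced feature representation.
svm_w = train_linear_svm([vectorize(text) for text, _ in POSTS], labels)

def classify(text):
    # Returns +1 (positive) or -1 (negative); a zero score defaults to +1.
    return 1 if dot(svm_w, vectorize(text)) >= 0 else -1

print(selected)
print(classify("great camera"), classify("terrible battery"))
```

In this sketch, words that occur equally often in positive and negative posts (such as "phone" or "the") end up with weights near zero and are dropped, so the SVM sees only the opinion-bearing vocabulary. A real system would need Chinese word segmentation and a proper solver, which are outside the scope of this illustration.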
Source
《计算机技术与发展》
2009, No. 7, pp. 120-123 (4 pages)
Computer Technology and Development
Funding
Supported by the National Natural Science Foundation of China (60673060)
Keywords
text classification
sentiment classification
feature word recognition
maximum entropy
support vector machine (SVM)