Abstract
Sentiment polarity classification is the first key problem to solve in text sentiment analysis. Building on an analysis of the factors that affect text sentiment classification, this paper first constructs a sentiment lexicon, then selects and weights sentiment features, and uses a support vector machine (SVM) classifier to recognize and classify the sentiment of texts. Finally, the classification model is run on a corpus dataset on both a single-machine platform and the Spark distributed computing platform, and its classification accuracy and time cost on the two platforms are compared. The experimental results verify the effectiveness of the proposed sentiment polarity classification model on both the single-machine platform and the distributed cloud platform.
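The pipeline outlined in the abstract (lexicon, feature selection, feature weighting, SVM classification) can be sketched roughly as follows. This is a minimal single-machine illustration and not the paper's implementation: the toy lexicon, the toy corpus, the TF-IDF weighting scheme, and the bias-free subgradient training loop are all assumptions, since the abstract does not specify them. The Spark execution step is not shown.

```python
import math
from collections import Counter

# Hypothetical toy lexicon and corpus, for illustration only.
LEXICON = ["good", "great", "love", "bad", "awful", "hate"]
TRAIN = [
    ("I love this great phone", 1),
    ("good screen and great battery", 1),
    ("awful service I hate it", -1),
    ("bad battery and awful screen", -1),
]

def fit_idf(docs, vocab):
    """Smoothed inverse document frequency for each lexicon term."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc.lower().split()) & set(vocab))
    return {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}

def vectorize(doc, vocab, idf):
    """Feature selection (lexicon terms only) plus assumed TF-IDF weighting."""
    tf = Counter(doc.lower().split())
    return [tf[t] * idf[t] for t in vocab]

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=50):
    """Subgradient descent on the L2-regularized hinge loss (no bias term)."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            w = [(1.0 - lr * lam) * wj for wj in w]  # regularization shrink
            if margin < 1.0:  # inside the margin or misclassified
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
    return w

def predict(doc, w, idf, vocab=LEXICON):
    """Return +1 (positive) or -1 (negative) from the sign of the score."""
    score = sum(wj * xj for wj, xj in zip(w, vectorize(doc, vocab, idf)))
    return 1 if score >= 0 else -1

docs = [d for d, _ in TRAIN]
labels = [y for _, y in TRAIN]
idf = fit_idf(docs, LEXICON)
w = train_linear_svm([vectorize(d, LEXICON, idf) for d in docs], labels)

print(predict("great phone love it", w, idf))
print(predict("I hate this bad screen", w, idf))
```

Restricting the feature vector to lexicon terms keeps the dimensionality small, which is one plausible reason to build the lexicon before training; a production setup would more likely use an existing SVM library than a hand-written training loop.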
Source
《广东工业大学学报》
CAS
2014, Issue 3, pp. 95-101 (7 pages)
Journal of Guangdong University of Technology
Funding
Natural Science Foundation of Guangdong Province (9151009001000007)
Guangdong Provincial Science and Technology Plan Project (2012B091000173)
Keywords
sentiment classification
support vector machine
Spark distributed computing platform