Abstract
With the rapid growth of information on the Internet, more and more people have become producers of information, which places higher demands on information processing. Computing the sentiment description value of a text is important for measuring its polarity. The paper first preprocesses the text to select the sentences that can determine its polarity, then computes the sentiment description value of each sub-sentence, and finally combines the sub-sentence values into the sentiment description value of the whole text. It also gives a comprehensive analysis of factors that may influence the result, such as text length and syntactic structure. The experimental results show that the proposed algorithm achieves high accuracy and speed, and is well suited to computing sentiment values when processing large-scale stream data.
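The three steps named in the abstract (select polarity-bearing sentences, score each sub-sentence, combine the scores) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, since the abstract gives no details: the toy `LEXICON`, the `NEGATORS` set, the punctuation-based splitting, the negation rule, and the use of a plain average as the combination step are all hypothetical and are not the paper's algorithm.

```python
import re

# Hypothetical toy lexicon and negator list; the paper's actual
# sentiment resources are not described in the abstract.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATORS = {"not", "never", "no"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Split text into sentences on '.', '!' and '?'."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def split_clauses(sentence):
    """Split a sentence into sub-sentences on commas and semicolons."""
    return [c.strip() for c in re.split(r"[,;]+", sentence) if c.strip()]


def clause_value(clause):
    """Sum lexicon scores in one clause; a negator flips the next sentiment word."""
    value, flip = 0.0, False
    for token in tokenize(clause):
        if token in NEGATORS:
            flip = True
        elif token in LEXICON:
            value += -LEXICON[token] if flip else LEXICON[token]
            flip = False
    return value


def text_value(text):
    """Select sentences, score their clauses, and average the clause scores."""
    values = []
    for sentence in split_sentences(text):
        # Assumed preprocessing rule: drop sentences with no sentiment word.
        if set(tokenize(sentence)).isdisjoint(LEXICON):
            continue
        values.extend(clause_value(c) for c in split_clauses(sentence))
    # Assumed combination rule: plain average over the scored clauses.
    return sum(values) / len(values) if values else 0.0
```

For a review such as "The screen is great, but the battery is not good. I bought it on Monday. Shipping was terrible.", this sketch drops the second sentence and averages the three remaining clause scores. Because each sentence is scored independently, the same functions could in principle be applied to items arriving one at a time from a stream, though the abstract does not say how the paper handles stream data.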
Source
《计算机科学与探索》
CSCD
2014, Issue 5, pp. 608-613 (6 pages)
Journal of Frontiers of Computer Science and Technology
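Funding
National Natural Science Foundation of China, Nos. 61035003, 61072085, 61202212, 60933004
National Basic Research Program of China (973 Program), No. 2013CB329502
National High Technology Research and Development Program of China (863 Program), No. 2012AA011003
National Key Technology R&D Program of China, No. 2012BA107B02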
Keywords
sentiment description value
sentiment classification
stream data