Abstract
This paper proposes a new feature extraction method and combines it with the idea of three-way decision for text sentiment analysis, with the aim of improving classification accuracy. A dynamic sentiment lexicon is built from the training set and then used to extract abstract features from each text, forming the feature matrix. During classification, if the classifier is not sufficiently confident about the sentiment label of a target text, the text is placed in the boundary region under the three-way decision rule and deferred for later processing. Experiments on an English movie review dataset show that feature extraction based on the dynamic lexicon achieves better classification accuracy, and that using three-way decision rules to place some samples in the boundary region also raises classification accuracy.
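The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how lexicon entries are scored, which abstract features are used, which classifier is trained, or what thresholds define the boundary region. The smoothed log-odds scoring, the three features, the logistic stand-in for a classifier, and the thresholds `alpha` and `beta` are all assumptions made here for illustration.

```python
import math
from collections import Counter


def build_lexicon(texts, labels):
    """Build a lexicon from the training set (assumed scoring: smoothed log-odds).

    labels: 1 for positive texts, 0 for negative texts.
    """
    pos, neg = Counter(), Counter()
    for text, label in zip(texts, labels):
        # Count each word at most once per text (document frequency).
        (pos if label == 1 else neg).update(set(text.lower().split()))
    return {
        word: math.log((pos[word] + 1) / (neg[word] + 1))
        for word in set(pos) | set(neg)
    }


def extract_features(text, lexicon):
    """Hypothetical abstract features: positive count, negative count, total score."""
    scores = [lexicon[w] for w in text.lower().split() if w in lexicon]
    n_pos = sum(1 for s in scores if s > 0)
    n_neg = sum(1 for s in scores if s < 0)
    return [n_pos, n_neg, sum(scores)]


def positive_probability(features):
    """Stand-in for a trained classifier: logistic function of the total score."""
    return 1.0 / (1.0 + math.exp(-features[2]))


def three_way_decide(p_positive, alpha=0.7, beta=0.3):
    """Accept, reject, or defer; alpha and beta are assumed thresholds."""
    if p_positive >= alpha:
        return "positive"
    if p_positive <= beta:
        return "negative"
    return "boundary"  # not confident enough: defer for later processing


train_texts = ["great fun movie", "great acting", "boring plot", "boring dull movie"]
train_labels = [1, 1, 0, 0]
lexicon = build_lexicon(train_texts, train_labels)

for text in ["great fun", "boring dull", "great boring movie"]:
    feats = extract_features(text, lexicon)
    print(text, "->", three_way_decide(positive_probability(feats)))
```

In this toy run the text that mixes a positive and a negative lexicon word ends up near probability 0.5 and is deferred to the boundary region, while the two clear-cut texts are labeled directly.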
Source
Journal of Shandong University (Engineering Science)
Indexed in: CAS, Peking University Core Journals (北大核心)
2015, No. 1, pp. 19-23 (5 pages)
Funding
National Natural Science Foundation of China, General Program (61170180)
Keywords
sentiment analysis
opinion mining
text data mining
feature extraction
three-way decision