Abstract
A feature representation based on lexical sentiment membership is presented for Chinese sentiment polarity classification under the framework of fuzzy set theory. Using TF-IDF as the weight, a positive and a negative polarity membership are constructed for each sentiment feature word, and the log-ratio of the two memberships is taken as the feature value. A polarity classifier based on support vector machines is then built on these membership log-ratio features. The classifier is evaluated on several datasets, including a corpus of reviews of automobile products, the NLPCC2014 sentiment classification evaluation data, and English IMDB film reviews. The experimental results show that the system using sentiment membership features outperforms systems using state-of-the-art feature representations such as Boolean features, frequency-based features, and word-embedding-based features, which supports the effectiveness of the proposed representation.
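The feature construction described in the abstract can be sketched as follows. The abstract does not give the exact membership formula, so everything specific here is an assumption made for illustration: the smoothed IDF, the normalization of class-wise TF-IDF sums into memberships, the smoothing constant `eps`, the toy corpus, and the helper names `tfidf_by_class`, `log_ratio_features`, and `doc_vector`.

```python
import math
from collections import Counter

def tfidf_by_class(docs, labels):
    """Sum each word's TF-IDF weight separately over positive and negative documents."""
    n_docs = len(docs)
    df = Counter(w for doc in docs for w in set(doc))  # document frequency
    weights = {"pos": Counter(), "neg": Counter()}
    for doc, label in zip(docs, labels):
        for w, count in Counter(doc).items():
            # Smoothed IDF; an assumed variant, not taken from the paper.
            idf = math.log((1 + n_docs) / (1 + df[w])) + 1
            weights[label][w] += (count / len(doc)) * idf
    return weights

def log_ratio_features(weights, eps=1e-6):
    """Map each word to log(mu_pos / mu_neg), where mu_* are assumed memberships."""
    feats = {}
    for w in set(weights["pos"]) | set(weights["neg"]):
        p, n = weights["pos"][w], weights["neg"][w]
        mu_pos, mu_neg = p / (p + n), n / (p + n)  # memberships sum to 1
        feats[w] = math.log((mu_pos + eps) / (mu_neg + eps))
    return feats

def doc_vector(doc, feats, vocab):
    """Represent a document by the log-ratio of each vocabulary word it contains."""
    present = set(doc)
    return [feats[w] if w in present else 0.0 for w in vocab]

# Hypothetical toy corpus of tokenized reviews.
docs = [["good", "engine", "good"], ["bad", "engine"],
        ["good", "seats"], ["bad", "noise", "bad"]]
labels = ["pos", "neg", "pos", "neg"]

weights = tfidf_by_class(docs, labels)
feats = log_ratio_features(weights)
vocab = sorted(feats)
vec = doc_vector(["good", "noise"], feats, vocab)
```

Words that occur only in positive reviews receive a large positive value, words that occur only in negative reviews a large negative value, and words shared by both classes stay near zero. Vectors produced by `doc_vector` could then be passed to an SVM implementation such as scikit-learn's `LinearSVC`, which is one possible choice and not necessarily the one used in the paper.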
Source
Acta Scientiarum Naturalium Universitatis Pekinensis (《北京大学学报(自然科学版)》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals
2016, No. 1, pp. 171-177 (7 pages)
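Funding
National Natural Science Foundation of China (61170148)
Science and Technology Activities Project for Returned Overseas Scholars, Heilongjiang Provincial Department of Human Resources and Social Security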
Keywords
sentiment polarity classification
fuzzy set theory
membership
support vector machines