Abstract
This paper proposes an improved machine learning algorithm for text sentiment classification. After analyzing the main existing text feature selection methods, the authors incorporate word frequency and the degree of sentiment expressed by each word into the information gain feature selection method, so that feature weights are measured from both a global and a local perspective. Texts are represented uniformly with the vector space model, and an SVM is then trained on the resulting vectors. Experiments show that the algorithm achieves higher precision and recall than traditional machine learning algorithms, and that the resulting classifier generalizes well.
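The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual formula, so the combination below (information gain as the global weight, multiplied by normalized term frequency and a sentiment degree as the local weight) is an assumption, and the toy corpus and the `sentiment` lexicon values are hypothetical. The resulting vectors would then be passed to an SVM trainer of one's choice, for example scikit-learn's `LinearSVC`, which is not shown here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(term, docs, labels):
    """IG(t) = H(C) - P(t) H(C|t) - P(not t) H(C|not t)."""
    with_t = [y for d, y in zip(docs, labels) if term in d]
    without_t = [y for d, y in zip(docs, labels) if term not in d]
    n = len(docs)
    return (entropy(labels)
            - len(with_t) / n * entropy(with_t)
            - len(without_t) / n * entropy(without_t))


def feature_vector(doc, vocab, ig, sentiment):
    """Global weight (IG) times local weight (term frequency x sentiment degree).

    The multiplicative combination is an assumption for illustration only.
    Terms missing from the lexicon get a neutral sentiment degree of 1.0.
    """
    tf = Counter(doc)
    length = len(doc) or 1
    return [ig[t] * (tf[t] / length) * sentiment.get(t, 1.0) for t in vocab]


# Hypothetical toy corpus: tokenized documents with polarity labels.
docs = [["good", "movie"], ["great", "good", "plot"],
        ["bad", "movie"], ["awful", "bad", "plot"]]
labels = ["pos", "pos", "neg", "neg"]

# Hypothetical sentiment-degree lexicon (not taken from the paper).
sentiment = {"good": 1.5, "great": 2.0, "bad": 1.5, "awful": 2.0}

vocab = sorted({t for d in docs for t in d})
ig = {t: information_gain(t, docs, labels) for t in vocab}
vectors = [feature_vector(d, vocab, ig, sentiment) for d in docs]
```

In this toy corpus, a word such as "movie" that appears equally in both classes receives zero information gain and therefore contributes nothing to any vector, while a class-specific word such as "good" keeps its full local weight.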
Authors
WANG Gensheng (王根生)
HUANG Xuejian (黄学坚)
WU Xiaofang (吴小芳)
HU Xiangliang (胡向亮)
Computer Practice Teaching Center, Jiangxi University of Finance and Economics, Nanchang 330013, China
Source
Journal of Chengdu University of Technology: Science & Technology Edition (《成都理工大学学报(自然科学版)》)
Indexed in: CAS, CSCD, PKU Core (北大核心)
2019, Issue 1, pp. 105-110 (6 pages)
Funding
National Natural Science Foundation of China (71461012)
Keywords
sentiment classification
machine learning
SVM
information gain