Abstract
Sentiment analysis of review text has become an important research area in natural language processing. To address the irregular grammar and sparse features of review text, this paper designs a sentiment classification algorithm for reviews based on multi-feature fusion. First, an improved sentiment-rule method is proposed. Next, the effective information extracted by the rule-based method is expanded so that each piece of sentiment information becomes a multidimensional vector, and these rule-based features are fused with unigram features, syntactic features, and dependency-based word collocation features to form the vector space, yielding a more effective fused feature template. Finally, information gain is used for feature selection, and the selected features serve as the input to a support vector machine (SVM) that identifies and classifies the review text, thereby combining the rule-based and machine learning methods. Experiments on a Chinese hotel review data set show that the method lets the machine learning algorithm make fuller use of the rule features and achieves better classification performance and higher accuracy than using either the rule-based method or the machine learning method alone.
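The fusion and feature selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The tiny English lexicon, the negator list, the feature names (`RULE_*`, `UNI_*`), and the toy reviews are all hypothetical stand-ins, since the abstract does not give the actual rule set or corpus. The syntactic and dependency collocation features and the SVM training step are omitted; the selected features would be passed to an SVM in a full pipeline.

```python
import math
from collections import Counter

# Hypothetical toy lexicon and negators (assumptions, not the paper's rule set).
POSITIVE = {"clean", "friendly", "comfortable"}
NEGATIVE = {"dirty", "noisy", "rude"}
NEGATORS = {"not", "never"}


def rule_features(tokens):
    """Expand the rule output into several binary features instead of one score."""
    pos = neg = 0
    for i, tok in enumerate(tokens):
        # A sentiment word directly preceded by a negator flips polarity.
        negated = i > 0 and tokens[i - 1] in NEGATORS
        if tok in POSITIVE:
            pos, neg = (pos, neg + 1) if negated else (pos + 1, neg)
        elif tok in NEGATIVE:
            pos, neg = (pos + 1, neg) if negated else (pos, neg + 1)
    return {
        "RULE_has_pos": pos > 0,
        "RULE_has_neg": neg > 0,
        "RULE_pos_gt_neg": pos > neg,
        "RULE_neg_gt_pos": neg > pos,
    }


def fused_features(text):
    """Fuse rule features with unigram features into one set of active features."""
    tokens = text.lower().split()
    active = {name for name, on in rule_features(tokens).items() if on}
    active.update("UNI_" + tok for tok in tokens)
    return active


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature, docs, labels):
    """IG(feature) = H(labels) - H(labels | feature present or absent)."""
    present = [y for d, y in zip(docs, labels) if feature in d]
    absent = [y for d, y in zip(docs, labels) if feature not in d]
    n = len(labels)
    conditional = len(present) / n * entropy(present) + len(absent) / n * entropy(absent)
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Keep the k features with the highest information gain (ties broken by name)."""
    vocabulary = set().union(*docs)
    ranked = sorted(vocabulary, key=lambda f: (-information_gain(f, docs, labels), f))
    return ranked[:k]


# Toy reviews (hypothetical); 1 = positive, 0 = negative.
texts = [
    "room clean staff friendly",
    "room dirty staff rude",
    "not clean and noisy",
    "bed comfortable not noisy",
]
labels = [1, 0, 0, 1]
docs = [fused_features(t) for t in texts]
selected = select_features(docs, labels, 4)
```

In this toy corpus the unigram `clean` appears in one positive and one negative review, so it carries no information gain on its own, while the negation-aware rule features separate the two classes. This is one way to read the abstract's claim that rule features let the learner exploit information that sparse unigrams miss.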
Authors
龚安
费凡
GONG An; FEI Fan (School of Computer & Communication Engineering, China University of Petroleum, Qingdao 266580, China)
Source
《计算机技术与发展》
2018, No. 8, pp. 91-95 (5 pages)
Computer Technology and Development
Funding
National Oil and Gas Major Project (2017ZX05013-001)
Keywords
text sentiment analysis
multi-feature fusion
machine learning
emotional rules