Abstract
Short-text sentiment classification is a text classification task that targets subjective information. It has significant research value and broad applications, such as reputation evaluation of tourist attractions, public opinion tracking, and product reputation analysis. To improve classification accuracy, this paper proposes a short-text sentiment classification method that uses Stacking to fuse deep learning and traditional machine learning models. The method extracts TFIDF and Word2Vec features from a short-text dataset and uses them as inputs to the traditional machine learning models and the deep learning model. Stacking then fuses the outputs of multiple base classifiers, comprising traditional machine learning models (Logistic, Passive Aggressive, Ridge, SVC, SVR, and others) and the deep learning text classification model TextRCNN, to produce the final sentiment classification. LightGBM serves as the classifier in the final Stacking layer, and the method is validated on a dataset of online reviews of tourist attractions. Experimental results show that the method outperforms the best individual base classifier and reaches an average classification accuracy of 71.02% across positive, neutral, and negative texts.
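The Stacking step described in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the two base learners (a TF-IDF class-centroid scorer and a naive Bayes model) stand in for the paper's Logistic, Passive Aggressive, Ridge, SVC, SVR, and TextRCNN models; the nearest-class-mean meta-learner stands in for LightGBM; the 3-fold out-of-fold construction is the common way to build Stacking features, which the abstract does not spell out; and the toy reviews are hypothetical, not the paper's dataset.

```python
import math
from collections import Counter, defaultdict

LABELS = ["negative", "neutral", "positive"]

def fit_idf(docs):
    """Smoothed inverse document frequency for every token in the training docs."""
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + len(docs)) / (1 + n)) + 1.0 for tok, n in df.items()}

def tfidf(doc, idf):
    """L2-normalised sparse TF-IDF vector; tokens unseen in training are dropped."""
    vec = {tok: n * idf[tok] for tok, n in Counter(doc).items() if tok in idf}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {tok: v / norm for tok, v in vec.items()}

class CentroidModel:
    """Toy base learner: dot product with each class's mean TF-IDF vector."""
    def fit(self, docs, labels):
        self.idf = fit_idf(docs)
        per_class = Counter(labels)
        self.centroids = {lab: defaultdict(float) for lab in LABELS}
        for doc, lab in zip(docs, labels):
            for tok, v in tfidf(doc, self.idf).items():
                self.centroids[lab][tok] += v / per_class[lab]
        return self

    def scores(self, doc):
        vec = tfidf(doc, self.idf)
        return [sum(v * self.centroids[lab].get(tok, 0.0) for tok, v in vec.items())
                for lab in LABELS]

class NaiveBayesModel:
    """Toy base learner: multinomial naive Bayes with add-one smoothing."""
    def fit(self, docs, labels):
        self.n_docs = len(docs)
        self.prior = Counter(labels)
        self.counts = {lab: Counter() for lab in LABELS}
        for doc, lab in zip(docs, labels):
            self.counts[lab].update(doc)
        self.vocab = {tok for doc in docs for tok in doc}
        return self

    def scores(self, doc):
        logp = []
        for lab in LABELS:
            total = sum(self.counts[lab].values()) + len(self.vocab)
            lp = math.log((self.prior[lab] + 1) / (self.n_docs + len(LABELS)))
            lp += sum(math.log((self.counts[lab][tok] + 1) / total)
                      for tok in doc if tok in self.vocab)
            logp.append(lp)
        top = max(logp)
        exp = [math.exp(x - top) for x in logp]
        return [e / sum(exp) for e in exp]  # posterior probabilities

class ClassMeanMeta:
    """Stand-in meta-learner (NOT LightGBM): nearest class mean in meta-feature space."""
    def fit(self, rows, labels):
        self.means = {}
        for lab in sorted(set(labels)):
            mine = [r for r, l in zip(rows, labels) if l == lab]
            self.means[lab] = [sum(col) / len(mine) for col in zip(*mine)]
        return self

    def predict(self, row):
        return min(self.means, key=lambda lab: sum(
            (a - b) ** 2 for a, b in zip(row, self.means[lab])))

def meta_row(models, doc):
    """Concatenate every base model's class scores for one document."""
    return [s for m in models for s in m.scores(doc)]

def out_of_fold_rows(factories, docs, labels, k=3):
    """Meta-features for the training set; each row is scored by models that never saw it."""
    rows = [None] * len(docs)
    for fold in range(k):
        train = [i for i in range(len(docs)) if i % k != fold]
        models = [f().fit([docs[i] for i in train], [labels[i] for i in train])
                  for f in factories]
        for i in range(fold, len(docs), k):
            rows[i] = meta_row(models, docs[i])
    return rows

def fit_stacking(docs, labels, factories=(CentroidModel, NaiveBayesModel), k=3):
    """Train the meta-learner on out-of-fold rows, then refit the base models on all data."""
    meta = ClassMeanMeta().fit(out_of_fold_rows(factories, docs, labels, k), labels)
    bases = [f().fit(docs, labels) for f in factories]
    return bases, meta

# Hypothetical, pre-tokenised toy reviews (not the paper's dataset).
DOCS = [s.split() for s in [
    "queue long ticket expensive", "dirty crowded bad service", "terrible food rude staff",
    "park opens at nine", "ticket sold at gate", "bus stops near entrance",
    "beautiful view friendly staff", "great scenery worth visit", "lovely lake clean park",
]]
Y = ["negative"] * 3 + ["neutral"] * 3 + ["positive"] * 3

bases, meta = fit_stacking(DOCS, Y)
prediction = meta.predict(meta_row(bases, "friendly staff great view".split()))
print(prediction)
```

The point of the out-of-fold step is that the final-layer classifier is trained only on base-model scores for reviews those models did not see during fitting, which keeps it from simply learning the base models' training-set overconfidence. In a realistic setting the stand-ins would presumably be replaced by the models the abstract names.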
Authors
周青松
范兴容
Zhou Qingsong; Fan Xingrong (School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; Institute of Electronic Information and Network Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China)
Source
《无线互联科技》
2018, No. 24, pp. 63-65 (3 pages)
Wireless Internet Technology
Funding
Supported by the Natural Science Foundation of Chongqing
Project No.: cstc2018jcyjAX0587