Abstract
Traditional sentiment classification requires hand-annotated sentiment features, ignores the deep semantic relations among words in a corpus, and cannot classify sentences whose sentiment is implicit. To address these shortcomings, a deep-learning-based algorithm for implicit sentiment classification of Uyghur sentences is proposed. A vector representation of each word is obtained with word2vec to capture the deep semantic relations among words in the corpus. Word vectors are linearly combined with part-of-speech features to produce sentence vectors. A stacked autoencoder then learns features automatically from large-scale unlabeled implicit-sentiment vector representations, and a softmax classifier performs the sentence classification. The method achieves an accuracy above 90%, which confirms the effectiveness of deep learning for implicit sentiment classification.
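As a rough illustration of the sentence-representation step, the sketch below combines word vectors linearly, weighting each word by its part-of-speech tag, and then normalizes a score vector with softmax. The abstract does not specify the exact form of the linear combination, so the weighted average used here, the names `sentence_vector`, `pos_weight`, and `word_vecs`, and all numeric values are assumptions made for illustration. The stacked autoencoder stage is omitted.

```python
import math

def sentence_vector(tokens, pos_tags, word_vecs, pos_weight):
    """Weighted average of word vectors; each weight comes from the token's POS tag.

    This is one possible reading of "linear combination of word vectors and
    part-of-speech features", not the paper's confirmed formula.
    """
    dim = len(next(iter(word_vecs.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok, tag in zip(tokens, pos_tags):
        vec = word_vecs.get(tok)
        if vec is None:              # skip out-of-vocabulary tokens
            continue
        w = pos_weight.get(tag, 1.0)  # unknown tags get a neutral weight
        for i, x in enumerate(vec):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0.0:
        return total                 # no known tokens: return the zero vector
    return [x / weight_sum for x in total]

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Toy, made-up values; real word vectors would come from a trained word2vec model.
word_vecs = {"w1": [1.0, 0.0], "w2": [0.0, 1.0]}
pos_weight = {"ADJ": 3.0, "NOUN": 1.0}
vec = sentence_vector(["w1", "w2"], ["ADJ", "NOUN"], word_vecs, pos_weight)

# In the described pipeline the class scores would come from features learned
# by the stacked autoencoder; the raw sentence vector stands in for them here.
probs = softmax(vec)
```

In this toy setup the adjective is weighted three times as heavily as the noun, so the sentence vector leans toward the adjective's direction.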
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal
2016, No. 9, pp. 2577-2580, F0003 (5 pages)
Funding
National Natural Science Foundation of China (61563051, 61262064, 60963017)
Key Program of the National Natural Science Foundation of China (61331011)
Keywords
Uyghur
implicit sentiment
word embedding
deep learning
stacked autoencoder