Abstract
Text sentiment analysis is one of the active research topics in natural language processing, and the lexicon is its foundation. Because Chinese characters convey meaning through both sound and written form, this paper builds a sentiment word classification model that takes into account the radical and phonemes of every character in a word. The model vectorizes three kinds of information (characters, radicals, and phonemes), fuses them with the original word vector to produce a new representation of the sentiment word, and then classifies the word's polarity with a feedforward neural network and a convolutional neural network. Experimental results show that all three fine-grained features effectively improve sentiment word classification, and the model's effectiveness is further validated on the COAE evaluation corpus.
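The fusion step described in the abstract can be illustrated with a small sketch. The abstract does not say how the four vectors are combined or how the networks are configured, so everything below is an assumption made for illustration: fusion is done by concatenation, per-character features are mean-pooled, pinyin is split into an initial and a final to stand in for phonemes, the embedding size `DIM` and the layer sizes are arbitrary, and the weights are random rather than trained. The helper names (`make_table`, `fuse`, `classify`) are hypothetical and not taken from the paper.

```python
import math
import random

DIM = 4  # hypothetical embedding size, chosen only for illustration
rng = random.Random(0)

def make_table(keys, dim=DIM):
    """Randomly initialised lookup table standing in for trained embeddings."""
    return {k: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for k in keys}

def mean_vector(vectors, dim=DIM):
    """Average a list of vectors; return zeros if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

# Toy annotations for the word 悲伤 ("sad").
# Assumption: each character's pinyin is split into an initial and a final.
RADICAL_OF = {"悲": "心", "伤": "亻"}
PHONEMES_OF = {"悲": ["b", "ei"], "伤": ["sh", "ang"]}

word_emb = make_table(["悲伤"])
char_emb = make_table(["悲", "伤"])
radical_emb = make_table(["心", "亻"])
phoneme_emb = make_table(["b", "ei", "sh", "ang"])

def fuse(word):
    """Concatenate the word vector with pooled character, radical and phoneme vectors."""
    zero = [0.0] * DIM
    word_vec = word_emb.get(word, zero)
    char_vec = mean_vector([char_emb.get(c, zero) for c in word])
    radical_vec = mean_vector(
        [radical_emb.get(RADICAL_OF.get(c), zero) for c in word]
    )
    phoneme_vec = mean_vector(
        [phoneme_emb.get(p, zero) for c in word for p in PHONEMES_OF.get(c, [])]
    )
    return word_vec + char_vec + radical_vec + phoneme_vec  # list concatenation

def dense(x, weights, biases):
    """Affine layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def random_layer(n_out, n_in):
    """Untrained layer: random weights, zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def classify(x, hidden_layer, output_layer):
    """One-hidden-layer feedforward pass returning two polarity probabilities."""
    hidden = [math.tanh(h) for h in dense(x, *hidden_layer)]
    return softmax(dense(hidden, *output_layer))

fused = fuse("悲伤")
hidden_layer = random_layer(8, len(fused))
output_layer = random_layer(2, 8)
probs = classify(fused, hidden_layer, output_layer)
print(len(fused), probs)
```

Because the weights are untrained, the printed probabilities carry no meaning; the sketch only shows how the three fine-grained features could enlarge the word representation before it reaches a classifier. A convolutional variant would instead keep the per-character vectors as a sequence rather than pooling them.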
Authors
徐琳宏
林鸿飞
祁瑞华
关菁华
XU Linhong; LIN Hongfei; QI Ruihua; GUAN Jinghua (School of Software, Dalian University of Foreign Languages, Dalian, Liaoning 116044, China; School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China)
Source
《中文信息学报》
CSCD
PKU Core Journal
2018, No. 6, pp. 124-131 (8 pages)
Journal of Chinese Information Processing
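Funding
National Social Science Fund of China (15BYY028)
Natural Science Foundation of Liaoning Province (2015020017, 20170540230, 20170540232)
Liaoning Province Outstanding Talents Project (LJQ2014127)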
Keywords
radical
phoneme
neural network