
Research on Tibetan Micro-blog Affective Computation Based on Deep Learning Algorithm
Cited by: 6
Abstract: For the study of Tibetan text affective computation, the CNN-LSTM deep learning model is introduced into Tibetan micro-blog sentiment computation, which addresses the lack of natural language processing research on minority languages and helps advance Tibetan-language research. Because no public Tibetan corpus is available, a Tibetan synonym/antonym sentiment dictionary is used to replace the synonyms and antonyms of sentiment words in an annotated Tibetan micro-blog corpus, further expanding the corpus to meet the large-scale data requirements of deep learning. After Tibetan micro-blog word segmentation, the Word2vec tool is used to train a Tibetan micro-blog word vector model, improving the ability of the feature vectors to express the deep semantic information of the text. Then, the trained word vectors and the corresponding sentiment labels are fed into a CNN-LSTM model consisting of a convolutional layer, a pooling layer, an LSTM layer, and a fully connected layer, with the output of each layer batch-normalized. Finally, a Softmax classifier is used to classify the sentiment orientation of the Tibetan micro-blogs, and experimental comparisons are made with LSTM and a traditional sentiment dictionary approach. The results show that the proposed algorithm achieves better classification performance.
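The pipeline described in the abstract (corpus expansion with a sentiment dictionary, Word2vec word vectors, and a CNN-LSTM classifier with batch normalization and a Softmax output) can be illustrated with a minimal sketch. This assumes a gensim + Keras/TensorFlow implementation; the augmentation helper, the placeholder corpus, all layer sizes, the embedding dimension, and the two-class output are illustrative assumptions, not details taken from the paper.

```python
# A minimal sketch of the pipeline described in the abstract; layer sizes,
# vector dimensions, the class count, and all helper names are assumptions.
import numpy as np
from gensim.models import Word2Vec
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (Embedding, Conv1D, MaxPooling1D,
                                     BatchNormalization, LSTM, Dense)

# 1. Corpus expansion: replace sentiment words with dictionary entries so each
#    labelled micro-blog yields extra samples (hypothetical helper).
def augment_by_synonyms(tokens, synonym_dict):
    augmented = []
    for i, tok in enumerate(tokens):
        for syn in synonym_dict.get(tok, []):
            augmented.append(tokens[:i] + [syn] + tokens[i + 1:])
    return augmented

# 2. Word2vec training on the segmented (and expanded) micro-blog corpus.
segmented_posts = [["tok_a", "tok_b"], ["tok_c", "tok_d"]]  # placeholder corpus
w2v = Word2Vec(sentences=segmented_posts, vector_size=128, window=5, min_count=1)

# 3. Embedding matrix so the pretrained word vectors initialise the network.
vocab = w2v.wv.index_to_key
embedding_matrix = np.vstack([w2v.wv[w] for w in vocab])

# 4. CNN-LSTM: convolution + pooling extract local n-gram features, the LSTM
#    models sequence dependencies, each stage is batch-normalised, and a
#    Softmax layer outputs the sentiment orientation.
model = Sequential([
    Embedding(input_dim=len(vocab), output_dim=128,
              weights=[embedding_matrix], trainable=False),
    Conv1D(filters=64, kernel_size=3, activation="relu"),
    BatchNormalization(),
    MaxPooling1D(pool_size=2),
    LSTM(64),
    BatchNormalization(),
    Dense(64, activation="relu"),
    Dense(2, activation="softmax"),  # class count is an assumption
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```

In practice, the expanded and segmented micro-blog corpus and its sentiment labels would be converted to padded index sequences before calling model.fit.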
Authors: SUN Ben-wang (孙本旺), TIAN Fang (田芳) (Department of Computer Technology and Applications, Qinghai University, Xining 810016, China; Information Technology Center, Qinghai University, Xining 810016, China)
Source: Computer Technology and Development (《计算机技术与发展》), 2019, No. 10, pp. 55-58, 99 (5 pages)
Funding: National Natural Science Foundation of China (61461045); Qinghai Province Science and Technology Program (2016-ZJ-743)
Keywords: deep learning; Tibetan micro-blog; word vector; affective computation

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部