
Improving sentiment analysis accuracy with emoji embedding

Abstract: Because Chinese syntax and semantics are diverse and variable, accurately identifying and distinguishing individual emotions in online texts is challenging. To overcome this limitation, we incorporate a new source of individual sentiment, emojis, which comprise thousands of graphic symbols and are increasingly used to express emotion in online conversations. We examined popular sentiment analysis algorithms, including rule-based and classification algorithms, to evaluate how adding emojis as extra features affects algorithm performance. Emojis were also translated into corresponding sentiment words when constructing features, for comparison with features generated directly from emoji label words. In addition, considering the different functions of emojis in texts, we classified all posts in the dataset by their emoji usage and examined the resulting changes in algorithm performance. We found that emojis are effective as expanded features for improving the accuracy of sentiment analysis algorithms, and that performance can be further increased by taking different emoji usages into consideration. In this study, we developed an improved emoji-embedding model based on Bi-LSTM (namely, CEmo-LSTM), which achieves the highest accuracy (around 0.95) when analyzing online Chinese texts. We applied the CEmo-LSTM algorithm to a large dataset collected from Weibo between December 1, 2019 and March 20, 2020 to understand how the sentiment of online users evolved during the COVID-19 pandemic. We found that the pandemic remarkably affected individual sentiments and caused more passive emotions (e.g., horror and sadness). Our emoji-embedding algorithm combines emojis and emoji usage with the sentiment analysis model and can handle emotion mining tasks more effectively and efficiently.
Source: Journal of Safety Science and Resilience (安全科学与韧性, English edition), CSCD, 2021, Issue 4, pp. 246-252 (7 pages)
Funding: the National Natural Science Foundation of China (82041020, 72088101, 91846301); XL acknowledges support from the National Natural Science Foundation of China (72025405, 71771213), the Hunan Science and Technology Plan Project (2020JJ4673, 2020TP1013); JL was supported by the National Natural Science Foundation of China (61773248), the Major Program of National Fund of Philosophy and Social Science of China (20ZDA060); TC and XT were supported by the Shenzhen Basic Research Project for Development of Science and Technology (JCYJ20200109141218676).
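The abstract contrasts two ways of building emoji features: using the emoji label words directly, or first translating each emoji into a corresponding sentiment word. The following is a minimal sketch of that contrast for a rule-based scorer, not the authors' implementation. The bracketed emoji format, the two toy lexicons, and the function names `build_features` and `lexicon_score` are all assumptions made for illustration, and whitespace tokenization stands in for a real Chinese word segmenter.

```python
import re

# Hypothetical toy lexicons for illustration only; not taken from the paper.
SENTIMENT_LEXICON = {"happy": 1.0, "good": 1.0, "sad": -1.0, "scared": -1.0}
EMOJI_TO_WORD = {"[laugh]": "happy", "[cry]": "sad", "[fear]": "scared"}

# Assumption: emojis appear in the text as bracketed labels such as "[cry]".
EMOJI_PATTERN = re.compile(r"\[[^\[\]\s]+\]")


def split_text_and_emojis(post):
    """Separate plain words from bracketed emoji labels."""
    emojis = EMOJI_PATTERN.findall(post)
    text = EMOJI_PATTERN.sub(" ", post)
    # Whitespace split is a stand-in for a proper word segmenter.
    return text.split(), emojis


def build_features(post, translate_emojis=True):
    """Return the token list, extended with emoji-derived features.

    translate_emojis=True  -> each known emoji becomes a sentiment word
    translate_emojis=False -> the raw emoji label is kept as its own token
    """
    words, emojis = split_text_and_emojis(post)
    if translate_emojis:
        extra = [EMOJI_TO_WORD[e] for e in emojis if e in EMOJI_TO_WORD]
    else:
        extra = emojis
    return words + extra


def lexicon_score(tokens):
    """Rule-based score: sum of lexicon polarities over the tokens."""
    return sum(SENTIMENT_LEXICON.get(t, 0.0) for t in tokens)


if __name__ == "__main__":
    post = "the news today [cry]"
    print(lexicon_score(build_features(post, translate_emojis=True)))
    print(lexicon_score(build_features(post, translate_emojis=False)))
```

In this toy setup the post carries no sentiment words of its own, so the rule-based score only moves away from zero when the emoji is translated into a lexicon word; keeping the raw label would instead suit a classifier that learns a weight or embedding for the emoji token.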