Multi-Class Sentiment Analysis of Social Media Data with Machine Learning Algorithms

Abstract: The volume of social media data on the Internet is constantly growing, which has created a substantial research field for data analysts. The diversity of articles, posts, and comments on news websites and social networks is vast. Nevertheless, most researchers focus on Twitter posts, which have a specific format and length restriction, and the majority of those posts are written in English. As relatively few works have addressed sentiment analysis in the Russian and Kazakh languages, this article analyzes news posts in the Kazakhstan media space. The collected datasets contain texts labeled with three sentiment classes: positive, negative, and neutral. The datasets are highly imbalanced, with a significant predominance of the positive class. To deal with this issue, three resampling techniques are applied: undersampling, oversampling, and the synthetic minority oversampling technique (SMOTE). The texts are then vectorized with the TF-IDF metric and classified with seven machine learning (ML) algorithms: naïve Bayes, support vector machine, logistic regression, k-nearest neighbors, decision tree, random forest, and XGBoost. Experimental results show that oversampling and SMOTE combined with logistic regression, decision tree, and random forest achieve the best classification scores. These models are employed in the developed social analytics platform.
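The two simplest resampling strategies named in the abstract, random oversampling and random undersampling, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `random_resample`, the toy posts, and the class counts are all hypothetical, and SMOTE (which synthesizes new minority samples by interpolating between feature vectors) is not shown.

```python
import random
from collections import Counter, defaultdict


def random_resample(texts, labels, strategy="over", seed=0):
    """Balance classes by random over- or undersampling (hypothetical helper).

    strategy="over":  duplicate minority samples up to the largest class size.
    strategy="under": discard samples down to the smallest class size.
    """
    if strategy not in ("over", "under"):
        raise ValueError("strategy must be 'over' or 'under'")
    rng = random.Random(seed)

    # Group the texts by their sentiment label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    sizes = [len(items) for items in by_class.values()]
    target = max(sizes) if strategy == "over" else min(sizes)

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        if len(items) >= target:
            # Draw without replacement (keeps everything when sizes match).
            chosen = rng.sample(items, target)
        else:
            # Keep all originals and add random duplicates.
            chosen = items + rng.choices(items, k=target - len(items))
        out_texts.extend(chosen)
        out_labels.extend([label] * target)
    return out_texts, out_labels


# Made-up toy data, skewed toward the positive class as in the abstract.
texts = [f"post {i}" for i in range(10)]
labels = ["positive"] * 7 + ["negative"] * 2 + ["neutral"]

over_texts, over_labels = random_resample(texts, labels, "over")
under_texts, under_labels = random_resample(texts, labels, "under")
print(Counter(over_labels))   # every class now has 7 samples
print(Counter(under_labels))  # every class now has 1 sample
```

In practice, resampling is normally applied to the training split only, so that duplicated or synthetic samples do not leak into the evaluation data.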

Authors: Galimkair Mutanov, Vladislav Karyukin, Zhanl Mamykova

Affiliation: Al-Farabi Kazakh National University

Source: Computers, Materials & Continua (SCIE, EI), 2021, Issue 10, pp. 913-930 (18 pages); English-language journal

Keywords: social media; sentiment analysis; imbalanced classes; machine learning; oversampling; undersampling; SMOTE; Russian; Kazakh

Classification code: TP1 [Automation and Computer Technology: Control Theory and Control Engineering]
