Abstract
To understand what Sina Weibo users in China discussed and how their sentiment evolved during the 2020 COVID-19 epidemic, this study analyzes texts that users posted under epidemic-related topics. First, the data are preprocessed, part of the texts are manually annotated, and the TextRank algorithm is used to extract keywords from the annotated texts. Next, three traditional machine learning methods (support vector machine, logistic regression, and random forest) and two neural network methods (a text convolutional neural network and a bidirectional long short-term memory network) are each applied to text sentiment classification, and their classification performance is compared. Finally, changes in the user sentiment index are analyzed based on the labels produced by the best-performing classifier, and a control chart for monitoring Weibo public opinion is designed using statistical process control.
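The monitoring step can be illustrated with a minimal sketch. The abstract does not say how the sentiment index is defined or which control chart is used, so the example below assumes the index is the daily share of posts labeled negative and uses a standard Shewhart p-chart with 3-sigma limits. The function name `p_chart` and the daily counts are hypothetical and are not taken from the paper.

```python
from math import sqrt


def p_chart(negative_counts, totals, sigma=3.0):
    """Shewhart p-chart for a daily share of negative posts.

    Assumption: the monitored statistic is negative_count / total per day.
    Returns the center line and one (share, lcl, ucl, out_of_control)
    tuple per day.
    """
    if not totals or len(negative_counts) != len(totals):
        raise ValueError("need equally long, non-empty count lists")

    # Center line: pooled share of negative posts over all days.
    p_bar = sum(negative_counts) / sum(totals)

    rows = []
    for neg, n in zip(negative_counts, totals):
        # Limits depend on the day's sample size n.
        half_width = sigma * sqrt(p_bar * (1.0 - p_bar) / n)
        lcl = max(0.0, p_bar - half_width)
        ucl = min(1.0, p_bar + half_width)
        share = neg / n
        rows.append((share, lcl, ucl, share < lcl or share > ucl))
    return p_bar, rows


# Made-up daily counts, for illustration only.
negative = [30, 28, 35, 31, 70, 29]
totals = [100, 100, 100, 100, 100, 100]

center, rows = p_chart(negative, totals)
flagged = [day for day, row in enumerate(rows) if row[3]]
print(f"center line = {center:.3f}, flagged days = {flagged}")
```

In this toy data only the day with 70 negative posts out of 100 falls outside the limits. In practice the limits would normally be estimated from a period judged to be in control, rather than from the same days that are being monitored.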
Authors
顾彩慧
周勤
Gu Caihui; Zhou Qin (School of Mathematics & Statistics, Jiangsu Normal University, Xuzhou 221116, Jiangsu, China; School of Mathematics & Physics, Yancheng Institute of Technology, Yancheng 224000, Jiangsu, China)
Source
《江苏师范大学学报(自然科学版)》
CAS
2021, No. 4, pp. 41-45 (5 pages)
Journal of Jiangsu Normal University: Natural Science Edition
Funding
National Natural Science Foundation of China (Grant No. 11671178).