Abstract
Predicting the risk and evolutionary trend of topics amid the massive volume of information on the Internet is a priority for public opinion supervisory departments. Existing research on topic evolution prediction relies on a single method of topic state classification and pays little attention to predicting how topic states evolve. To address these gaps, this paper proposes a topic risk state prediction method from the perspective of topic early warning, aiming to give supervisory departments a basis for early warning. First, the method divides topics into risk states of different levels based on two indexes, centrality and density, to characterize intuitively how likely a topic is to trigger a public opinion crisis. Second, a hidden Markov model (HMM) is built for each risk state and trained on the observation sequences corresponding to that state. Finally, the best model is selected according to the maximum likelihood criterion and used to predict the topic's observed value, which the plane coordinate mapping method then converts into the topic's risk state at a future moment. The COVID-19 epidemic is taken as the sample topic to verify the effectiveness of the HMM-based method: the average prediction accuracy under cross-validation exceeds 90%, and the prediction error is lower than that of the BP neural network, LSTM, and RNN time series prediction models.
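The pipeline summarized in the abstract (label states from centrality and density, score a sequence against one HMM per state, keep the most likely model) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes discrete observation symbols, a non-empty sequence, and hand-written model parameters rather than trained ones. The function names, quadrant labels, and thresholds are hypothetical; the abstract does not say how quadrants map to risk levels.

```python
import math


def quadrant_state(centrality, density, c_threshold, d_threshold):
    """Map a (centrality, density) point to one of four quadrant labels.

    Labels and thresholds are illustrative assumptions, not the paper's.
    """
    high_c = centrality >= c_threshold
    high_d = density >= d_threshold
    if high_c and high_d:
        return "high-centrality/high-density"
    if high_c:
        return "high-centrality/low-density"
    if high_d:
        return "low-centrality/high-density"
    return "low-centrality/low-density"


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    start[i]    : probability of starting in hidden state i
    trans[i][j] : probability of moving from hidden state i to j
    emit[i][k]  : probability of emitting symbol k in hidden state i
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step, then weight by the emission probability.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow and accumulate the log of the scale.
        alpha = [a / scale for a in alpha]
        log_p += math.log(scale)
    return log_p


def select_best_model(obs, models):
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical, hand-written models for two risk states.
models = {
    "low": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]]),
    "high": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.2, 0.8]]),
}
best = select_best_model([0, 0, 1, 0], models)
```

In practice each model's parameters would be estimated from the observation sequences of its risk state (for example with Baum-Welch), and continuous observations would call for Gaussian rather than discrete emissions.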
Authors
CAI Ting-ting (蔡婷婷)
ZHU Heng-min (朱恒民)
WEI Jing (魏静)
Affiliations: School of Management, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; Jiangsu University Philosophy and Social Science Key Research Base: Information Industry Integration Innovation and Emergency Management Research Center, Nanjing 210003, China
Source
Computer Technology and Development (《计算机技术与发展》)
2023, No. 5, pp. 29-34 (6 pages)
Funding
National Natural Science Foundation of China (71874088, 71704085)
Jiangsu Province Postgraduate Research and Practice Innovation Program (KYCX21_0835)