Abstract
Building a complete Internet public opinion indicator system from Internet media data is the basis for public opinion prediction and evaluation and for cyberspace governance. However, data conflicts, incomplete data, calculation errors, labeling mistakes, and similar problems seriously reduce the credibility of some indicators. This paper divides public opinion indicators into two categories according to their credibility, and combines multivariable data fitting, principal component analysis (PCA), multi-output neural networks, and related techniques with a data-type-based indicator evaluation method to derive low-credibility indicators from high-credibility ones. A gender judgment experiment and a user follower count experiment are carried out on Sina Weibo user data. The results show that the derived gender reaches an accuracy of 96.7% and that the relative absolute error (RAE) of the user follower count is 16%, which indicates that the proposed method can build a highly credible public opinion indicator system and lays a foundation for the construction and quantitative study of such systems.
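The abstract reports two evaluation figures, a classification accuracy for gender and a relative absolute error (RAE) for follower counts, without giving their formulas. The sketch below shows how such metrics are commonly computed. It assumes the standard RAE definition (total absolute error divided by the total absolute deviation of the true values from their mean); the record does not confirm that the paper uses this exact form. The function names and the sample values are hypothetical and are not data from the paper.

```python
from statistics import mean


def relative_absolute_error(actual, predicted):
    """Standard RAE (assumed definition, not confirmed by the source):
    sum(|p_i - a_i|) / sum(|a_i - mean(a)|)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    baseline = mean(actual)
    # Error of a trivial predictor that always outputs the mean.
    denom = sum(abs(a - baseline) for a in actual)
    if denom == 0:
        raise ValueError("RAE is undefined when all actual values are equal")
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / denom


def accuracy(actual, predicted):
    """Fraction of labels predicted exactly right."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical illustration values only.
followers_true = [100, 200, 300, 400]
followers_pred = [110, 190, 320, 380]
rae = relative_absolute_error(followers_true, followers_pred)

gender_true = ["F", "M", "F", "M"]
gender_pred = ["F", "M", "M", "M"]
acc = accuracy(gender_true, gender_pred)

print(f"RAE = {rae:.2%}, accuracy = {acc:.2%}")
```

Under this definition, an RAE below 100% means the predictions beat a baseline that always guesses the mean follower count.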
Authors
Chen Juan (陈娟)
Wang Gongming (王功明)
Xu Yilong (徐翼龙)
Wang Haiwei (王海威)
Chen Juan; Wang Gongming; Xu Yilong; Wang Haiwei (School of Journalism and Communication, Peking University, Beijing 100190; Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101; Smart City College, Beijing Union University, Beijing 100101; Military Commission Logistics Support Information Center, Beijing 100842)
Source
《高技术通讯》
EI
CAS
PKU Core Journals (北大核心)
2019, Issue 1, pp. 19-26 (8 pages)
Chinese High Technology Letters
Funding
Supported by the National Natural Science Foundation of China (61502475, 61841601)
Keywords
public opinion indicator system (舆情指标体系)
credibility (可信度)
indicator fitting (指标拟合)
principal component analysis, PCA (主成分分析)
multi-output neural network (多输出神经网络)