Abstract
Online public opinion is highly time-sensitive, spreads rapidly, covers broad and heterogeneous subject matter, and shows a pronounced opinion orientation and a marked tendency toward pan-entertainment. To handle such data, this paper proposes applying the TF-IDF algorithm to the LDA input data in order to strengthen feature-word selection. Taking the "Notre-Dame de Paris fire" event as a case, Weibo (microblog) data, an important source of online public opinion, were collected and modeled with LDA. Introducing TF-IDF for feature-word selection allows the topic distribution of the event to be analyzed with relatively good accuracy.
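The feature-word selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring rule (maximum TF-IDF per term), the `top_k` cutoff, the function name `tfidf_filter`, and the toy English tokens are all assumptions made for illustration. Real Weibo posts would first need Chinese word segmentation and stop-word removal, and the filtered token lists would then be passed to an LDA implementation.

```python
import math
from collections import Counter


def tfidf_filter(docs, top_k):
    """Keep only the top_k vocabulary terms ranked by TF-IDF.

    docs  : list of token lists (assumed already segmented and cleaned).
    top_k : hypothetical cutoff, the number of feature words to keep.
    Returns (filtered_docs, kept_terms).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))

    # Score each term by its highest TF-IDF value in any document
    # (an assumed aggregation; the source does not specify one).
    best = {}
    for doc in docs:
        if not doc:
            continue
        for term, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log(n_docs / df[term])
            best[term] = max(best.get(term, 0.0), tf * idf)

    # Rank by score (descending), breaking ties alphabetically.
    ranked = sorted(best, key=lambda t: (-best[t], t))
    kept = set(ranked[:top_k])
    filtered = [[t for t in doc if t in kept] for doc in docs]
    return filtered, kept


# Toy corpus standing in for segmented microblog posts (invented data).
docs = [
    ["notre", "dame", "fire", "spire", "collapse"],
    ["notre", "dame", "fire", "donation", "rebuild"],
    ["notre", "dame", "fire", "history", "heritage"],
]
filtered_docs, kept_terms = tfidf_filter(docs, top_k=6)
```

In this toy corpus the words shared by every post ("notre", "dame", "fire") receive an IDF of zero and are dropped, so only the words that distinguish one post from another remain as LDA input. That is the intuition behind filtering a single-event corpus, where the event's own name appears almost everywhere and contributes little to separating topics.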
Authors
Cheng Xiaogang (程小刚)
An Mengjia (安梦佳)
Guo Ren (郭韧)
Affiliations: College of Computer Sciences and Technology, Huaqiao University, Xiamen, Fujian 361021, China; College of Business Administration, Huaqiao University
Source
Computer Era (《计算机时代》)
2020, No. 5, pp. 30-33, 37 (5 pages in total)
Funding
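Project of the Fujian Provincial Research Center for the Theoretical System of Socialism with Chinese Characteristics (2019ZTD24)
Quanzhou Social Science Planning Project (2019D03)
Huaqiao University Experimental Project (Z17X0143)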