Research on Methods for Identifying Online Food Safety Rumor-Related Documents Based on Semantic Co-occurrence Matching

Original title (Chinese): 基于语义共现匹配的在线食品安全谣言相关文档识别方法研究

Cited by: 5

Abstract: [Purpose/significance] This paper designs a method for identifying online documents related to food safety rumors, with the aim of making manual review more efficient and of reducing the harm caused by the spread of such rumors. [Method/process] The method exploits the difference in how the words of a document to be classified are distributed across two feature vector libraries, one built from text related to online food safety rumors and one built from unrelated text. On this basis the paper designs an unsupervised document feature similarity method and a supervised regression method for identifying rumor-related documents. [Result/conclusion] The unsupervised RM-Sort method identifies online food safety rumor-related documents effectively and outperforms the existing naive Bayes, decision tree, and support vector machine methods. The supervised RM-LR method performs better still. [Limitations] The models can only determine whether a document is related to a food safety rumor; they cannot determine whether the document is a rumor-refuting article or the rumor itself.
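The abstract gives only the general idea of the unsupervised approach, so the following is a minimal sketch under stated assumptions, not the paper's RM-Sort implementation. It assumes each library is a pooled relative-frequency word vector, that similarity is cosine similarity, and that documents are ranked by the difference between their similarity to the related library and to the unrelated library. All function names and the toy tokens are hypothetical.

```python
import math
from collections import Counter


def build_library(tokenized_docs):
    """Pool tokenized documents into one relative-frequency word vector."""
    counts = Counter(word for doc in tokenized_docs for word in doc)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(word, 0.0) for word, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relevance_score(doc, related_lib, unrelated_lib):
    """Assumed score: similarity to the related library minus
    similarity to the unrelated library."""
    vec = build_library([doc])
    return cosine(vec, related_lib) - cosine(vec, unrelated_lib)


def rank_documents(docs, related_lib, unrelated_lib):
    """Sort documents from most to least rumor-related under the assumed score."""
    return sorted(
        docs,
        key=lambda d: relevance_score(d, related_lib, unrelated_lib),
        reverse=True,
    )


# Hypothetical toy data, already tokenized; not taken from the paper.
RELATED_LIB = build_library([
    ["pork", "toxic", "rumor", "additive"],
    ["milk", "toxic", "rumor", "cancer"],
])
UNRELATED_LIB = build_library([
    ["football", "match", "score"],
    ["stock", "market", "price"],
])
DOC_A = ["milk", "additive", "rumor"]
DOC_B = ["football", "price"]

if __name__ == "__main__":
    for doc in rank_documents([DOC_B, DOC_A], RELATED_LIB, UNRELATED_LIB):
        print(doc, relevance_score(doc, RELATED_LIB, UNRELATED_LIB))
```

In a supervised variant, the two similarity values could serve as input features to a regression model trained on labeled documents. Whether RM-LR works this way is not stated in the abstract.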

Authors: 陈燕方 (Chen Yanfang), 周晓英 (Zhou Xiaoying), 张璐 (Zhang Lu)

Affiliation: School of Information Resource Management, Renmin University of China

Source: Information Studies: Theory & Application (《情报理论与实践》; indexed in CSSCI and the Peking University Core Journals list), 2018, No. 6, pp. 130-136, 142 (8 pages)

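Funding: National Natural Science Foundation of China project "Research on Information Credibility and Quality Control of Medical and Health Websites" (Grant No. 71473260); National Social Science Fund of China project "Research on National Health Promotion and Health Service Strategies in Building a Healthy China" (Grant No. 16AZD021); Renmin University of China 2017 Outstanding Innovative Talents Cultivation Funding Program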

Keywords: rumor spreading; food safety; word vector; distributional characteristics

Classification code: G206 [Cultural Sciences: Communication Studies]
