Rumor Detection of Public Health Emergencies Based on Data Augmentation and Multi-Task Learning (cited by: 1)
Abstract [Objective] This paper introduces a multi-task learning model and a data augmentation method to address imbalanced data and scarce labeled data in rumor detection during public health emergencies. [Methods] First, we extracted text features of public health emergency rumors to build a replacement word list and, using an extended synonym table, constructed the CEDA method to augment the imbalanced rumor dataset. We then built a multi-task learning model that integrates domain information from the sentiment classification and rumor detection tasks for public health emergencies: a Transformer captures the shared features, and a BiLSTM model captures the features specific to the rumor detection task, improving the accuracy of rumor detection. [Results] The proposed multi-task learning model achieved an F1 value of 0.972, which is 0.006 higher than the model trained on the imbalanced dataset and 0.007 higher than the single-task learning model. Compared with the DC-CNN model, the F1 value improved by 0.024. [Limitations] The auxiliary task of the multi-task learning model covers only binary sentiment classification; negative sentiment needs to be classified at a finer granularity. [Conclusions] Combining domain-specific data augmentation with multi-task learning effectively improves classification performance for rumor detection in public health emergencies.
Authors Zeng Ziming (曾子明); Zhang Yu (张瑜) (School of Information Management, Wuhan University, Wuhan 430072, China)
Source Data Analysis and Knowledge Discovery (《数据分析与知识发现》); indexed in EI, CSCD, and the Peking University Core Journals list; 2023, No. 11, pp. 56-67 (12 pages)
Funding This work is one of the research outputs of a National Social Science Fund of China project (Grant No. 21BTQ046).
Keywords Public Health Emergencies; Rumor Detection; Data Augmentation; Multi-Task Learning; Shared Transformer
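The abstract describes augmenting an imbalanced rumor dataset by replacing words using a synonym table. The record does not give the details of CEDA, so the following is only a minimal sketch of generic synonym-replacement augmentation, not the authors' method. The synonym table, the label convention (1 = rumor, 0 = non-rumor), the choice of rumors as the minority class, and the function names are all assumptions made for illustration.

```python
import random

# Hypothetical domain synonym table (an assumption; the paper's actual
# replacement word list is not reproduced in this record).
SYNONYMS = {
    "virus": ["pathogen", "germ"],
    "cure": ["remedy", "treatment"],
    "spreads": ["transmits", "circulates"],
}


def replace_synonyms(tokens, table, n_replace, rng):
    """Return a copy of tokens with up to n_replace words swapped for a synonym."""
    out = list(tokens)
    # Positions whose token has at least one synonym in the table.
    candidates = [i for i, tok in enumerate(out) if tok in table]
    rng.shuffle(candidates)
    for i in candidates[:n_replace]:
        out[i] = rng.choice(table[out[i]])
    return out


def balance_by_augmentation(samples, table, minority_label, seed=0):
    """Append augmented minority-class samples until that class matches the rest in size.

    samples is a list of (tokens, label) pairs. The originals are kept unchanged.
    """
    rng = random.Random(seed)
    minority = [s for s in samples if s[1] == minority_label]
    others = [s for s in samples if s[1] != minority_label]
    augmented = list(samples)
    if not minority:
        return augmented  # nothing to augment from
    needed = max(len(others) - len(minority), 0)
    for k in range(needed):
        # Cycle through the minority samples and perturb one word in each copy.
        tokens, label = minority[k % len(minority)]
        augmented.append((replace_synonyms(tokens, table, 1, rng), label))
    return augmented


# Toy data: three non-rumor texts (label 0) and one rumor (label 1).
data = [
    (["officials", "confirm", "new", "cases"], 0),
    (["hospital", "opens", "new", "ward"], 0),
    (["vaccine", "trial", "begins"], 0),
    (["garlic", "is", "a", "cure", "for", "the", "virus"], 1),
]
balanced = balance_by_augmentation(data, SYNONYMS, minority_label=1)
```

Passing a fixed `seed` keeps the augmented set reproducible across runs. Replacing one word per copy keeps the new samples close to the originals; a domain-specific table like the one the abstract mentions would take the place of `SYNONYMS`.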