Abstract
Social media platforms offer a convenient way for people to share information, express opinions, and exchange ideas in their daily lives. As the number of users grows, a huge volume of information is generated on these platforms. However, because users rarely verify the reliability of what they share, the authenticity of this information is hard to guarantee, and large amounts of fake news spread widely on social media. Existing detection methods suffer from the following limitations: 1) most methods simply extract textual and visual features and concatenate them into multimodal features for fake news detection, ignoring the fine-grained intrinsic connections within and between modalities and lacking the retrieval and filtering of key information; 2) feature extraction across modalities lacks guidance, and there is insufficient interaction and enhancement between textual and visual features, leading to an inadequate understanding of the multimodal content. To address these challenges, a novel Multimodal Dual-Collaborative Gather Transformer Network (MDCGTN) is proposed. In the MDCGTN model, textual and visual feature representations are extracted by a text-visual encoding network and fed into a multimodal gather transformer network for multimodal information fusion, where a gathering mechanism extracts key information and fully captures and fuses fine-grained intra-modal and inter-modal relationships. In addition, a dual-collaborative mechanism is designed to integrate the multimodal information of social media posts, enabling interaction and mutual enhancement between modalities. Extensive experiments on two publicly available benchmark datasets show that the proposed method achieves a clear improvement in accuracy over existing state-of-the-art baselines, demonstrating its superior performance for fake news detection.
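The abstract describes two core ideas: bidirectional ("dual-collaborative") cross-modal enhancement between text and visual features, followed by a gathering step that keeps only the most salient tokens before fusion. The paper's exact architecture is not given here, so the sketch below is only a minimal, hypothetical NumPy illustration of those two ideas: plain scaled dot-product cross-attention in both directions, and a top-k gather using the L2 norm as a stand-in relevance score. All function names, dimensions, and the scoring choice are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    # Scaled dot-product attention: tokens of one modality attend to the other.
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # (n_q, n_k)
    weights = softmax(scores, axis=-1)
    return weights @ values                  # (n_q, d)

def gather_top_k(tokens, scores, k):
    # "Gather": keep only the k tokens with the highest relevance scores.
    idx = np.argsort(scores)[::-1][:k]
    return tokens[idx]

rng = np.random.default_rng(0)
text = rng.standard_normal((8, 16))    # 8 text tokens, feature dim 16
image = rng.standard_normal((4, 16))   # 4 visual patches, feature dim 16

# Dual-collaborative direction 1: text tokens enhanced by visual context.
text_enh = cross_attention(text, image, image)
# Direction 2: visual patches enhanced by textual context.
image_enh = cross_attention(image, text, text)

# Gather the most salient enhanced text tokens (L2 norm as a proxy score),
# then pool and concatenate both modalities into one fused representation.
salient = gather_top_k(text_enh, np.linalg.norm(text_enh, axis=1), k=3)
fused = np.concatenate([salient.mean(axis=0), image_enh.mean(axis=0)])
print(fused.shape)  # (32,)
```

In a real model the attention would use learned projection matrices and the fused vector would feed a binary classifier; this sketch only shows the data flow the abstract outlines.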
Authors
XIANG Wang, WANG Jinguang, WANG Yifei, QIAN Shengsheng
(Henan Institute of Advanced Technology, Zhengzhou University, Zhengzhou 450000, China; School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China; State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China)
Source
Computer Science (《计算机科学》)
Indexed in CSCD; Peking University Core Journal (北大核心)
2024, Issue 12, pp. 242-249 (8 pages)
Funding
National Natural Science Foundation of China (62276257).