
IMTS: Detecting Fake Reviews with Image and Text Semantics

Cited by: 5
Abstract [Objective] To address the proliferation of fake reviews posted by the "Internet water army" (paid posters) on e-commerce websites, this paper proposes IMTS, a fake review detection method for Chinese e-commerce reviews that fuses image information with text semantics. [Methods] IMTS uses a text convolutional neural network (TextCNN) and the BERT pre-trained model to extract features from the review text separately, yielding the corresponding feature vectors. It then incorporates reviewer features by concatenating the review text semantics with the output features of the reviewer ID, which strengthens the model's capture of the overall semantic information. Images that users post in their reviews are passed through a residual network (ResNet) to obtain the corresponding visual features. Finally, the text features and visual features are fused multimodally to detect fake reviews. [Results] On a self-built multimodal Chinese fake review dataset, IMTS achieves an accuracy of 0.9636, a recall of 0.9635, and an F1 score of 0.9635. [Limitations] Because of limited computing power, the dataset is small, and because the text processing stage uses the BERT pre-trained model, the time cost is high for large-scale data. [Conclusions] Supplementing review text features through multimodal feature fusion is effective for detecting fake reviews, and the method improves overall detection accuracy.
Authors Shi Yunmei (施运梅); Yuan Bo (袁博); Zhang Le (张乐); Lv Xueqiang (吕学强) (Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China; School of Computer Science, Beijing Information Science and Technology University, Beijing 100101, China)
Source Data Analysis and Knowledge Discovery (《数据分析与知识发现》), indexed in CSSCI, CSCD, and the Peking University Core Journals list; 2022, No. 8, pp. 84-96 (13 pages)
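Funding This work is one of the research outcomes of the National Key R&D Program of China (Grant No. 2018YFB1004100) and the National Natural Science Foundation of China (Grant No. 62171043).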
Keywords False comment; Multimodal; Text; Image; BERT
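
The fusion pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature values, vector sizes, weights, and the names `concat` and `fake_probability` are all hypothetical, and the real feature vectors would come from TextCNN, BERT, a reviewer ID representation, and ResNet. The abstract states that text semantics and reviewer ID features are concatenated; using concatenation for the final text-image fusion step as well, followed by a single linear layer, is an assumption made here for illustration.

```python
import math
from typing import List, Sequence


def concat(*parts: Sequence[float]) -> List[float]:
    """Join feature vectors end to end (feature-level fusion by concatenation)."""
    fused: List[float] = []
    for part in parts:
        fused.extend(part)
    return fused


def fake_probability(features: Sequence[float],
                     weights: Sequence[float],
                     bias: float) -> float:
    """Score a fused feature vector with one linear layer plus a sigmoid."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Stand-in feature vectors with made-up values (assumption: real ones are
# produced by the trained encoders and are far longer).
textcnn_feat = [0.2, -0.1]       # local n-gram features of the review text
bert_feat = [0.5, 0.3, -0.4]     # contextual semantics of the review text
reviewer_feat = [1.0]            # output feature of the reviewer ID
image_feat = [0.0, 0.7]          # visual features of the review images

# Step 1: concatenate text semantics with the reviewer ID feature.
text_side = concat(textcnn_feat, bert_feat, reviewer_feat)

# Step 2: fuse the text-side and visual features (assumed: concatenation).
fused = concat(text_side, image_feat)

# Step 3: classify. The weights are hypothetical, not learned values.
weights = [0.1] * len(fused)
p_fake = fake_probability(fused, weights, bias=0.0)
label = "fake" if p_fake >= 0.5 else "genuine"
print(len(fused), label)
```

Concatenation keeps each modality's features intact and leaves it to the classifier to weigh them, which is why it is a common baseline for feature-level fusion; other fusion schemes would replace only step 2.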