
Support for Cross-Domain Methods of Identifying Fake Comments of Chinese (original Chinese title: 支持跨领域的中文虚假评论识别方法)
Abstract [Objective] Building on a multi-domain dataset, this paper constructs CFEE, a cross-domain Chinese fake review identification model that extracts deep word-relation semantic information from review text. It addresses a gap in traditional identification methods, which rarely account for the differences between data from different domains in Chinese review text or for the concealed nature of fake reviews within a domain. [Methods] We propose 11 rules for building fake review datasets and use them to create a multi-domain dataset. We then construct the CFEE model to identify Chinese fake reviews across domains. Its main functions are: extracting deep semantic information from text with the ERNIE pre-trained model; identifying the concealment of reviews from the sentiment attributes of the review text; projecting text information onto the word-relation dimension with a convolutional neural network; and fusing the features with a neural network to perform classification. [Results] The CFEE model achieves an F1 value of 91.52% on the multi-domain Chinese fake review dataset, and F1 values of 85.71%, 79.59%, 85.71%, and 85.00% on the single-domain datasets for mobile phones, food, clothing, and household appliances, respectively, in each case significantly better than existing models. [Limitations] The manual annotation is subjective. [Conclusions] The proposed method can effectively identify Chinese fake reviews across domains.
Authors Gu Yan (谷岩); Zheng Kaihong (郑楷洪); Hu Yongjun (胡勇军); Song Yishan (宋益善); Liu Dongping (刘东屏) (School of Management, Guangzhou University, Guangzhou 510006, China; School of Data Science, The Chinese University of Hong Kong, Shenzhen 518000, China; Partner & Business Enabling, Amazon Web Services GCR, Beijing 100015, China)
Source Data Analysis and Knowledge Discovery (《数据分析与知识发现》), indexed in EI, CSCD, and the Peking University Core Journals list, 2024, No. 2, pp. 84-98 (15 pages)
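Funding This work is one of the research outputs of the National Social Science Fund of China (Grant No. 18BGL236), the National Key R&D Program of China (Grant No. 2021YFB3301801), and a key-area university-enterprise cooperation project under the second phase of the Ministry of Education's Supply-Demand Matching Employment and Education Program (Grant No. 20230103480).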
Keywords Fake Comments; ERNIE Model; Cross-Domain Identification; Chinese Semantics; Emotional Score
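
The results in the abstract are reported as F1 values. As a reminder of what that metric measures, the following is a minimal sketch of the standard F1 computation from confusion counts. It assumes the fake-review class is treated as the positive class; the abstract does not say whether the paper reports per-class or averaged F1, and the function name `f1_score` and the counts below are hypothetical, chosen only for illustration.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    tp: fake reviews correctly flagged as fake (assumed positive class)
    fp: genuine reviews wrongly flagged as fake
    fn: fake reviews the classifier missed
    """
    if tp == 0:
        # No true positives: precision and recall are zero or undefined,
        # so F1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only; they are not taken from the paper.
example = f1_score(tp=90, fp=10, fn=5)
print(f"{example:.2%}")  # 92.31%
```

Because F1 combines precision and recall, a model cannot score well by flagging every review as fake or by flagging almost none, which makes it a common choice when fake reviews are a minority of the data.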