Abstract
Current methods for identifying fake reviews rely mainly on the textual features of review content and the behavioral features of reviewers. However, review text and reviewer behavior are easy to forge and imitate, and both kinds of methods can only identify fake reviews one at a time. This paper considers the network structure features of fake reviews. By analyzing reviewers' network behavior and the network structure among reviewer nodes, it defines neighbor diversity and self-similarity, estimates their probabilities with cumulative distribution functions, and combines them into a network behavior score. Suspicious products with high scores serve as seeds for building 2-hop subgraphs, from which highly similar candidate fake review groups are screened. These candidates are then clustered and merged with algorithms such as GroupStrainer and HDBSCAN to uncover hidden fake review groups. An empirical analysis on datasets covering Amazon's four best-selling product categories shows that the proposed method can effectively identify deeply hidden, large-scale fake review groups. Combined with a statistical analysis of group content, the results show that the attack patterns of fake review groups on target products differ across product categories. Fake review groups concentrate on target products more strongly than genuine reviewers do, but they also review non-target products to camouflage themselves and reduce their suspiciousness.
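As a rough illustration of two steps named in the abstract, the sketch below builds an empirical cumulative distribution function and extracts the 2-hop neighborhood of a seed product from a reviewer-product graph. The function names, the `(reviewer, product)` input format, and the toy data are assumptions made for this example. The abstract does not give the paper's definitions of neighbor diversity and self-similarity or the way the score is combined, so those are not reproduced here.

```python
from bisect import bisect_right
from collections import defaultdict


def empirical_cdf(values):
    """Return F, where F(x) is the fraction of observed values <= x."""
    ordered = sorted(values)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def two_hop_subgraph(reviews, seed_product):
    """Collect the 2-hop neighborhood of a seed product.

    reviews: iterable of (reviewer, product) pairs (hypothetical format).
    Hop 1: reviewers of the seed product.
    Hop 2: every product reviewed by those reviewers.
    """
    by_product = defaultdict(set)
    by_reviewer = defaultdict(set)
    for reviewer, product in reviews:
        by_product[product].add(reviewer)
        by_reviewer[reviewer].add(product)

    reviewers = set(by_product[seed_product])
    products = {seed_product}
    for reviewer in reviewers:
        products |= by_reviewer[reviewer]
    return reviewers, products


# Toy data, invented for illustration only.
reviews = [("u1", "p1"), ("u2", "p1"), ("u2", "p2"), ("u3", "p3")]
reviewers, products = two_hop_subgraph(reviews, "p1")
cdf = empirical_cdf([1, 2, 2, 5])
```

With the toy data, the subgraph seeded at `p1` contains reviewers `u1` and `u2` and products `p1` and `p2`, while `u3` and `p3` fall outside it. Candidate groups would then be screened inside such subgraphs before clustering.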
Authors
魏瑾瑞
王若彤
王晗
WEI Jinrui; WANG Ruotong; WANG Han (School of Statistics, Dongbei University of Finance & Economics, Dalian 116025, China; School of Statistics, Beijing Normal University, Beijing 100000, China)
Source
《运筹与管理》
CSSCI
CSCD
Peking University Core Journals
2023, No. 1, pp. 194-200 (7 pages)
Operations Research and Management Science
Funding
Supported by the Liaoning Provincial Social Science Foundation (L20BTJ003).
Keywords
review network structure
fake review group
network behavior score