Abstract
When users shop on e-commerce platforms, other users' reviews of a product strongly guide their purchase decisions. To influence purchase intent, some merchants post fraudulent reviews in bulk. Existing spam review detection research mainly mines behavioral data such as users' purchase behavior; no study has yet approached the problem from the content of spam reviews on Chinese-language e-commerce platforms. This paper crawls data from a well-known domestic e-commerce platform and identifies strongly suspected spam reviews according to behavior patterns. To address the class imbalance and the curse of dimensionality in the collected data set, it designs a two-stage spam review detection method. The empirical study shows that the model built with this method classifies better than baseline methods that consider only class imbalance or only the curse of dimensionality.
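The abstract does not say which techniques make up the two stages, so the following is only a minimal sketch of one plausible arrangement: random oversampling for the class imbalance, followed by chi-square feature selection for the high dimensionality. The stage order, both techniques, the function names (`oversample`, `chi2_score`, `select_top_k`), and the toy data are assumptions for illustration, not the authors' method.

```python
import random
from collections import Counter


def oversample(X, y, seed=0):
    """Stage 1 (assumed): randomly duplicate rows until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def chi2_score(X, y, j, positive=1):
    """Chi-square statistic of the 2x2 table (feature j present?, label positive?)."""
    a = b = c = d = 0
    for row, label in zip(X, y):
        present = row[j] > 0
        is_pos = label == positive
        if present and is_pos:
            a += 1
        elif present:
            b += 1
        elif is_pos:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_top_k(X, y, k):
    """Stage 2 (assumed): keep the k features with the highest chi-square score."""
    n_features = len(X[0])
    ranked = sorted(range(n_features), key=lambda j: chi2_score(X, y, j), reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]


# Hypothetical toy data: binary bag-of-words rows, label 1 = suspected spam.
X = [[1, 0, 1], [1, 1, 0], [0, 1, 1], [0, 0, 1], [0, 1, 0], [0, 0, 0]]
y = [1, 1, 0, 0, 0, 0]

X_bal, y_bal = oversample(X, y)               # balance the two classes
keep, X_sel = select_top_k(X_bal, y_bal, 1)   # reduce to the single best feature
class_counts = Counter(y_bal)
```

A classifier would then be trained on `X_sel` and `y_bal`. In a real evaluation, resampling and feature selection should be fitted on the training split only, so that duplicated rows and label information do not leak into the test split.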
Authors
曲豫宾
李芳
陈翔
QU Yubin; LI Fang; CHEN Xiang (Jiangsu College of Engineering and Technology, Nantong 226007, China; Nantong University, Nantong 226019, China)
Source
《江苏工程职业技术学院学报》
2017, No. 4, pp. 16-20 (5 pages)
Journal of Jiangsu College of Engineering and Technology
Funding
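Nantong Key Laboratory of Distributed Generation and Microgrid Technology project (No. CP12015007)
Jiangsu College of Engineering and Technology research project (No. GYKY/2016/15)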
Keywords
spam review detection
class imbalanced learning
feature selection
empirical research