Abstract
Consumers often consult product reviews before making a purchase, and deceptive reviews can easily mislead them into wrong decisions. Most existing methods for detecting deceptive spam reviews rely on machine learning and have difficulty learning the latent semantics of reviews. This paper therefore proposes a neural network model based on clustering and an attention mechanism to learn semantic representations of reviews. The model uses the density-peak-based fast search clustering algorithm to find semantic groups in the word vector space and computes weights by KL-divergence; it then combines the words in a sentence with the semantic groups to which they belong to obtain the sentence representation. Experimental results show that the model reaches an accuracy of 82.2%, exceeding existing baselines, which suggests it has practical value for identifying deceptive spam reviews.
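The abstract names two building blocks, density-peak clustering and KL-divergence, without giving details. The following is a minimal sketch of both, not the paper's implementation. The function names, the cutoff `d_c`, the number of centers `n_centers`, the smoothing constant `eps`, and the toy 2-D points standing in for word vectors are all assumptions made for illustration. How the paper turns KL-divergence into attention weights is not stated in the abstract, so that step is not shown.

```python
import numpy as np

def density_peaks(points, d_c, n_centers):
    """Cluster points by density peaks; returns (center_indices, labels)."""
    n = len(points)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Local density rho: number of other points closer than the cutoff d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    # Visit points from densest to sparsest (stable sort breaks ties by index).
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    nearest_denser = np.full(n, -1)
    # By convention the densest point gets its largest distance as delta.
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]          # distance to nearest denser point
        nearest_denser[i] = j
    # Centers are the points with the largest rho * delta.
    centers = np.argsort(-(rho * delta), kind="stable")[:n_centers]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    # Each remaining point inherits the label of its nearest denser point.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_denser[i]]
    return centers, labels

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions, smoothed by eps (assumed)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

# Toy stand-in for word vectors: two well-separated groups in 2-D.
vectors = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
centers, labels = density_peaks(vectors, d_c=0.5, n_centers=2)
```

In this sketch each label plays the role of a "semantic group"; with real word embeddings the cutoff and the number of centers would need to be chosen from the data.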
Author
ZHANG Jian-xin (张建鑫), College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266000, China
Source
Software Guide (《软件导刊》)
2019, No. 2, pp. 34-37 (4 pages)
Keywords
deceptive review detection
clustering
sentence weighting
neural network