Research on Product Review Spammer Detection Based on Users' Behavior

Cited by: 16
Abstract: To identify the authors of spam reviews, this paper proposes a method for detecting product review spammers based on user behavior. Starting from the spammers' motives, it takes five behavior patterns associated with posting spam reviews as detection indicators. From 1,470 reviewers collected from the JOYO Amazon website, it selects, by single-indicator selection and by integrating all five indicators, the 25 reviewers most likely and the 25 least likely to be spammers. These 50 reviewers are labeled manually, and the labels are used to train a supervised linear regression model over the five indicators. Experimental results show that the model identifies 88 of the 1,470 reviewers as spammers and detects spammers more effectively than a baseline method based on users' helpfulness votes.
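Source: Computer Engineering (《计算机工程》), indexed in CAS and CSCD, 2012, No. 11, pp. 254-257, 261 (5 pages)
Funding: National Natural Science Foundation of China (70971059); Liaoning Province Innovation Team Support Program for Higher Education Institutions (2009T045); Liaoning Province Science and Technology Research Program (2007308003)
Keywords: user behavior; linear regression model; review spammer detection; short text; product review; review spam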
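
The pipeline outlined in the abstract (score each reviewer on five behavior indicators, pick the most and least suspicious candidates, label them, and fit a linear regression) might look roughly like the sketch below. This is an illustration only, not the authors' implementation: the abstract does not name the five indicators, so the scores here are random placeholders, the "integrated" score is assumed to be a plain mean, the labels are simulated rather than manual, and the 0.5 decision threshold is an assumption.

```python
import random


def select_candidates(scores, k):
    """Rank reviewers by mean indicator score (an assumed way of integrating
    the indicators) and return the k most and k least suspicious ids."""
    ranked = sorted(scores,
                    key=lambda rid: sum(scores[rid]) / len(scores[rid]),
                    reverse=True)
    return ranked[:k], ranked[-k:]


def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations.
    Returns [bias, w1, ..., wn]."""
    rows = [[1.0] + list(x) for x in X]
    n = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


def predict(weights, x):
    """Spam score of one reviewer from its indicator vector."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], x))


# Placeholder data: 200 hypothetical reviewers, five indicator scores in [0, 1].
random.seed(0)
scores = {f"r{i}": [random.random() for _ in range(5)] for i in range(200)}

top, bottom = select_candidates(scores, 25)
# Simulated labels standing in for the manual labeling step (1 = spammer).
X = [scores[r] for r in top + bottom]
y = [1.0] * len(top) + [0.0] * len(bottom)

weights = fit_linear_regression(X, y)
flagged = [r for r in scores if predict(weights, scores[r]) >= 0.5]
print(len(flagged), "reviewers flagged")
```

In practice the labels would come from human annotators, and a library routine for least squares would replace the hand-written solver; the pure-Python version is used here only to keep the sketch dependency-free.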

