
A spammer detection method based on sentiment analysis and quality control on comments
Abstract: Traditional methods for detecting abnormal microblog comments, which rely on user content and behavior, have become steadily less effective as data volumes grow and spammers update their strategies. To address this, a spammer detection method for microblogs based on sentiment analysis and quality control is proposed. Pre-processed microblog comments are quantified through sentiment analysis. During quality control of the comments, suspicious time intervals are detected from the difference between how abnormal and normal users' comments on a hot microblog are distributed over time, and, combined with user cluster analysis, a model for identifying abnormal comments is designed. The results show that the method uses sentiment scores to classify the sentiment of comment text fairly accurately, and then limits the anomaly detection level by adjusting the boundary value range and the time threshold range. When the boundary value range is increased, the detection range for abnormal comments widens, tolerance decreases, and detection sensitivity is high; when the time threshold is expanded, tolerance increases and detection sensitivity is lower. An appropriate choice of boundary value and time threshold can effectively improve the accuracy of identifying abnormal comments that resemble normal commenting behavior.
Authors: ZHANG Rui; JIN Zhigang; HU Bohong; ZHANG Ziyang (School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China; Department of Software and Communication, Tianjin Sino-German University of Applied Sciences, Tianjin 300350, China)
Source: Journal of Harbin Institute of Technology (indexed in EI, CAS, CSCD, and the Peking University Core Journals list), 2018, No. 9, pp. 164-170 (7 pages)
Funding: National Natural Science Foundation of China (71502125)
Keywords: sentiment analysis; quality control; microblog commentary; spammer detection; time threshold; detection method
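
The abstract does not give the model's formulas, so the following is only a minimal sketch of the general idea of flagging suspicious time intervals from sentiment scores. It is not the authors' implementation. It assumes a Shewhart-style control chart: comments are grouped into fixed-width time windows, and a window is flagged when its mean sentiment score falls outside the center line plus or minus `k` standard deviations. The function name `suspicious_windows`, the parameter `window` (a stand-in for the time threshold), and the parameter `k` (a stand-in for the boundary value) are all hypothetical, and the paper may define both quantities differently.

```python
from statistics import mean, pstdev


def suspicious_windows(comments, window, k):
    """Flag time windows whose mean sentiment score is out of control.

    comments: list of (timestamp_seconds, sentiment_score) pairs.
    window:   window width in seconds (hypothetical stand-in for the time threshold).
    k:        control-limit width in standard deviations
              (hypothetical stand-in for the boundary value).
    Returns the sorted start times of the flagged windows.
    """
    # Group sentiment scores by the start time of the window they fall into.
    buckets = {}
    for ts, score in comments:
        start = (ts // window) * window
        buckets.setdefault(start, []).append(score)

    # Mean sentiment score per window.
    means = {start: mean(scores) for start, scores in buckets.items()}
    values = list(means.values())

    # Control limits computed from the window means themselves.
    center = mean(values)
    sigma = pstdev(values)
    lower, upper = center - k * sigma, center + k * sigma

    return sorted(s for s, m in means.items() if m < lower or m > upper)
```

In this sketch, widening `window` averages more comments into each interval, so short bursts are smoothed out and fewer intervals are flagged, which is consistent with the abstract's statement that a larger time threshold raises tolerance and lowers sensitivity. The user clustering step mentioned in the abstract is not covered here.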
