
Graph-based Features for Identifying Spammers in Microblog Networks (基于关系图特征的微博水军发现方法)

Cited by: 25
Abstract: As spammer strategies evolve, traditional detection methods based on user content and user behavior are becoming less effective at identifying new types of social network spammers. Spammers can change the content of their posts and their forwarding behavior to evade detection, but they cannot change their connections with normal users in the network, so the resulting relationship graph is relatively stable. Compared with content-based and behavior-based features, user relationship features are therefore more robust and more accurate for spammer detection. This paper proposes a method for identifying microblog spammer accounts based on features of the user relationship graph. In the experiments, a crawler collects network data from Sina Weibo; attribute features, time features, and relationship graph features are then extracted for each user; finally, three machine learning algorithms are used to classify users. Simulation results show that adding the new features improves the accuracy and recall of spammer account identification by more than 5%, which verifies the effectiveness of relationship graph features in spammer detection.
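Source: Acta Automatica Sinica (《自动化学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2015, No. 9, pp. 1533-1541 (9 pages)
Funding: National High Technology Research and Development Program of China (863 Program) (2011AA010604); National Science and Technology Major Project (2013ZX03006002)
Keywords: microblog network; machine learning; spammers; graph-based features; classifier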
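
The abstract names relationship graph features but this record does not list which ones the paper uses. As a rough illustration only, the sketch below computes a few common candidates (in-degree, out-degree, follow reciprocity, and local clustering) from a list of follow edges. The function name `graph_features`, the chosen features, and the toy edge list are all assumptions made for this example, not the authors' method or data.

```python
from collections import defaultdict


def graph_features(follow_edges):
    """Compute simple per-user relationship-graph features.

    follow_edges: iterable of (follower, followee) pairs.
    The feature set is an assumption for illustration; the paper's
    actual features are not given in this record.
    """
    following = defaultdict(set)  # user -> accounts the user follows
    followers = defaultdict(set)  # user -> accounts that follow the user
    for src, dst in follow_edges:
        if src == dst:
            continue  # ignore self-follows
        following[src].add(dst)
        followers[dst].add(src)

    users = set(following) | set(followers)
    features = {}
    for u in users:
        out_deg = len(following[u])
        in_deg = len(followers[u])
        # Share of followed accounts that follow back.
        mutual = len(following[u] & followers[u])
        reciprocity = mutual / out_deg if out_deg else 0.0

        # Local clustering: fraction of neighbour pairs that are linked
        # in either direction (edges treated as undirected here).
        nbrs = sorted(following[u] | followers[u])
        k = len(nbrs)
        links = 0
        for i, a in enumerate(nbrs):
            for b in nbrs[i + 1:]:
                if b in following[a] or a in following[b]:
                    links += 1
        clustering = 2 * links / (k * (k - 1)) if k > 1 else 0.0

        features[u] = {
            "in_degree": in_deg,
            "out_degree": out_deg,
            "reciprocity": reciprocity,
            "clustering": clustering,
        }
    return features


# Hypothetical toy graph: "spam" follows many users, none follow back.
edges = [
    ("u1", "u2"), ("u2", "u1"), ("u2", "u3"), ("u3", "u1"),
    ("spam", "u1"), ("spam", "u4"), ("spam", "u5"),
]
feats = graph_features(edges)
```

In this toy graph the hypothetical `spam` account follows accounts that neither follow back nor know each other, which is the kind of structural signal that is hard to disguise by editing posts. Feature dictionaries like these could be joined with attribute and time features before being passed to a classifier.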