Abstract
As spammer strategies evolve, traditional detection methods based on user content and user behavior are becoming less effective at identifying new types of social network spammers. Spammers can change the content of their posts and their forwarding behavior to evade detection, but they cannot easily change their connections to normal users in the network, so the resulting relationship graph is relatively stable. Compared with content-based and behavior-based features, user relationship features are therefore more robust and more accurate for spammer detection. This paper proposes a method for detecting microblog spammer accounts based on features of the user relationship graph. In the experiments, a crawler collected network data from Sina Weibo; attribute features, time features, and relationship graph features were then extracted for each user; finally, three machine learning algorithms were used to classify the users. Simulation results show that adding the new features improves the accuracy and recall of spammer account identification by more than 5%, which verifies the effectiveness of relationship graph features in detecting microblog spammers.
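The abstract does not list which relationship graph features the authors extract. As a minimal sketch of the general idea only, the Python below derives three illustrative per-user features from a directed follow graph: follower count, following count, and the share of followed accounts that follow back. The function name `relationship_features`, the feature names, and the toy edge list are assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def relationship_features(follow_edges):
    """Compute simple per-user features from (follower, followee) pairs.

    Illustrative only: these are NOT the paper's features, which the
    abstract does not specify.
    """
    following = defaultdict(set)  # user -> accounts the user follows
    followers = defaultdict(set)  # user -> accounts that follow the user
    for src, dst in follow_edges:
        following[src].add(dst)
        followers[dst].add(src)

    features = {}
    for user in set(following) | set(followers):
        out_deg = len(following[user])
        in_deg = len(followers[user])
        mutual = len(following[user] & followers[user])
        features[user] = {
            "followers": in_deg,
            "following": out_deg,
            # Share of followed accounts that follow back (0.0 if none followed).
            "reciprocity": mutual / out_deg if out_deg else 0.0,
        }
    return features


# Hypothetical toy graph: "s" follows three users and nobody follows back.
edges = [("a", "b"), ("b", "a"), ("s", "a"), ("s", "b"), ("s", "c")]
feats = relationship_features(edges)
```

In this toy graph, account "s" follows many users but is followed by none, which is the kind of structural pattern that is hard for a spammer to disguise. Feature vectors like these could be combined with attribute and time features before being passed to a classifier.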
Source
《自动化学报》
EI
CSCD
Peking University Core Journals (北大核心)
2015, No. 9, pp. 1533-1541 (9 pages)
Acta Automatica Sinica
Funding
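National High Technology Research and Development Program of China (863 Program) (2011AA010604)
National Science and Technology Major Project of China (2013ZX03006002)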
Keywords
Microblog network
Machine learning
Spammers
Graph-based feature
Classifier