
Detecting Inactive Users from Behavior Data Based on Weibo (基于微博行为数据的不活跃用户探测)

Cited by: 2
Abstract: As the number of registered microblog users grows, detecting inactive accounts and automatically judging user activity has significant commercial value. To meet this need, the paper proposes an automatic detection algorithm and validates it experimentally. The core of the algorithm is a set of four determining factors that influence user activity, each of which can be computed from user behavior. The algorithm comprises two models: the User Active Degree Probability Hierarchical Model (ADPHM), which estimates the probability that a user is inactive, and the User Scoring Model (USM), which assigns the user an activity score. The experimental dataset contains information on 2,316,281 users and their 141,322,019 posts crawled from Sina Weibo. The results show that the algorithm detects inactive accounts automatically in linear time and can improve user credibility evaluation systems.
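Source: Journal of University of Electronic Science and Technology of China (《电子科技大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list. 2015, No. 3, pp. 410-414, 444 (6 pages).
Funding: National Natural Science Foundation of China (61272109); Fundamental Research Funds for the Central Universities (CZY15006).
Keywords: activity; automatic identification; inactive users; microblog; social network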
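The abstract describes a two-part pipeline (a probability of inactivity and an activity score, both derived from user behavior) that runs in linear time, but it does not name the four factors or give the model equations. The following is only a minimal sketch of that general shape, not the paper's ADPHM or USM: the four behavior fields, the weights, the logistic mapping, and every function name are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class UserBehavior:
    # Hypothetical stand-ins: the abstract does not name the four factors.
    posts: int            # posts in the observation window
    reposts: int          # reposts in the window
    comments: int         # comments in the window
    days_since_last: int  # days since the user's most recent action


def activity_score(u: UserBehavior) -> float:
    """Illustrative score (assumed formula): higher means more active."""
    volume = math.log1p(u.posts + u.reposts + u.comments)
    recency = 1.0 / (1.0 + u.days_since_last)
    return volume + recency


def inactive_probability(u: UserBehavior, midpoint: float = 1.0) -> float:
    """Illustrative probability (assumed): logistic map of the score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(activity_score(u) - midpoint))


def detect_inactive(users: Dict[str, UserBehavior], threshold: float = 0.5) -> List[str]:
    """One pass over the users, so the cost grows linearly with their number."""
    return [uid for uid, u in users.items() if inactive_probability(u) > threshold]


if __name__ == "__main__":
    sample = {
        "dormant": UserBehavior(posts=0, reposts=0, comments=0, days_since_last=400),
        "regular": UserBehavior(posts=50, reposts=20, comments=30, days_since_last=0),
    }
    print(detect_inactive(sample))
```

Each user is scored independently from precomputed counts, which is what keeps the pass linear in the number of users. A real system would replace the assumed formulas with the factors and models defined in the paper itself.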

