Abstract
With the emergence of Web 2.0 and the rapid growth of social networks, the study of online behavior has become increasingly important. A focused crawler has therefore collected posts from Tianya Forum daily since October 2010. Based on the data captured from the Tianya Zatan board in 2012, this paper studies patterns of online behavior. The analysis shows that users post less on holidays and weekends; that posting behavior follows users' daily routines and exhibits a significant calendar effect; that the number of clicks follows a mixture of Poisson and power-law distributions; and that the number of posts per user, the number of replies, and survival periods all follow power-law distributions. In other words, only a few hot posts attract many clicks or replies and survive for a long time, while most posts receive little attention. The paper proposes a formula for computing the hotness of a post and implements a program that pushes hot posts. It finds that the hot posts among updated posts remain stable, and it further classifies these hot posts by societal risk.
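As a rough illustration of what "follows a power-law distribution" means in practice, the sketch below estimates a power-law exponent by maximum likelihood. It is not the authors' procedure: the data are synthetic, the continuous-variable estimator is a standard textbook choice, and the names `powerlaw_alpha_mle`, `sample_powerlaw`, and the cutoff `x_min` are hypothetical, introduced only for this example.

```python
import math
import random


def powerlaw_alpha_mle(values, x_min):
    """Estimate alpha for p(x) ~ x^(-alpha), x >= x_min (continuous MLE).

    Uses alpha = 1 + n / sum(ln(x_i / x_min)) over the n values >= x_min.
    """
    tail = [x for x in values if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("all observations equal x_min; alpha is undefined")
    return 1.0 + len(tail) / log_sum


def sample_powerlaw(n, alpha, x_min, rng):
    """Draw n synthetic values from a continuous power law by inverse transform."""
    # rng.random() lies in [0, 1), so (1 - u) lies in (0, 1] and is never zero.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


# Synthetic stand-in for per-post reply counts (assumption: NOT the Tianya data).
rng = random.Random(0)
synthetic = sample_powerlaw(20000, alpha=2.5, x_min=1.0, rng=rng)
alpha_hat = powerlaw_alpha_mle(synthetic, x_min=1.0)
print(f"estimated alpha: {alpha_hat:.2f} (true value used for sampling: 2.5)")
```

A heavy tail of this kind is what makes a few hot posts account for a large share of attention. For real count data, a discrete estimator and a goodness-of-fit test would be more appropriate than this continuous approximation.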
Source
Journal of Systems Science and Mathematical Sciences (《系统科学与数学》)
CSCD
Peking University Core Journals (北大核心)
2015, No. 2, pp. 129-141 (13 pages)
Funding
Supported by the National Basic Research Program of China (2010CB731405) and the National Natural Science Foundation of China (71171187, 71371107, 61473284).
Keywords
Online behavior, Tianya Forum, power-law distribution, hot posts