Abstract
As a typical social networking service (SNS), the microblog has a large user base and spreads information quickly. Current microblog reading interfaces are limited, so users find it hard to see the full picture of how a post propagates. To address this problem, the paper uses a distributed data collection program (web crawler) built on TaskQueue of the Sina cloud computing platform (Sina App Engine) to gather Weibo data, and visualizes the data as a radial tree, which improves the readability of microblogs. Experiments show that the system runs stably: posts are collected at up to 300 per second, and for data sets of up to 300,000 posts the drawing process completes within 5 seconds.
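The radial tree form mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `radial_layout`, the `(parent, child)` edge format, and the rule of dividing angles in proportion to leaf counts are all assumptions made for illustration. Each repost is placed on a ring whose radius equals its depth in the repost tree, and each subtree receives an angular wedge proportional to its number of leaves.

```python
import math
from collections import defaultdict


def radial_layout(edges, root):
    """Return {node: (x, y)} for a repost tree drawn as a radial tree.

    edges: iterable of (parent, child) pairs (hypothetical input format).
    root:  the original post; it is placed at the origin.
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    # Pre-order traversal without recursion, so deep repost chains are safe.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children[node])

    # Count leaves bottom-up: every node appears after its parent in `order`,
    # so the reversed list visits children before their parents.
    leaves = {}
    for node in reversed(order):
        kids = children[node]
        leaves[node] = sum(leaves[k] for k in kids) if kids else 1

    # Assign each node the middle of its angular wedge; radius = depth.
    positions = {}
    stack = [(root, 0, 0.0, 2 * math.pi)]
    while stack:
        node, depth, start, end = stack.pop()
        angle = (start + end) / 2
        positions[node] = (depth * math.cos(angle), depth * math.sin(angle))
        cursor = start
        for kid in children[node]:
            width = (end - start) * leaves[kid] / leaves[node]
            stack.append((kid, depth + 1, cursor, cursor + width))
            cursor += width
    return positions
```

Both passes are linear in the number of posts, so a layout of this kind scales to the data sizes the abstract reports; the actual drawing step and any performance figure for this sketch are outside what it demonstrates.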
Source
《软件》 (Software)
2012, No. 7, pp. 117-119, 122 (4 pages in total)
Keywords
microblog
distributed
data collection
web crawler
visualization