Abstract
To obtain interpersonal relationship information from Weibo quickly, this paper proposes a crawler design method for Sina Weibo that exploits the characteristics of the site's URLs. The method performs a simulated login to Sina Weibo and then crawls information, such as the names of followed accounts, starting from a specified user. The program uses techniques such as critical-path parsing and breadth-first traversal to match the names of users who meet specified conditions and to crawl the related content. Finally, the program is further optimized and improved. Experimental results show that the program is well targeted, collects data at a reasonable speed, is easy to extend, and is stable. It gives researchers studying interpersonal relationships a method for finding the accounts that Weibo users follow, and it supports subsequent data mining research on Weibo.
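The breadth-first traversal named in the abstract can be illustrated with a minimal Python sketch. It is not the author's program: the names `crawl_followees`, `fetch_followees`, `matches`, and `TOY_GRAPH` are hypothetical, and the step that would log in and download a user's follow list is replaced by a caller-supplied function so that the example runs without network access.

```python
from collections import deque
from typing import Callable, Iterable, List

# Hypothetical toy follow graph standing in for pages fetched after login.
TOY_GRAPH = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["A"],
}


def crawl_followees(
    start_user: str,
    fetch_followees: Callable[[str], Iterable[str]],
    matches: Callable[[str], bool],
    max_depth: int = 2,
) -> List[str]:
    """Breadth-first traversal of the follow graph from start_user.

    fetch_followees(user) is assumed to return the names that user follows;
    matches(name) decides whether a name meets the specified condition.
    Returns matching names in the order they are discovered.
    """
    seen = {start_user}                # users already discovered
    queue = deque([(start_user, 0)])   # (user, distance from start_user)
    matched: List[str] = []

    while queue:
        user, depth = queue.popleft()
        if depth >= max_depth:
            continue                   # do not expand beyond the depth limit
        for followee in fetch_followees(user):
            if followee in seen:
                continue               # skip users reached earlier
            seen.add(followee)
            if matches(followee):
                matched.append(followee)
            queue.append((followee, depth + 1))
    return matched


result = crawl_followees(
    "A",
    lambda user: TOY_GRAPH.get(user, []),
    lambda name: name != "C",
)
```

The `seen` set keeps the crawl from revisiting users when follow relations form cycles, and `max_depth` bounds how far the traversal moves away from the starting user.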
Author
Hu Haichao (胡海潮), Kunming University of Science and Technology, Kunming 650000, China
Source
Wireless Internet Technology (《无线互联科技》)
2018, No. 9, pp. 40-42 (3 pages)
Keywords
interpersonal relationship
Sina Weibo
simulated login
critical path
breadth-first traversal