A Python-Based Method for Collecting User Behavior Data on Social Media Sites
Abstract: Traditional data collection methods apply to a narrow range of scenarios and involve a large amount of repetitive work, which makes collecting user behavior data from social media sites inefficient. To address this, a Python-based method for collecting user behavior data from social media sites is proposed. A situational tagging method determines when the data should be collected; a Scrapy crawler framework is built in Python, with the MJU sampling algorithm serving as its URL management center; and the framework is then run to complete the collection process. Experimental results show that the proposed method collects user behavior data from social media sites at a rate of 830 items per hour, which verifies that the method is fast.
Authors: LAN Kun (兰坤); WU Qiong (吴琼); GENG Yanbing (耿艳兵) (Department of Computer Teaching, Changzhi Medical College, Changzhi 046000, Shanxi, China; School of Data Science and Technology, North University of China, Taiyuan 030051, China)
Source: Intelligent Computer and Applications (《智能计算机与应用》), 2024, No. 6, pp. 219-223 (5 pages)
Funding: Shanxi Provincial Natural Science Foundation, General Program (202103021224192).
Keywords: Python; social networks; user behavior data; data collection methods