
Implementation of Web Crawler Capture Based on Scrapy Framework (cited by: 4)
Abstract With the growth of the Internet, online data now covers nearly every field, but as data volumes increase and formats diversify, it is becoming harder for users to extract valuable information from massive datasets. Data collection techniques have been studied in China and abroad, and web crawlers have been shown to retrieve online resources automatically. Taking second-hand housing information in Nanjing as an example, this paper designs a crawler based on the Scrapy framework that captures and stores second-hand housing information for parts of the central and western regions, then uses Excel to analyze Nanjing's second-hand housing stock by district and housing type. The results show that the program can automatically collect housing information from Anjuke, improving the speed and quality with which users obtain information and providing a data source for subsequent analysis.
Authors 聂莉娟 (NIE Lijuan); 方志伟 (FANG Zhiwei); 李瑞霞 (LI Ruixia) (Jinken College of Technology, Nanjing, Jiangsu 210000)
Source 《软件》 (Software), 2022, No. 11, pp. 18-20 (3 pages)
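Funding Jiangsu Society of Vocational Education, 2021-2022 vocational education research project "Practical Research on a Talent Training Model for Private Higher Vocational Colleges: Specialty-Enterprise Integration and Job Grading to Achieve Echelon-Style Education" (XHYBLX2021010).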
Keywords Scrapy; Python; web crawler; big data
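
The last step described in the abstract, summarizing the stored listings by district and housing type, can be sketched in a few lines. The paper reports doing this in Excel; the Python version below is only a stand-in for that step, not the authors' method. The column names (`district`, `housing_type`, `price_wan`), the helper functions, and every sample row are hypothetical placeholders, not data from the paper.

```python
import csv
import io
from collections import Counter

# Hypothetical rows standing in for stored crawler output.
# These values are invented for illustration and are NOT from the paper.
SAMPLE_CSV = """district,housing_type,price_wan
Gulou,2-bedroom,310
Gulou,3-bedroom,450
Jiangning,2-bedroom,220
Jiangning,2-bedroom,235
Qixia,3-bedroom,280
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def count_by(rows, *fields):
    """Count listings grouped by one or more column names."""
    return Counter(tuple(row[f] for f in fields) for row in rows)


rows = load_rows(SAMPLE_CSV)
by_district = count_by(rows, "district")
by_district_and_type = count_by(rows, "district", "housing_type")

# Print one line per district with its listing count.
for (district,), n in sorted(by_district.items()):
    print(district, n)
```

Grouping on a tuple of columns keeps the same helper usable for a single field (district) or a pair (district and housing type), which mirrors the two breakdowns named in the abstract.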