
Design on 51-job Data Scraping Program Based on Python (基于Python的51-job数据抓取程序设计)

Cited by: 6
Abstract: To obtain job postings quickly, three Python-based crawler programs were designed around the page structure of 前程无忧 (51job) to scrape job-related data. Keywords are extracted to match the postings that meet the search criteria, and the matched content is saved to an Excel file so that relevant positions and their specific requirements are easy to look up. The experimental results show that the program scrapes large volumes of relevant postings quickly, is well targeted, and is simple and easy to read, which supports further mining and analysis of job information.
Authors: CUI Yujiao (崔玉娇); SUN Jiebing (孙结冰); QI Xiaobo (祁晓波); LING Qiang (凌强); ZHU Yong (朱勇) (School of Electronic Engineering, Heilongjiang University, Harbin 150080, China)
Source: Radio Communications Technology (《无线电通信技术》), 2018, No. 4, pp. 416-419 (4 pages)
Keywords: Python; crawler; job position; 前程无忧 (51job)
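
The keyword-matching and export steps summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' program: the sample records, the field names (`title`, `company`, `requirements`), and the helper functions `match_jobs` and `write_csv` are all hypothetical, no page is fetched from 51job, and CSV (which Excel can open) stands in for the Excel output the abstract mentions.

```python
import csv
import io

# Hypothetical sample records; a real crawler would fill these from fetched pages.
SAMPLE_JOBS = [
    {"title": "Python Developer", "company": "Example Co. A",
     "requirements": "Python, web scraping"},
    {"title": "Embedded Engineer", "company": "Example Co. B",
     "requirements": "C, RTOS"},
    {"title": "Data Analyst", "company": "Example Co. C",
     "requirements": "SQL, python, Excel"},
]

# Assumed column layout for the exported file.
FIELDS = ["title", "company", "requirements"]


def match_jobs(jobs, keyword):
    """Return records whose title or requirements contain the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        job for job in jobs
        if needle in job["title"].lower() or needle in job["requirements"].lower()
    ]


def write_csv(jobs, stream):
    """Write the matched records as CSV, a stand-in for the Excel file in the abstract."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(jobs)


matched = match_jobs(SAMPLE_JOBS, "python")
buffer = io.StringIO()
write_csv(matched, buffer)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the sketch free of side effects; passing a file opened with `newline=""` to `write_csv` would produce a file on disk instead.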
