Research on the Application of Regular Expression in Data Grabbing (正则表达式在数据抓取中的应用研究)

Cited by: 3
Abstract: Based on an analysis of a website's structure, regular expressions are designed in Python to extract the links to the site's specific data pages, and the resources on those pages are then scraped. Regular expressions can effectively capture the required data and are a good solution for big data collection.
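Source: Journal of Jiamusi Vocational Institute (《佳木斯职业学院学报》), 2017, No. 4, p. 408-, 1 page
Funding: National Undergraduate Innovation and Entrepreneurship Training Program (201635108003); Research Project of the Fujian Provincial Department of Education (JAT160624)
Keywords: Python; regular expression; data grabbing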
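
The two-step workflow described in the abstract (first extract the links to the data pages, then extract the data from each page) can be sketched as follows. This is only a minimal illustration, not the authors' code: the HTML snippets, the URL layout, the field names, and the helper functions `extract_links` and `extract_record` are all hypothetical, and the pages are inlined as strings so the example runs without network access.

```python
import re
from urllib.parse import urljoin

# Hypothetical listing page; markup and paths are invented for illustration.
LISTING_HTML = """
<ul class="list">
  <li><a href="/data/2017/item_101.html">Item 101</a></li>
  <li><a href="/data/2017/item_102.html">Item 102</a></li>
  <li><a href="/about.html">About</a></li>
</ul>
"""

# Hypothetical data page reached through one of the links above.
DETAIL_HTML = """
<h1 class="title">Item 101</h1>
<span class="price">12.50</span>
"""

# Step 1: match only links that follow the (assumed) data-page URL pattern.
LINK_RE = re.compile(r'<a\s+href="(/data/\d{4}/item_\d+\.html)"')

# Step 2: capture the fields of interest from a data page.
TITLE_RE = re.compile(r'<h1 class="title">(.*?)</h1>', re.S)
PRICE_RE = re.compile(r'<span class="price">([\d.]+)</span>')


def extract_links(html, base_url):
    """Return absolute URLs of all data pages linked from a listing page."""
    return [urljoin(base_url, path) for path in LINK_RE.findall(html)]


def extract_record(html):
    """Return the scraped fields as a dict, or None if a field is missing."""
    title = TITLE_RE.search(html)
    price = PRICE_RE.search(html)
    if not (title and price):
        return None
    return {"title": title.group(1).strip(), "price": float(price.group(1))}


links = extract_links(LISTING_HTML, "http://example.com")
record = extract_record(DETAIL_HTML)
print(links)
print(record)
```

In a real crawl the two HTML strings would be fetched from the site, and the patterns would have to be adapted to the actual page structure. Regular expressions work well when the markup is regular and stable; for irregular or deeply nested HTML, a dedicated parser is usually more robust.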
