Overview of Research on Data Collection from Ajax Sites
Abstract This paper reviews recent research progress on Ajax data collection from five aspects: identification of Ajax link elements, identification of page states, controllable transitions between page states, dynamic extraction of page-state content, and detection of duplicate states. It summarizes the overall processing flow of such systems and their supporting technologies, and discusses emerging trends, with the aim of encouraging more in-depth research on Ajax data collection.
Author Xia Tian (夏天)
Source New Technology of Library and Information Service (《现代图书情报技术》), 2010, No. 3, pp. 52-57 (6 pages); indexed in CSSCI and the Peking University Core Journals list
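Funding One of the research outputs of the National Social Science Fund of China project "Collection and Analysis of Online Public Opinion in the Web 2.0 Environment" (No. 09CTQ027) and the Renmin University of China Scientific Research Fund project "Research on Data Collection from Web 2.0 Websites" (No. 22382078)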
Keywords Data collection; Ajax crawler; HTML renderer; Web 2.0