A Parallel Algorithm for Webpage Parsing (cited by: 1)
Abstract: Web applications now offer a user experience close to that of traditional desktop applications, and their pages have grown correspondingly complex, which poses a serious performance challenge for Web browsers. Traditional browsers typically process a page in a single thread and therefore cannot exploit the computing power of modern multi-processor devices. This paper presents a parallel algorithm for Webpage parsing. Unlike existing parallel algorithms for Webpage processing, it follows a data-parallel scheme: the input data is partitioned into several parts, the parts are processed in parallel, and the partial results are merged into the final result. The algorithm can therefore reuse existing, highly optimized serial Webpage-processing algorithms and remains compatible with existing Web standards and technologies. Experiments on the WebKit browser engine show that the algorithm makes effective use of multi-core processors and significantly speeds up Webpage parsing.
Author: Zhang Kaimin (张开敏)
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2014, No. 2, pp. 193-198 (6 pages). Indexed in CSCD and the Peking University Core Journals list.
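Funding: National "Core Electronic Devices, High-end Generic Chips and Basic Software" (核高基) Major Project (2009ZX01028-002-003-005); National Natural Science Foundation of China (60833004); Programme of Introducing Talents of Discipline to Universities (B07033).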
Keywords: multi-core processor; parallel algorithm; hypertext markup language; World Wide Web; parsing
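
The partition, parallel-process, merge scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not say how WebKit input is partitioned or how partial results are merged, so everything here is an assumption made for illustration. The names `TagCollector`, `split_at_tag_boundaries`, `parse_chunk` and `parallel_parse` are hypothetical, the "parse result" is just the ordered list of start tags, and the splitter assumes that `<` only ever appears as the start of a tag.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Existing serial parser, reused unchanged: records start tags in order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def split_at_tag_boundaries(html, parts):
    """Cut html into roughly `parts` chunks, each cut placed just before a '<'.

    Assumption for this sketch: '<' never occurs inside text, attribute
    values, comments or scripts, so no tag is split across two chunks.
    """
    chunks, start = [], 0
    step = max(1, len(html) // parts)
    while start < len(html):
        cut = html.find("<", start + step)
        if cut == -1:
            cut = len(html)
        chunks.append(html[start:cut])
        start = cut
    return chunks


def parse_chunk(chunk):
    """Run the serial parser on one chunk and return its partial result."""
    parser = TagCollector()
    parser.feed(chunk)
    parser.close()
    return parser.tags


def parallel_parse(html, workers=4):
    """Partition the input, parse the chunks concurrently, merge in order."""
    chunks = split_at_tag_boundaries(html, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(parse_chunk, chunks))  # map keeps chunk order
    # Merge step: concatenating partial results restores document order.
    return [tag for tags in partial for tag in tags]


if __name__ == "__main__":
    page = "<html><body><p>a</p><div><span>b</span></div></body></html>"
    print(parallel_parse(page, workers=3))
```

Merging is trivial here because a flat tag list needs no cross-chunk state; building a real DOM tree would require reconciling elements that are still open at chunk boundaries. Threads are used only to keep the sketch short, and in CPython they will not yield a real speedup for this CPU-bound work.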

