
Zone Tree Model Based on DOM Tree and Recursive X-Y Cut Algorithm (cited by: 4)
Abstract: Building on an analysis of the DOM tree, this paper proposes a Zone tree model based on the DOM tree and the recursive X-Y cut algorithm. The model is generated from the geometric layout of a web page. The paper describes the advantages of applying the Zone tree model and the recursive X-Y cut algorithm to bibliographic data retrieval, and gives an algorithm for constructing the Zone tree model. The model is mainly used to extract bibliographic data from online articles. The authors report that it is fast and accurate, and that it outperforms the DOM tree structure used by most current browsers.
Authors: Huang Xin (黄歆), Sang Nan (桑楠)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the PKU Core list; 2009, No. 5, pp. 53-55 (3 pages)
Keywords: HTML document; DOM tree; recursive X-Y cut algorithm; Zone tree
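The abstract describes building a Zone tree from a page's geometric layout with recursive X-Y cuts. The record does not reproduce the authors' algorithm, so the following is only a minimal sketch of the general technique, under stated assumptions: each rendered DOM element is reduced to a bounding box `(left, top, right, bottom)` in pixels, and the names `Zone`, `split_by_gaps`, `xy_cut`, and the `min_gap` threshold are hypothetical, chosen for illustration. It also tries both cut directions at every level, which is a simplification of the classic alternating scheme.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed input format: (left, top, right, bottom) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Zone:
    """One node of the (hypothetical) zone tree: a region and its sub-zones."""
    box: Box
    children: List["Zone"] = field(default_factory=list)


def bounding_box(boxes: List[Box]) -> Box:
    """Smallest rectangle enclosing all boxes."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def split_by_gaps(boxes: List[Box], axis: int, min_gap: int) -> List[List[Box]]:
    """Group boxes separated by empty gaps of at least min_gap.

    axis 0 projects onto x (vertical cuts); axis 1 projects onto y
    (horizontal cuts).
    """
    lo, hi = axis, axis + 2
    ordered = sorted(boxes, key=lambda b: b[lo])
    groups = [[ordered[0]]]
    reach = ordered[0][hi]  # farthest extent covered so far on this axis
    for b in ordered[1:]:
        if b[lo] - reach >= min_gap:
            groups.append([b])      # wide empty strip: start a new group
        else:
            groups[-1].append(b)    # overlaps or is too close: same group
        reach = max(reach, b[hi])
    return groups


def xy_cut(boxes: List[Box], min_gap: int = 10) -> Zone:
    """Recursively cut the layout; a zone that cannot be cut becomes a leaf."""
    zone = Zone(bounding_box(boxes))
    for axis in (1, 0):  # try horizontal cuts first, then vertical cuts
        groups = split_by_gaps(boxes, axis, min_gap)
        if len(groups) > 1:
            zone.children = [xy_cut(g, min_gap) for g in groups]
            break
    return zone


# A header above two side-by-side columns.
page = [(0, 0, 100, 20), (0, 40, 45, 100), (55, 40, 100, 100)]
root = xy_cut(page)
```

For this layout the root is first cut horizontally into the header and a body zone, and the body is then cut vertically into its two columns. In a real pipeline the boxes would come from the rendered positions of DOM nodes, which is outside the scope of this sketch.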

References (5)

  • [1] Ha J, Haralick R, Phillips I. Recursive X-Y Cut Using Bounding Boxes of Connected Components[C]//Proc. of the 3rd International Conference on Document Analysis and Recognition. Montreal, Canada: [s. n.], 1995: 952-955.
  • [2] Chen Jinlin, Zhou Baoyao, Shi Jin, et al. Function-based Object Model Towards Website Adaptation[EB/OL]. (2001-05-01/2001-05-05). http://www10.org/cdrom/papers/296/.
  • [3] Cai Deng, Yu Shipeng, Wen Jirong, et al. VIPS: A Vision-based Page Segmentation Algorithm[R]. Beijing: Microsoft Research, Technical Report: MSR-TR-2003-79, 2003.
  • [4] Shafait F, Keysers D, Breuel T M. Performance Comparison of Six Algorithms for Page Segmentation[J]. Document Analysis Systems VII, 2006, 38(6): 368-379.
  • [5] Cai Deng, Yu Shipeng, Wen Jirong, et al. Extracting Content Structure for Web Pages Based on Visual Representation[C]//Proc. of the 5th Asia Pacific Web Conference. Xi'an, China: [s. n.], 2003.

Co-cited documents: 26

Citing documents: 4

Second-level citing documents: 16
