
Information Extraction Method over Web Tables Based on Tree (基于树结构的Web表格信息抽取方法)
Cited by: 1
Abstract: Existing information extraction methods, in China and abroad, all have limitations of varying degree. To address them, this paper proposes a Web table information extraction method based on DOM tree and binary tree structures. The method provides an extraction tool that takes Web tables as its extraction target and lets the user choose the extraction mode. The tool parses the HTML document into a DOM tree, builds from that DOM tree a binary tree that carries the text information, and finally extracts the Web table information by traversing the binary tree.
Source: 《华北水利水电学院学报》 (North China Institute of Water Conservancy and Hydroelectric Power), 2011, No. 3, pp. 108-110 (3 pages)
Funding: Science and Technology Research Project of the Henan Provincial Department of Education (2011B510008)
Keywords: table information; HTML document; DOM tree; binary tree
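
The three steps named in the abstract (parse HTML into a DOM tree, convert the DOM tree into a binary tree that carries text, traverse the binary tree) can be illustrated roughly as follows. This is a minimal sketch, not the paper's tool: the abstract does not say how the binary tree is built, so the left-child/right-sibling encoding used here is an assumption, and the names `Node`, `BNode`, `TableDomBuilder`, `to_binary`, `extract_rows`, and `extract_table` are hypothetical. It only handles simple tables whose cell text sits directly inside `td` or `th`.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they are kept as leaves.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    """Simplified DOM node: tag name, direct text, ordered children."""
    def __init__(self, tag):
        self.tag = tag
        self.text = ""
        self.children = []


class BNode:
    """Binary-tree node (assumed encoding): left = first child, right = next sibling."""
    def __init__(self, tag, text):
        self.tag = tag
        self.text = text
        self.left = None
        self.right = None


class TableDomBuilder(HTMLParser):
    """Step 1: build a simplified DOM tree. Assumes non-void tags are explicitly closed."""
    def __init__(self):
        super().__init__()
        self.root = Node("document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop only when the closing tag matches the innermost open element.
        if len(self.stack) > 1 and self.stack[-1].tag == tag:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].text += data.strip()


def to_binary(node):
    """Step 2: convert the n-ary DOM tree to left-child/right-sibling form."""
    bnode = BNode(node.tag, node.text)
    prev = None
    for child in node.children:
        bchild = to_binary(child)
        if prev is None:
            bnode.left = bchild
        else:
            prev.right = bchild
        prev = bchild
    return bnode


def extract_rows(bnode, rows=None):
    """Step 3: pre-order traversal that groups cell text by table row."""
    if rows is None:
        rows = []
    if bnode is None:
        return rows
    if bnode.tag == "tr":
        rows.append([])
    elif bnode.tag in ("td", "th") and rows:
        rows[-1].append(bnode.text)
    extract_rows(bnode.left, rows)   # descend into children
    extract_rows(bnode.right, rows)  # continue with siblings
    return rows


def extract_table(html):
    """Run all three steps on an HTML string and return rows of cell text."""
    builder = TableDomBuilder()
    builder.feed(html)
    builder.close()
    return extract_rows(to_binary(builder.root))
```

Calling `extract_table` on a small table with one header row and one data row returns a list of two lists, one per `tr`, in document order. In the left-child/right-sibling form a pre-order walk visits nodes in the same order as a pre-order walk of the original DOM tree, which is why row and cell order survive the conversion. Text nested inside further markup within a cell (for example `<td><b>x</b></td>`) is not collected by this sketch.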
