期刊文献+

Web页面列表信息的自主抽取

Autonomous Information Extraction from Web Pages Base on Structure of List
下载PDF
导出
摘要 本文研究了对于Web页面列表信息的抽取方法。通过对超文本文档特征的分析获取抽取知识,并通过自学习适应页面的变化,实现了对于列表信息的抽取。 The paper studied autonomous information extraction from Web pages based on structure of list. Gettingextraction knowledge according to the analysis of Web pages' feature, wrapper can adapt to the pages' changes with self-learning and make it automatic extraction effectively.
作者 侯锟 罗海龙
出处 《科技广场》 2007年第3期117-118,共2页 Science Mosaic
关键词 信息抽取 包装器 文档对象模型 Information Extraction Wrapper Document Object Model
  • 相关文献

参考文献3

二级参考文献9

  • 1[1]Joachim Hammer, Hector Garcia-Molina, Jumghoo Cho, et al.Extracting Semistructured Information from the Web [C].Proceedings of the First Workshop on Management of Semistructured Data, Tucson, Arizona, 1997.18-25.
  • 2[2]Arnaud Sahuguet, Fabien Azavant. Building Light-weight Wrap-pers for Legacy Web Data-sources Using W4F[C]. International Conference on Very Large Databases (VLDB), Edinburgh,Scotland, 1999.738-741.
  • 3[3]S Soderland. Learning Information Extraction Rules for Semi-structured and FreeText [ J ]. Machine Learning, 1999, 1-44.
  • 4[4]N Kushmerick, D Weld, B Doorenbos. Wrapper Induction for Information Extraction [ C ]. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI-97), Osaka, Japan, 1997.729-737.
  • 5[5]Ion Muslea, Steve Minton, Craig Knoblock. Stalker: Learning Extraction Rules for Semistructured, Web-based Information Sources [ C ]. AAAI-98 Workshop on "AI & Information Integration", Madison, 1998.74-81.
  • 6[6]Ion Muslea. Extraction Patterns: From Information Extraction to Wrapper Induction[ R]. Technical Report, Information Sciences Institute, University of Southern Californi, 1998.
  • 7Hammer J,Proceedings of the Workshop on Management of Semistructured Tucson,1997年,18~25页
  • 8王继成,潘金贵,张福炎.Web文本挖掘技术研究[J].计算机研究与发展,2000,37(5):513-520. 被引量:275
  • 9王继成,萧嵘,孙正兴,张福炎.Web信息检索研究进展[J].计算机研究与发展,2001,38(2):187-193. 被引量:118

共引文献46

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部