

Research and Application of Web Data Mining Technology in Chemical Deep Web
Abstract: Deep web resources in chemistry and chemical engineering are information-rich and highly specialized, yet difficult to retrieve. To address this problem, the paper studies deep web data mining techniques and the characteristics of the chemical deep web, and applies them together in a system for mining chemical deep web information resources. The system fills out and submits deep web entry forms automatically by extracting the form tags and combining them with seed terms from a dictionary of chemical properties to compose absolute URLs. It then combines the XPath document addressing language with XSLT-based data processing to achieve a preliminary extraction of the chemical data in the returned result pages.
Source: 《微计算机信息》 (Control & Automation), 2010, No. 9, pp. 151-153 (3 pages)
Keywords: chemical deep web; data mining; automatic form submission; information extraction
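The two steps named in the abstract (composing absolute query URLs from form tags plus dictionary seeds, then locating data in the result page with path expressions) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes a GET form whose action carries no query string, a result page that is well-formed XHTML, and a result table with the hypothetical id `props`. The names `FormTagParser`, `compose_query_urls`, `extract_properties`, and the host `chem.example.org` are all invented for illustration. The Python standard library has no XSLT processor, so the XSLT stage is replaced here by plain Python that builds a dictionary, and only the limited XPath subset of `xml.etree.ElementTree` is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlencode
import xml.etree.ElementTree as ET


class FormTagParser(HTMLParser):
    """Collect the action, text-input names, and hidden inputs of the first <form>."""

    def __init__(self):
        super().__init__()
        self.action = None    # raw value of the form's action attribute
        self.fields = []      # names of text inputs to be filled with a seed
        self.defaults = {}    # hidden inputs, passed through unchanged
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = a.get("action") or ""
        elif tag == "input" and self._in_form and a.get("name"):
            kind = (a.get("type") or "text").lower()
            if kind == "text":
                self.fields.append(a["name"])
            elif kind == "hidden":
                self.defaults[a["name"]] = a.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def compose_query_urls(page_url, form_html, seeds):
    """Return one absolute GET URL per seed term (assumption: GET form)."""
    parser = FormTagParser()
    parser.feed(form_html)
    if parser.action is None or not parser.fields:
        return []
    base = urljoin(page_url, parser.action)  # resolve a relative action
    urls = []
    for seed in seeds:
        params = dict(parser.defaults)
        params[parser.fields[0]] = seed      # fill the first text input
        urls.append(base + "?" + urlencode(params))
    return urls


def extract_properties(result_xhtml):
    """Map property name to value from two-cell rows of the 'props' table."""
    root = ET.fromstring(result_xhtml)
    props = {}
    for row in root.findall(".//table[@id='props']/tr"):
        cells = ["".join(td.itertext()).strip() for td in row.findall("td")]
        if len(cells) == 2:                  # header rows have no <td> cells
            props[cells[0]] = cells[1]
    return props


if __name__ == "__main__":
    form = ('<form action="/search" method="get">'
            '<input type="hidden" name="db" value="props"/>'
            '<input type="text" name="q"/></form>')
    for url in compose_query_urls("http://chem.example.org/db/index.html",
                                  form, ["benzene", "acetic acid"]):
        print(url)
```

Real deep web pages are rarely well-formed XML and often use POST forms or several input fields, so a working system would need an HTML-tolerant parser and a strategy for choosing which fields to fill; the sketch leaves both out.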








