
Automatic Discovery of Query Interfaces for Domain Web Databases

Finding the Query Interface of Domain WDB Automatically
Abstract: Automatically discovering the query interfaces of domain Web databases (WDBs) is the foundation for obtaining Deep Web information. This paper proposes two methods: a form discovery method based on a search engine (SE), and a method based on semantic similarity for judging whether a form is a query interface. We first define a representation for the features of a query form. The features of sample query interfaces are then extracted automatically, and their textual features are combined and submitted to the SE so that related pages, most of which contain HTML forms, can be retrieved. Finally, the semantic similarity and literal similarity between each retrieved form and the sample form are computed, and the result is used to decide whether the retrieved form is a WDB query interface. Experiments show that the method is feasible and practical, and that it provides a good basis for further research on the Deep Web.
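The three steps described in the abstract (feature extraction, query construction, similarity-based judgment) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the feature representation, the semantic similarity measure, or the decision threshold, so `SYNONYM_GROUPS`, `THRESHOLD`, and all function names below are hypothetical. Literal similarity is approximated with `difflib.SequenceMatcher`, and semantic similarity with a tiny hand-written synonym table.

```python
import re
from difflib import SequenceMatcher
from html.parser import HTMLParser

# Hypothetical stand-in for a semantic resource; the paper's measure is not given.
SYNONYM_GROUPS = [{"title", "name"}, {"author", "writer"}]
# Hypothetical decision threshold; the abstract does not state one.
THRESHOLD = 0.6


class FormTermParser(HTMLParser):
    """Collect label words and control names that appear inside <form> elements."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.terms = set()

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.in_form = True
        elif self.in_form and tag in ("input", "select", "textarea"):
            name = dict(attrs).get("name")
            if name:
                self.terms.add(name.lower())

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False

    def handle_data(self, data):
        if self.in_form:
            self.terms.update(re.findall(r"[a-z]+", data.lower()))


def extract_terms(html):
    """Step 1: extract the textual features of the forms in a page."""
    parser = FormTermParser()
    parser.feed(html)
    parser.close()
    return parser.terms


def build_query(terms):
    """Step 2: combine textual features into a keyword query for a search engine."""
    return " ".join(sorted(terms))


def term_similarity(a, b):
    """Semantic match (synonym table) if available, else literal similarity."""
    if a == b or any(a in group and b in group for group in SYNONYM_GROUPS):
        return 1.0
    return SequenceMatcher(None, a, b).ratio()


def form_similarity(sample_terms, candidate_terms):
    """Average, over sample terms, of the best match among candidate terms."""
    if not sample_terms or not candidate_terms:
        return 0.0
    best = [max(term_similarity(s, c) for c in candidate_terms) for s in sample_terms]
    return sum(best) / len(best)


def is_query_interface(sample_html, candidate_html, threshold=THRESHOLD):
    """Step 3: judge a retrieved form against the sample query form."""
    score = form_similarity(extract_terms(sample_html), extract_terms(candidate_html))
    return score >= threshold
```

In this sketch a candidate form is accepted when, on average, each term of the sample form has a close literal or semantic counterpart in the candidate. A real system would replace the synonym table with a proper lexical resource and send the string from `build_query` to an actual search engine.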
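Authors: Lin Peiguang (林培光), Lü Chao (吕超)
Source: Journal of Jiangxi Normal University (Natural Science Edition), 2008, No. 2, pp. 197-200 (4 pages). Indexed in CAS and the Peking University Core Journals list.
Funding: Doctoral Research Startup Fund of Shandong University of Finance (07BSJJ13); Research Fund of Shandong University of Finance
Keywords: Web database; query interface; Deep Web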

