Deep Web Source Discovery Based on Search Engine

Cited by: 2
Abstract: As Web databases become widely used, the Web is steadily "deepening." Traditional search engines can retrieve only the surface Web and cannot directly index Deep Web resources. Making effective use of these resources requires large-scale integration of Deep Web data. Data source discovery is the first step of that integration, and the ability to find Deep Web sites efficiently is key to acquiring Deep Web data. This paper proposes a Deep Web data source discovery method based on a traditional search engine. The method expands queries by analyzing the returned results, which further improves the efficiency of data source discovery. Experiments show that the method achieves good results.
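Source: Computer Technology and Development (《计算机技术与发展》), 2008, No. 8, pp. 58-60, 64 (4 pages)
Funding: National Natural Science Foundation of China (60673092); Key Scientific Research Project of the Ministry of Education, 2005 (205059); Ministry of Education Research Fund for the Doctoral Program of Higher Education (20040285016); Jiangsu Province High-Tech Research Program (BG2005019)
Keywords: search engine; Deep Web; HTML form; query expansion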
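
The abstract describes the approach only at a high level: query a conventional search engine, keep the result pages that expose a query form, and expand the query from the returned results. The following is a minimal sketch of that loop under stated assumptions, not the paper's implementation. The names `FormDetector`, `discover_sources`, `toy_search`, and `TOY_PAGES` are hypothetical, the "form with a text input" test stands in for whatever query-interface classifier the authors used, and the term-frequency expansion rule is an assumed placeholder for their result analysis. The search engine is replaced by an in-memory stub so the example runs offline.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class FormDetector(HTMLParser):
    """Crude stand-in for a query-interface classifier (an assumption):
    a page qualifies if any <form> contains a text input."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.in_form = True
        elif tag == "input" and self.in_form:
            # An <input> with no type attribute defaults to a text field.
            if dict(attrs).get("type", "text") == "text":
                self.found = True

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def has_query_form(html):
    detector = FormDetector()
    detector.feed(html)
    return detector.found


def visible_terms(html):
    """Lowercased words of three or more letters, with tags stripped."""
    text = re.sub(r"<[^>]+>", " ", html).lower()
    return re.findall(r"[a-z]{3,}", text)


def discover_sources(search, seed_query, rounds=2, expand_by=2):
    """search(query) -> list of (url, html); a hypothetical callable that
    wraps whatever search engine is available."""
    found, tried, pending = [], set(), [seed_query]
    for _ in range(rounds):
        term_counts = Counter()
        for query in pending:
            if query in tried:
                continue
            tried.add(query)
            for url, html in search(query):
                if has_query_form(html):
                    if url not in found:
                        found.append(url)
                    # Only pages that look like data sources feed the expansion.
                    term_counts.update(visible_terms(html))
        # Assumed expansion rule: the most frequent unseen terms become the
        # next queries (each expanded query is the new term on its own).
        seen_words = {word for query in tried for word in query.split()}
        pending = [t for t, _ in term_counts.most_common()
                   if t not in seen_words][:expand_by]
        if not pending:
            break
    return found


# Toy corpus standing in for the Web (all URLs and pages are made up).
TOY_PAGES = {
    "http://example.org/books":
        "<html><form action='/q'><input type='text' name='title'>"
        " search books catalog</form></html>",
    "http://example.org/blog":
        "<html><p>books I liked this year</p></html>",
    "http://example.org/library":
        "<html><form><input name='isbn'> library catalog lookup</form></html>",
}


def toy_search(query):
    """Returns pages whose visible text contains every query word."""
    words = query.lower().split()
    return [(url, html) for url, html in TOY_PAGES.items()
            if all(w in visible_terms(html) for w in words)]


print(discover_sources(toy_search, "books"))
```

On this toy corpus the seed query `books` reaches only the first form page, and the expansion term `catalog` taken from that page then surfaces the second one. A real deployment would need a proper query-interface classifier and stop-word handling, neither of which is specified in the abstract.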
References (10)

  • 1. Ghanem T M, Aref W G. Databases Deepen the Web [J]. IEEE Computer, 2004, 37(1): 116-117.
  • 2. Bergman M K. Deep Web White Paper [EB/OL]. 2004. http://brightplanet.com/technology/deepweb.asp.
  • 3. Chang K C C, He B, Li C, et al. Structured Databases on the Web: Observations and Implications [J]. SIGMOD Record, 2004, 33(3): 61-70.
  • 4. Chang K C C, He B, Zhang Z. Toward Large-Scale Integration: Building a MetaQuerier over Databases on the Web [C]// Proceedings of the Second Conference on Innovative Data Systems Research (CIDR 2005). Asilomar, California: [s.n.], 2005: 44-55.
  • 5. Barbosa L, Freire J. Searching for Hidden-Web Databases [C]// The Eighth International Workshop on the Web and Databases (WebDB 2005). Baltimore, MD: [s.n.], 2005: 1-6.
  • 6. Barbosa L, Freire J. An Adaptive Crawler for Locating Hidden-Web Entry Points [C]// Proceedings of the 16th International World Wide Web Conference (WWW 2007). Banff: [s.n.], 2007: 441-450.
  • 7. Lage J P, da Silva A S, Golgher P B, et al. Automatic generation of agents for collecting hidden Web pages for data extraction [J]. Data & Knowledge Engineering, 2004, 49: 177-196.
  • 8. Liu Wei, Meng Xiaofeng, Meng Weiyi. Research on Deep Web Data Integration [R]. [s.l.]: WAMDM Laboratory, 2006: 18-34.
  • 9. Gao Ling, Zhao Pengpeng, Cui Zhiming. Automatic Identification of Deep Web Query Interfaces [J]. Computer Technology and Development, 2007, 17(5): 148-151. Cited by: 13
  • 10. Baeza-Yates R, Hurtado C, Mendoza M. Query recommendation using query logs in search engines [C]// Current Trends in Database Technology. Berlin, Germany: Springer-Verlag, 2004: 588-596.

Second-level references (5)

  • 1. Ghanem T M, Aref W G. Databases Deepen the Web [J]. IEEE Computer, 2004, 37(1): 116-117.
  • 2. Bergman M K. The Deep Web: Surfacing Hidden Value [J/OL]. The Journal of Electronic Publishing, 2001, 7(1) [2001]. http://www.press.umich.edu/jep/07-01/bergman.html.
  • 3. Sherman C, Price G. The Invisible Web: Uncovering Information Sources Search Engines Can't See [M]. New York: Cyber Age Books, 2001.
  • 4. Bergman M K. Deep Web White Paper [EB/OL]. 2004. http://brightplanet.com/technology/deepweb.asp.
  • 5. Lage J P, da Silva A S, Golgher P B, et al. Automatic generation of agents for collecting hidden Web pages for data extraction [J]. Data & Knowledge Engineering, 2004, 49: 177-196.

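Co-cited documents (28)

  • 1. Zheng Dongdong, Zhao Pengpeng, Cui Zhiming. Research and Design of a Deep Web Crawler [J]. Journal of Tsinghua University (Science and Technology), 2005, 45(S1): 1896-1902. Cited by: 28
  • 2. Huang Xiaodong. A Survey of Invisible Web Research [J]. Information Science (情报科学), 2004, 22(9): 1144-1148. Cited by: 19
  • 3. Zhu Jingbo, Chen Wenliang. Text Classification Based on Domain Knowledge [J]. Journal of Northeastern University (Natural Science), 2005, 26(8): 733-735. Cited by: 12
  • 4. Yang Daoling. A Preliminary Study of Deep Web Information Resource Collection [J]. Library Journal (图书馆杂志), 2006, 25(12): 19-22. Cited by: 12
  • 5. Gao Ling, Zhao Pengpeng, Cui Zhiming. Automatic Identification of Deep Web Query Interfaces [J]. Computer Technology and Development, 2007, 17(5): 148-151. Cited by: 13
  • 6. Ghanem T M, Aref W G. Databases deepen the web [J]. IEEE Computer, 2004, 37(1): 116-117.
  • 7. Chang K C C, He B, Li C, et al. Structured databases on the web: observations and implications [J]. SIGMOD Record, 2004, 33(3): 61-70.
  • 8. Jared C, Nick C, David H. Automated discovery of search interfaces on the web [Z]. Proceedings of the 14th Australasian Database Conference, Adelaide, Australia, 2003.
  • 9. Lin Peiguang, Xu Ruzhi, Hong Zhimin, et al. Finding the WDB's query interface in deep web automatically [J]. IEEE Computer Society, 2008: 195-200.
  • 10. Yao Zengli, Yuan Fang, Chang Yong. Deep Web Interface Discovery Based on Search Engines and Domain Knowledge [J]. Computer Science (计算机科学), 2008, 35(9): 100-102.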
