Classified Refresh Policy of Web Pages in Search Engine Based on Bayes Theory
Abstract: The Web is vast and keeps expanding, and its pages change frequently. A search engine must therefore periodically refresh the pages in its index, which consumes considerable time and system resources because the pages have to be crawled and downloaded again. Improving refresh efficiency is thus a key problem in search engine technology. This paper compares two existing approaches, the uniform refresh policy and the proportional refresh policy, and points out the strengths and weaknesses of each. It then proposes an improved method: a classified refresh policy based on Bayes theory.
Author: Zhao Xinhui (赵新慧)
Source: Computer and Communications (《交通与计算机》), 2005, No. 5, pp. 63-65 (3 pages)
Keywords: search engine; freshness; refresh policy
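
The two baseline policies named in the abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes the Poisson change model common in the crawler literature, under which a page that changes at rate `change_rate` and is refreshed at rate `refresh_rate` has a time-averaged freshness of (1 - e^(-r)) / r, with r = change_rate / refresh_rate. The function names, the sample change rates, and the budget are all hypothetical, and the paper's own Bayes-based classified policy is not reproduced here.

```python
import math


def expected_freshness(change_rate, refresh_rate):
    """Time-averaged freshness of one page (assumed Poisson change model)."""
    if refresh_rate <= 0:
        return 0.0  # never refreshed: the copy is stale in the long run
    if change_rate <= 0:
        return 1.0  # the page never changes: the copy is always fresh
    ratio = change_rate / refresh_rate
    return (1.0 - math.exp(-ratio)) / ratio


def uniform_policy(change_rates, budget):
    """Uniform refresh: every page gets the same share of the budget."""
    n = len(change_rates)
    return [budget / n] * n


def proportional_policy(change_rates, budget):
    """Proportional refresh: a page's share grows with its change rate."""
    total = sum(change_rates)
    return [budget * rate / total for rate in change_rates]


def average_freshness(change_rates, refresh_rates):
    """Mean expected freshness over all pages in the index."""
    values = [
        expected_freshness(c, f) for c, f in zip(change_rates, refresh_rates)
    ]
    return sum(values) / len(values)


# Hypothetical data: changes per day for three pages, and a total budget
# of three refreshes per day to divide among them.
change_rates = [0.1, 1.0, 10.0]
budget = 3.0

uniform = average_freshness(change_rates, uniform_policy(change_rates, budget))
proportional = average_freshness(
    change_rates, proportional_policy(change_rates, budget)
)
print(f"uniform: {uniform:.3f}  proportional: {proportional:.3f}")
```

Under this assumed model and these sample rates, the uniform allocation gives the higher average freshness, because the proportional allocation spends most of the budget on the fastest-changing page, which goes stale again almost immediately. Grouping pages by estimated change behaviour, as a classified policy does, is one way to allocate the budget between these two extremes.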
