Abstract
The Web keeps growing while its pages change frequently, so a search engine must periodically refresh the pages it indexes. Refreshing is costly in both time and system resources, because the engine has to crawl the Web and download the pages again. Improving refresh efficiency is therefore a key problem in search engine technology. This paper compares two existing approaches, the uniform refresh policy and the proportional refresh policy, and points out the strengths and weaknesses of each. It then proposes an improved approach: a classified refresh policy based on Bayesian classification.
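The abstract names the two baseline policies without defining them. The sketch below is an illustration only, under assumptions that are not taken from the paper: each page changes as a Poisson process, a page refreshed at a fixed interval has expected freshness (1 - e^(-r)) / r with r = change rate / refresh rate, "uniform" means every page gets the same refresh rate, and "proportional" means the refresh rate is proportional to the page's change rate. The function names, change rates, and budget are all hypothetical. The Bayes-based classified policy is not sketched, because the abstract does not describe it.

```python
import math


def expected_freshness(change_rate, refresh_rate):
    """Time-averaged probability that a local copy is up to date.

    Assumes Poisson page changes and refreshes at fixed intervals
    (an assumption for this sketch, not a claim about the paper).
    """
    if refresh_rate <= 0:
        return 0.0  # never refreshed: treated as always stale
    if change_rate <= 0:
        return 1.0  # never changes: always fresh
    r = change_rate / refresh_rate
    return (1.0 - math.exp(-r)) / r


def uniform_policy(change_rates, budget):
    """Give every page the same share of the download budget."""
    n = len(change_rates)
    return [budget / n] * n


def proportional_policy(change_rates, budget):
    """Give each page a share proportional to its change rate."""
    total = sum(change_rates)
    return [budget * rate / total for rate in change_rates]


def average_freshness(change_rates, refresh_rates):
    """Mean expected freshness over all pages."""
    values = [
        expected_freshness(c, f) for c, f in zip(change_rates, refresh_rates)
    ]
    return sum(values) / len(values)


# Hypothetical data: changes per day for three pages, and a total
# budget of three downloads per day.
change_rates = [0.1, 1.0, 10.0]
budget = 3.0

uniform = average_freshness(change_rates, uniform_policy(change_rates, budget))
proportional = average_freshness(
    change_rates, proportional_policy(change_rates, budget)
)
print(f"uniform: {uniform:.3f}  proportional: {proportional:.3f}")
```

With these made-up numbers the uniform allocation yields the higher average freshness, because the proportional allocation spends most of the budget on a page that changes too quickly to stay fresh. Other change-rate distributions or freshness definitions may behave differently.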
Source
Computer and Communications (《交通与计算机》)
2005, No. 5, pp. 63-65 (3 pages)
Keywords
search engine
freshness
refresh policy