Object cache optimization strategy for real-time vertical search engines
Abstract: A new object cache optimization strategy is proposed for real-time vertical search engines, addressing problems such as the rapidly changing popularity of search objects and query-driven data crawling. A popular-object prediction model is designed from the relationships between objects and their attributes in order to predict how the set of popular objects will change. Because both user queries and object changes follow Poisson processes, a method for computing the maximum data freshness is derived, which yields a theoretically optimal strategy for resource allocation and dynamic balancing. Extensive comparative experiments show that, with only a small increase in overhead, the new strategy achieves clearly higher average freshness and precision of user query results than a traditional fixed-frequency cache strategy.
Source: Journal of Zhejiang University (Engineering Science), 2011, No. 1, pp. 14-19, 36 (7 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
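Funding: National Natural Science Foundation of China (60603044, 60803003); Major Science and Technology Project of the Zhejiang Provincial Science and Technology Program (2006c11108)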
Keywords: cache strategy; real-time search; vertical search; search engine