期刊文献+

Optimizing top-k retrieval: submodularity analysis and search strategies 被引量:1

Optimizing top-k retrieval: submodularity analysis and search strategies
原文传递
导出
摘要 The key issue in top-k retrieval, finding a set of k documents (from a large document collection) that can best answer a user's query, is to strike the optimal balance between relevance and diversity. In this paper, we study the top-k re- trieval problem in the framework of facility location analysis and prove he submodularity of that objective function which provides a theoretical approximation guarantee of factor 1 -1/ε for the (best-first) greedy search algorithm. Furthermore, we propose a two-stage hybrid search strategy which first ob- tains a high-quality initial set of top-k documents via greedy search, and then refines that result set iteratively via local search. Experiments on two large TREC benchmark datasets show that our two-stage hybrid search strategy approach can supersede the existing ones effectively and efficiently. The key issue in top-k retrieval, finding a set of k documents (from a large document collection) that can best answer a user's query, is to strike the optimal balance between relevance and diversity. In this paper, we study the top-k re- trieval problem in the framework of facility location analysis and prove he submodularity of that objective function which provides a theoretical approximation guarantee of factor 1 -1/ε for the (best-first) greedy search algorithm. Furthermore, we propose a two-stage hybrid search strategy which first ob- tains a high-quality initial set of top-k documents via greedy search, and then refines that result set iteratively via local search. Experiments on two large TREC benchmark datasets show that our two-stage hybrid search strategy approach can supersede the existing ones effectively and efficiently.
出处 《Frontiers of Computer Science》 SCIE EI CSCD 2016年第3期477-487,共11页 中国计算机科学前沿(英文版)
基金 This work was supported by the National Natural Science Foundation of China (Grant Nos. 61572135 and 61170085), 973 project (2010CB328106), Program for New Century Excellent Talents in China (NCET-10-0388).
关键词 top-k retrieval DIVERSIFICATION submodular function maximization top-k retrieval, diversification, submodular function maximization
  • 相关文献

参考文献25

  • 1Manning C D, Raghavan P, Schiitze H. Introduction to Information Re- trieval. Cambridge: Cambridge University Press, 2008.
  • 2Chen H, Karger D R. Less is more: probabilistic models for retriev- hag fewer relevant documents. In: Proceedings of the 29th Annual In- ternational ACM SIGIR Conference on Research and Development ha Information Retrieval. 2006, 429-436.
  • 3Carbonell J G, Goldstein J. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development ha Information Retrieval. 1998, 335-336.
  • 4Zhai C, Cohen W W, Lafferty J D. Beyond independent relevance: Methods and evaluation metrics for subtopic retrieval. In: Proceedings of the 26th Annal International ACM SIGIR Conference on Research and Development in Information Retrieval. 2003, 10-17.
  • 5Wang J, Zhu J. Portfolio theory of information retrieval. In: Proceed- hags of the 32rid International ACM SIGIR Conference on Research and Development in Information Retrieval. 2009, 115-122.
  • 6Zuccon G, Azzopardi L. Using the quantum probability ranking prin- ciple to rank interdependent documents. In: Proceedings of the 32nd European Conference on Information Retrieval Research. 2010, 357- 369.
  • 7Chandar P, Carterette B. Diversification of search results using web- graphs. In: Proceedings of the 33rd International ACM SIGIR Con- ference on Research and Development ha Information Retrieval. 2010, 869-870.
  • 8Santos R L T, Macdonald C, Ounis I. Intent-aware search result diver- sification. In: Proceedings of the 34th International ACM SIGIR Con- ference on Research and Development in Information ReWieval. 2011, 595-604.
  • 9Zuccon G, Azzopardi L, Zhang D,Wang J. Top-k retrieval using facility location analysis. In: Proceedings of the 34th European Conference on Information Retrieval Research. 2012, 305-316.
  • 10Gonzalez T E Handbook of Approximation Algorithms and Meta- heuristics. Boca Raton: CRC Press, 2007.

同被引文献1

引证文献1

二级引证文献1

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部