期刊文献+

Learning Query Ambiguity Models by Using Search Logs 被引量:1

Learning Query Ambiguity Models by Using Search Logs
原文传递
导出
摘要 Identifying ambiguous queries is crucial to research on personalized Web search and search result diversity. Intuitively, query logs contain valuable information on how many intentions users have when issuing a query. However, previous work showed user clicks alone are misleading in judging a query as being ambiguous or not. In this paper, we address the problem of learning a query ambiguity model by using search logs. First, we propose enriching a query by mining the documents clicked by users and the relevant follow up queries in a session. Second, we use a text classifier to map the documents and the queries into predefined categories. Third, we propose extracting features from the processed data. Finally, we apply a state-of-the-art algorithm, Support Vector Machine (SVM), to learn a query ambiguity classifier. Experimental results verify that the sole use of click based features or session based features perform worse than the previous work based on top retrieved documents. When we combine the two sets of features, our proposed approach achieves the best effectiveness, specifically 86% in terms of accuracy. It significantly improves the click based method by 5.6% and the session based method by 4.6%. Identifying ambiguous queries is crucial to research on personalized Web search and search result diversity. Intuitively, query logs contain valuable information on how many intentions users have when issuing a query. However, previous work showed user clicks alone are misleading in judging a query as being ambiguous or not. In this paper, we address the problem of learning a query ambiguity model by using search logs. First, we propose enriching a query by mining the documents clicked by users and the relevant follow up queries in a session. Second, we use a text classifier to map the documents and the queries into predefined categories. Third, we propose extracting features from the processed data. Finally, we apply a state-of-the-art algorithm, Support Vector Machine (SVM), to learn a query ambiguity classifier. Experimental results verify that the sole use of click based features or session based features perform worse than the previous work based on top retrieved documents. When we combine the two sets of features, our proposed approach achieves the best effectiveness, specifically 86% in terms of accuracy. It significantly improves the click based method by 5.6% and the session based method by 4.6%.
出处 《Journal of Computer Science & Technology》 SCIE EI CSCD 2010年第4期728-738,共11页 计算机科学技术学报(英文版)
关键词 ambiguous query log mining query classification ambiguous query, log mining, query classification
  • 相关文献

参考文献21

  • 1Song R, Luo Z, Nie J Y, Yu Y, Hon H W. Identification of ambiguous queries in Web search. Information Processing and Management, 2008, 45(2): 216-229.
  • 2Dou Z, Song R, Wen J R. A large-scale evaluation and analysis of personalized search strategies. In Proc. the 16th International Conference on World Wide Web (WWW2007), Banff, Canada, May 8-12, 2007, pp.581-590.
  • 3Sanderson M. Ambiguous queries: Test collections need more sense. In Proc. the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR2008), Singapore, July 20-24, 2008, pp.499- 506.
  • 4Radlinski F, Dumais S. Improving personalized Web search using result divcrsification. In Proc. the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2006), Seattle, USA, Aug. 6-11, 2006, pp.691-692.
  • 5Li Y, Zheng Z, Dai K. KDD CUP-2005 report: Facing a great challenge. SIGKDD Explor. Newsl., 2005, 7(2): pp.91-99.
  • 6Vapnik V N. Principles of Risk Minimization for Learning Theory. Advances in Neural Information Processing Systems 4, Morgan Kaufmann, 1992, pp.831-838.
  • 7Mihalcea R, Pedersen T. Advances in word sense disambiguation. In Tutorials at the 20th National Conference on Artificial Intelligence, Pittsburgh, USA, July 9-13, 2005.
  • 8Krovetz R, Croft B W. Lexical ambiguity and information retrieval. ACM Trans. Inf. Syst. 1992, 10(2): 115-141.
  • 9Voorhees E M. Using WordNet to disambiguate word senses for text retrieval. In Proc. the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 1993), Pittsburgh, USA, June 27-July 1, 1993, pp.171-180.
  • 10Carbonell J, Goldstein J. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Proc. the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 1998), Melbourne, Australia, Aug. 24- 28, 1998, pp.335-336.

同被引文献5

引证文献1

二级引证文献3

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部