
Automatic User Goals Identification Based on Anchor Text and Click-Through Data (Cited by: 5)

Abstract: Understanding the goal behind a user's Web query has been shown to help improve search quality. This paper addresses the automatic identification of query types according to those goals. Four novel entropy-based features extracted from anchor data and click-through data are proposed, and a support vector machine (SVM) classifier uses these features to identify the user's goal. Experimental results show that the proposed entropy-based features are more effective than those reported in previous work. By combining multiple features, the goals of more than 97% of the queries studied can be correctly identified. The paper also reaches three conclusions: first, anchor-based features are more effective than click-through-based features; second, the number of sites is more reliable than the number of links; third, click-distribution-based features are more effective than session-based ones.
Source: Wuhan University Journal of Natural Sciences (武汉大学学报(自然科学英文版)), CAS, 2008, No. 4, pp. 495-500 (6 pages)
Funding: the Tianjin Applied Fundamental Research Plan (07JCYBJC14500)
Keywords: query classification; user goals; anchor text; click-through data; information retrieval
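
The abstract does not define its four entropy-based features, so the following is only a minimal sketch of what one click-distribution feature of this kind might look like, not the authors' formulation. It assumes a hypothetical per-query click log (a list of clicked URLs) and computes the Shannon entropy of the click distribution: clicks concentrated on one result give low entropy, while clicks spread over many results give high entropy. The function name and the sample data are illustrative assumptions.

```python
import math
from collections import Counter


def click_entropy(clicked_urls):
    """Shannon entropy (in bits) of the click distribution for one query.

    `clicked_urls` is a hypothetical click log: one entry per recorded click.
    """
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no clicks recorded: treat the entropy as zero
    entropy = 0.0
    for n in counts.values():
        p = n / total
        entropy -= p * math.log2(p)
    return entropy


# Illustrative (made-up) logs: one query whose clicks concentrate on a
# single site, and one whose clicks spread evenly over four sites.
concentrated = ["a.example"] * 9 + ["b.example"]
spread = ["a.example", "b.example", "c.example", "d.example"] * 3

low = click_entropy(concentrated)
high = click_entropy(spread)
```

In a pipeline like the one the abstract describes, a value such as this would be one component of the feature vector passed to the SVM classifier; the analogous anchor-based features could be computed the same way over the sites or links whose anchor text matches the query.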