期刊文献+
共找到3篇文章
< 1 >
每页显示 20 50 100
Exploring features for automatic identification of news queries through query logs
1
作者 Xiaojuan ZHANG Jian LI 《Chinese Journal of Library and Information Science》 2014年第4期31-45,共15页
Purpose:Existing researches of predicting queries with news intents have tried to extract the classification features from external knowledge bases,this paper tries to present how to apply features extracted from quer... Purpose:Existing researches of predicting queries with news intents have tried to extract the classification features from external knowledge bases,this paper tries to present how to apply features extracted from query logs for automatic identification of news queries without using any external resources.Design/methodology/approach:First,we manually labeled 1,220 news queries from Sogou.com.Based on the analysis of these queries,we then identified three features of news queries in terms of query content,time of query occurrence and user click behavior.Afterwards,we used 12 effective features proposed in literature as baseline and conducted experiments based on the support vector machine(SVM)classifier.Finally,we compared the impacts of the features used in this paper on the identification of news queries.Findings:Compared with baseline features,the F-score has been improved from 0.6414 to0.8368 after the use of three newly-identified features,among which the burst point(bst)was the most effective while predicting news queries.In addition,query expression(qes)was more useful than query terms,and among the click behavior-based features,news URL was the most effective one.Research limitations:Analyses based on features extracted from query logs might lead to produce limited results.Instead of short queries,the segmentation tool used in this study has been more widely applied for long texts.Practical implications:The research will be helpful for general-purpose search engines to address search intents for news events.Originality/value:Our approach provides a new and different perspective in recognizing queries with news intent without such large news corpora as blogs or Twitter. 展开更多
关键词 Query intent News query News intent Query classification Automaticidentification
下载PDF
Automatic User Goals Identification Based on Anchor Text and Click-Through Data 被引量:5
2
作者 YUAN Xiaojie DOU Zhicheng ZHANG Lu LIU Fang 《Wuhan University Journal of Natural Sciences》 CAS 2008年第4期495-500,共6页
Understanding the underlying goal behind a user's Web query has been proved to be helpful to improve the quality of search. This paper focuses on the problem of automatic identification of query types according to th... Understanding the underlying goal behind a user's Web query has been proved to be helpful to improve the quality of search. This paper focuses on the problem of automatic identification of query types according to the goals. Four novel entropy-based features extracted from anchor data and click-through data are proposed, and a support vector machines (SVM) classifier is used to identify the user's goal based on these features. Experi- mental results show that the proposed entropy-based features are more effective than those reported in previous work. By combin- ing multiple features the goals for more than 97% of the queries studied can be correctly identified. Besides these, this paper reaches the following important conclusions: First, anchor-based features are more effective than click-through-based features; Second, the number of sites is more reliable than the number of links; Third, click-distribution- based features are more effective than session-based ones. 展开更多
关键词 query classification user goals anchor text click-through data information retrieval
下载PDF
Learning Query Ambiguity Models by Using Search Logs 被引量:1
3
作者 宋睿华 窦志成 +1 位作者 洪小文 俞勇 《Journal of Computer Science & Technology》 SCIE EI CSCD 2010年第4期728-738,共11页
Identifying ambiguous queries is crucial to research on personalized Web search and search result diversity. Intuitively, query logs contain valuable information on how many intentions users have when issuing a query.... Identifying ambiguous queries is crucial to research on personalized Web search and search result diversity. Intuitively, query logs contain valuable information on how many intentions users have when issuing a query. However, previous work showed user clicks alone are misleading in judging a query as being ambiguous or not. In this paper, we address the problem of learning a query ambiguity model by using search logs. First, we propose enriching a query by mining the documents clicked by users and the relevant follow up queries in a session. Second, we use a text classifier to map the documents and the queries into predefined categories. Third, we propose extracting features from the processed data. Finally, we apply a state-of-the-art algorithm, Support Vector Machine (SVM), to learn a query ambiguity classifier. Experimental results verify that the sole use of click based features or session based features perform worse than the previous work based on top retrieved documents. When we combine the two sets of features, our proposed approach achieves the best effectiveness, specifically 86% in terms of accuracy. It significantly improves the click based method by 5.6% and the session based method by 4.6%. 展开更多
关键词 ambiguous query log mining query classification
原文传递
上一页 1 下一页 到第
使用帮助 返回顶部