Funding: Supported by the National Natural Science Foundation of China (Grant No. 71173164) and the National Key Technology R&D Program of the Ministry of Science and Technology of China (Grant No. 2012BAH33F03).
Abstract: Purpose: In this paper, we attempt to use query refinements to identify users' search intents and seek a method for intent clustering based on real-world query data. Design/methodology/approach: An experiment was conducted to analyze selected search sessions from the America Online (AOL) query logs with a two-stage approach. The first stage identifies the underlying intent by combining query co-occurrence information with query expression similarity. The second stage clusters the identified results by constructing query vectors through random walks on a Markov graph. Findings: Average correctness for identifying search intent is 0.74. Precision, recall and F-score values for intent clustering are 0.73, 0.72 and 0.71, respectively. The results indicate that combining session co-occurrence information and query expression similarity can further filter noise, and that our clustering method is more suitable for sparse data. Research limitations: We use the time-out threshold (15-minute) method to group queries into one session, but a user may have multiple search goals at the same time, and such multi-task behavior is hard to capture in a session defined by time alone. Practical implications: This study provides insights into ways of understanding users' search intents by analyzing their queries and refinements from a new perspective. The results will help search engine developers to identify user intents. Originality/value: We propose a new method to identify users' search intents by combining session co-occurrence information and query expression similarity, and a new method for clustering sparse data.
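The time-out session grouping mentioned under Research limitations can be illustrated with a minimal sketch. It assumes that a new session starts whenever the gap between two consecutive queries from the same user exceeds 15 minutes; the abstract does not say how the gap is measured, and the function name `split_sessions` and the `(user_id, timestamp, query)` record layout are hypothetical, not the authors' implementation.

```python
from datetime import datetime, timedelta


def split_sessions(events, timeout=timedelta(minutes=15)):
    """Group (user_id, timestamp, query) records into sessions.

    Assumption: a session ends when the gap between consecutive
    queries of the same user is strictly greater than `timeout`.
    Returns a list of (user_id, [queries]) tuples.
    """
    sessions = []
    prev_user, prev_ts = None, None
    # Sort by user, then by time, so each user's queries are contiguous.
    for user, ts, query in sorted(events, key=lambda e: (e[0], e[1])):
        same_session = (
            user == prev_user
            and prev_ts is not None
            and ts - prev_ts <= timeout
        )
        if same_session:
            sessions[-1][1].append(query)
        else:
            sessions.append((user, [query]))
        prev_user, prev_ts = user, ts
    return sessions


# Toy log records, invented for illustration only.
log = [
    ("u1", datetime(2006, 3, 1, 10, 0), "jaguar"),
    ("u1", datetime(2006, 3, 1, 10, 5), "jaguar car"),
    ("u1", datetime(2006, 3, 1, 10, 30), "python tutorial"),
    ("u2", datetime(2006, 3, 1, 10, 1), "weather"),
]
result = split_sessions(log)
```

In this toy log the third query from `u1` arrives 25 minutes after the second, so it opens a new session. The sketch also shows the stated limitation: two interleaved tasks issued within 15 minutes of each other would fall into the same session.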
Funding: Supported by the European Commission under the 7th Framework Programme, the Large Knowledge Collider (LarKC) Project (No. FP7-215535).
Abstract: Literature searches on the Web return great volumes of results. A model is presented here to refine the search process using user interests. User interests are analyzed to calculate the semantic similarity among the interest terms, which is then used to refine the query. Traditional general-purpose similarity measures may not always fit a domain-specific context. This paper presents a similarity method for medical literature searches, the normalized MEDLINE distance, which is based on the biomedical literature knowledge source "MEDLINE" and more reasonably reflects the relevance between medical terms. The measure gives more accurate user interest descriptions by calculating the similarities of user interest terms and using them to rerank the interest term list. The accurate user interest descriptions can then be used for query refinement in keyword searches to give more personalized results for the user. The measure also improves the personalization of search results by controlling the number of results returned for each topic of interest.
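The abstract does not give the formula for the normalized MEDLINE distance. The sketch below assumes it follows the form of the well-known normalized Google distance, with MEDLINE document counts in place of web page counts; that assumption, the function name `normalized_distance`, and all counts are illustrative only and should not be read as the authors' definition.

```python
import math


def normalized_distance(fx, fy, fxy, n):
    """Normalized co-occurrence distance in the style of the
    normalized Google distance (assumed form, not from the abstract).

    fx, fy : number of documents containing term x / term y
    fxy    : number of documents containing both terms
    n      : total number of documents (assumed larger than fx and fy)
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # Terms that never occur together are treated as infinitely far apart.
        return math.inf
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


def rerank(seed_count, candidates, n):
    """Order candidate interest terms by distance to a seed term.

    candidates maps term -> (document count, co-occurrence count with seed).
    """
    scored = {
        term: normalized_distance(seed_count, count, both, n)
        for term, (count, both) in candidates.items()
    }
    return sorted(scored, key=scored.get)


# Invented counts, for illustration only.
ranking = rerank(
    seed_count=1000,
    candidates={"term_a": (1000, 500), "term_b": (100, 10)},
    n=100000,
)
```

Smaller values mean the two terms co-occur more often than their individual frequencies would suggest, so the first term in `ranking` is the one treated as most related to the seed term.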