Abstract
Because the usage, query, and retrieval models of academic search engines have not been studied in depth, this paper examines the distribution of queries received by academic search engines and proposes a query identification method. Academic search queries are analyzed and divided into navigational queries and informational queries. A navigational query is defined here as one in which the user is looking for a specific academic document. Under this definition, such queries are identified with a machine learning method that introduces a new set of features: a classifier for navigational queries is trained with Gradient Boosted Trees (GBT). The results show a precision of 0.68 at a recall of 0.68, with an F-score of 0.677.
Source
Electronic Science and Technology (《电子科技》)
2016, No. 12, pp. 142-144, 147 (4 pages)
Keywords
academic search
navigation
query
machine learning