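Abstract
Entity recommendation for search engines aims to recommend entities related to a user's search query, helping users discover entities of interest and improving their search experience. In addition, to help users better understand the recommendation results, appropriate and reasonable explanations need to be generated both for the recommended entity set and for each recommended entity. Because entity recommendation lets users conveniently obtain information relevant to their search needs and improves their information discovery experience, it has become an indispensable feature of modern search engines. Compared with recommendation tasks in traditional domains, entity recommendation for search engines faces more challenges, such as the ambiguity of entity mentions in search queries and the domain-agnostic nature of entity recommendation. Given the characteristics and challenges of this task, we argue that building a complete entity recommendation system requires solving three research sub-tasks: entity linking, entity recommendation, and recommendation captioning. Entity linking aims to disambiguate the entity mention in a search query and link it to an unambiguous entity in a knowledge base, yielding the query entity that corresponds to the search query. Entity recommendation aims to obtain the set of entities related to the query entity and to rank them. To provide more accurate recommendations, it is often also necessary to use search history to capture the user's entity preferences and to better understand the current query. Recommendation captioning aims to generate explanations for the recommended entity set and for each recommended entity: the set-level explanation describes the relation between the recommended entities in the set and the query entity, while the entity-level explanation gives the reason a single entity is recommended. This paper first introduces the research background and significance of entity recommendation for search engines, its challenges, and its sub-tasks; it then describes in detail the technical challenges, state of research, and solutions for each sub-task; finally, it discusses future research directions and concludes.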
Entity recommendation aims to provide search users with entity suggestions relevant to their information needs, which can help them explore and discover entities of interest. For this reason, over the past few years, major commercial Web search engines have proactively recommended related entities for a query alongside the regular Web search results to enrich and improve the user experience of information retrieval and discovery. To help users understand why the entities are recommended to them, it is also important to provide explanations for recommendations. Building an entity recommendation system is more challenging than building a traditional item-based recommender system because of the ambiguity of the entities mentioned in queries, the need for domain-agnostic recommendation methods for Web-scale queries, and cross-domain recommendation scenarios. To address these challenges, the following three sub-tasks should be studied when building an entity recommendation system for Web search engines. The first is entity linking in queries, which aims to disambiguate the entity mentioned in a query and link it to the corresponding entity in a knowledge base. To improve entity linking accuracy, an entity linking system should consider additional information such as the query context and a user's search history. The second is entity recommendation, which aims to find a set of entities related to a query and then rank them. Specifically, an entity recommendation model typically consists of two components: related entity finding and entity ranking. The former extracts a set of candidate entities related to the query that a user is searching for, while the latter ranks the candidate entities according to how well they meet the user's information needs.
To better understand a user's information needs and capture the user's preferences, an entity recommendation model should exploit additional information such as the user's search history. There are two kinds of search history: short-term search history within a single session and long-term search history across all sessions. The short-term search history, which consists of the preceding in-session queries and clickthrough data, can be exploited to help understand a user's information needs and capture the user's entity preferences in the current session. The long-term search history includes the query history and clickthrough data across all sessions over a period of time; it reflects interests accumulated over time and can be used to capture the user's intrinsic entity preferences. Therefore, to generate entity recommendations that are more relevant to the user's information needs and preferences, it is important for an entity recommendation model to exploit as much of the search history as possible. The third is recommendation captioning, which aims to explain why two entities are related and why a group of entities is recommended to a user. Presenting related entities with plausible explanations helps users quickly grasp the connections between the query and the recommended entities as well as the key facts about these entities, which in turn increases the understandability of the recommendations and user engagement. In this paper, the research background and the challenges of this task are presented first, and then the related studies and methods are introduced. Finally, open problems are discussed, and several future research directions are suggested.
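The three sub-tasks described above can be pictured as a small pipeline. The following is only a toy sketch under stated assumptions, not the method surveyed in the paper: the knowledge base `KB_MENTIONS`, the relatedness table `RELATED`, the function names, the word-overlap disambiguation, the click-based `boost`, and the template caption are all hypothetical stand-ins chosen for illustration.

```python
# Hypothetical toy knowledge base: mention -> candidate entities,
# each with context words used for disambiguation.
KB_MENTIONS = {
    "apple": {
        "Apple Inc.": {"iphone", "mac", "stock"},
        "Apple (fruit)": {"pie", "tree", "juice"},
    },
}

# Hypothetical relatedness table: entity -> {related entity: base score}.
RELATED = {
    "Apple Inc.": {"Microsoft": 0.8, "Samsung": 0.7, "Steve Jobs": 0.9},
    "Apple (fruit)": {"Pear": 0.8, "Orange": 0.6},
}


def link_entity(query, session_queries):
    """Sub-task 1 (entity linking): choose the candidate entity whose
    context words overlap most with the query plus in-session queries."""
    context = set(query.lower().split())
    for past_query in session_queries:
        context |= set(past_query.lower().split())
    best, best_overlap = None, -1
    for mention, candidates in KB_MENTIONS.items():
        if mention not in context:
            continue
        for entity, words in candidates.items():
            overlap = len(words & context)
            if overlap > best_overlap:  # ties keep the first candidate
                best, best_overlap = entity, overlap
    return best


def recommend(entity, clicked_entities, boost=0.5):
    """Sub-task 2 (entity recommendation): find related candidates, then
    rank them, boosting entities the user clicked in past sessions."""
    candidates = RELATED.get(entity, {})  # related entity finding
    scored = {
        e: score + (boost if e in clicked_entities else 0.0)
        for e, score in candidates.items()
    }
    # Entity ranking: highest score first, names break ties.
    return sorted(scored, key=lambda e: (-scored[e], e))


def caption(query_entity, recommended):
    """Sub-task 3 (recommendation captioning): a fixed template stands in
    for a learned explanation generator."""
    return f"{recommended} is related to {query_entity}"


if __name__ == "__main__":
    query_entity = link_entity("apple", ["iphone price"])
    ranked = recommend(query_entity, clicked_entities={"Samsung"})
    for e in ranked:
        print(caption(query_entity, e))
```

In this sketch the short-term history (in-session queries) feeds entity linking, while the long-term history (previously clicked entities) feeds ranking; a real system would replace each hand-written rule with a learned model.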
Authors
HUANG Ji-Zhou (黄际洲)
SUN Ya-Ming (孙雅铭)
WANG Hai-Feng (王海峰)
LIU Ting (刘挺)
Affiliations: Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, Harbin 150001; Baidu Inc., Beijing 100085
Source
Chinese Journal of Computers (《计算机学报》)
EI
CSCD
PKU Core (北大核心)
2019, Issue 7, pp. 1467-1494 (28 pages)
Funding
Supported by the National Key Basic Research Program of China (973 Program) (2014CB340505)
Keywords
search engine (搜索引擎)
entity recommendation (实体推荐)
entity linking (实体链接)
recommendation captioning (推荐理由)