A Survey of Entity Recommendation in Web Search (面向搜索引擎的实体推荐综述)

Abstract: Entity recommendation in Web search aims to provide search users with entity suggestions relevant to their queries and information needs, helping them explore and discover entities of interest. For this reason, major commercial Web search engines have, over the past few years, recommended related entities alongside the regular Web search results to enrich the user experience of information retrieval and discovery, and entity recommendation has become an essential feature of modern search engines. To help users understand why entities are recommended to them, it is also important to provide explanations for the recommendations. Building an entity recommendation system presents more challenges than building a traditional item-based recommender system, because entity mentions in queries are ambiguous, recommendation methods for Web-scale queries must be domain-agnostic, and recommendation scenarios cross domains. To address these challenges, three sub-tasks should be studied when building an entity recommendation system for Web search engines: entity linking, entity recommendation, and recommendation captioning.
The first sub-task is entity linking in queries, which aims to disambiguate the entity mention in a query and link it to the corresponding unambiguous entity in a knowledge base, yielding the query entity. To improve linking accuracy, an entity linking system should consider additional information such as the query context and the user's search history.
The second sub-task is entity recommendation, which aims to find a set of entities related to the query entity and then rank them. An entity recommendation model typically consists of two components: related entity finding and entity ranking. The former extracts a set of candidate entities related to the query a user is searching for, while the latter ranks the candidates according to how well they meet the user's information need. To better understand the user's information needs and capture the user's preferences, the model should exploit additional information such as the user's search history. There are two kinds of search history: short-term history within a single session and long-term history across all sessions. The short-term history consists of the preceding in-session queries and clickthrough data, and can be used to understand the user's current information need and entity preferences in the current session. The long-term history includes the queries and clickthrough data from all sessions over a period of time; it reflects interests accumulated over time and can be used to capture the user's intrinsic entity preferences. Therefore, to generate recommendations that better match the user's information needs and preferences, it is important for an entity recommendation model to exploit as much of the search history as possible.
The third sub-task is recommendation captioning, which aims to explain why a group of entities is recommended to a user and why each individual entity is recommended. The caption for a group explains how the entities in that group are related to the query entity, while the caption for a single entity gives the reason that particular entity is recommended. Presenting related entities with plausible explanations helps users quickly grasp the connections between the query and the recommended entities, as well as the key facts about these entities, which in turn makes the recommendations easier to understand and increases user engagement.
This paper first presents the research background, significance, and challenges of the task together with its sub-tasks, then reviews the technical challenges, related studies, and methods for each sub-task, and finally discusses open problems and suggests several future research directions.
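
The three sub-tasks named in the abstract can be pictured as a small pipeline. The following Python sketch is only an illustration under stated assumptions, not the method of the surveyed systems: the toy knowledge base `KB`, the `Entity` class, and the functions `link_entity`, `recommend`, and `caption` are all hypothetical names, keyword overlap stands in for a real disambiguation model, and click counts stand in for a learned ranking model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Entity:
    id: str
    name: str
    keywords: Set[str]  # context words used for disambiguation (assumption)
    related: Dict[str, str] = field(default_factory=dict)  # related id -> relation


# Hypothetical toy knowledge base; a real system would use a Web-scale one.
KB: Dict[str, Entity] = {
    "apple_inc": Entity("apple_inc", "Apple", {"iphone", "mac", "stock"},
                        {"microsoft": "competitor", "steve_jobs": "co-founder"}),
    "apple_fruit": Entity("apple_fruit", "Apple", {"pie", "tree", "juice"},
                          {"pear": "similar fruit"}),
    "microsoft": Entity("microsoft", "Microsoft", {"windows"}),
    "steve_jobs": Entity("steve_jobs", "Steve Jobs", {"founder"}),
    "pear": Entity("pear", "Pear", {"fruit"}),
}


def link_entity(query: str, history: List[str]) -> Optional[Entity]:
    """Sub-task 1 (entity linking): choose the KB entity whose name appears in
    the query and whose keywords overlap most with the query plus the
    in-session query history."""
    tokens = set(query.lower().split())
    context = set(tokens)
    for past_query in history:
        context |= set(past_query.lower().split())
    candidates = [e for e in KB.values() if e.name.lower() in tokens]
    if not candidates:
        return None
    return max(candidates, key=lambda e: len(e.keywords & context))


def recommend(query_entity: Entity, clicked_ids: List[str]) -> List[str]:
    """Sub-task 2 (entity recommendation): related entity finding, then
    ranking. Entities the user clicked more often in the past rank first;
    ties are broken alphabetically by id."""
    candidates = list(query_entity.related)  # related entity finding
    return sorted(candidates, key=lambda eid: (-clicked_ids.count(eid), eid))


def caption(query_entity: Entity, recommended_id: str) -> str:
    """Sub-task 3 (recommendation captioning): explain one recommendation by
    the KB relation between the recommended entity and the query entity."""
    relation = query_entity.related[recommended_id]
    return f"{KB[recommended_id].name}: {relation} of {query_entity.name}"


if __name__ == "__main__":
    entity = link_entity("apple", history=["iphone price"])
    if entity is not None:
        for eid in recommend(entity, clicked_ids=["steve_jobs"]):
            print(caption(entity, eid))
```

In this sketch the in-session query "iphone price" plays the role of short-term search history for disambiguation, and the list of previously clicked entity ids plays the role of long-term history for ranking.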

Authors: HUANG Ji-Zhou (黄际洲); SUN Ya-Ming (孙雅铭); WANG Hai-Feng (王海峰); LIU Ting (刘挺) (Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, Harbin 150001; Baidu Inc., Beijing 100085)

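Affiliations: Research Center for Social Computing and Information Retrieval, School of Computer Science, Harbin Institute of Technology; Baidu Inc.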

Source: Chinese Journal of Computers (《计算机学报》), 2019, No. 7, pp. 1467-1494 (28 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

Funding: Supported by the National Basic Research Program of China (973 Program) (2014CB340505).

Keywords: search engine; entity recommendation; entity linking; recommendation captioning

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

