Abstract
Suggestion auto-completion is one of the key means of shaping what users type before they submit a search, and it is an indispensable core function of commercial search engines. Providing better suggestions is a ranking problem. In machine-learned ranking, it is widely recognized that collected training data carries position bias [1-8], which degrades the ranking quality of the trained model. To address this bias, this paper models position bias and relevance separately with deep learning and, combined with improved context-based semantic features, designs an unbiased deep learning-to-rank algorithm for suggestion auto-completion (An Unbiased Deep Learning To Rank Algorithm for Suggestion Auto-completion, UDLTR-SAc) that learns position bias and suggestion relevance simultaneously. UDLTR-SAc automatically learns the bias that position introduces into the training data and thereby learns a more accurate relevance model. In offline experiments it achieves significant gains over both a comparable algorithm that ignores the bias and a classical completion ranking algorithm; in an online A/B test it also achieves a +0.1% (p<0.1) increase in GMV.
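The idea of learning position bias and relevance at the same time can be illustrated with a small sketch. This is not the UDLTR-SAc architecture, which the abstract does not specify. It assumes an additive decomposition, click logit = relevance score + per-position bias, and uses linear towers on synthetic data instead of deep networks. All names (`train_two_tower`, `make_synthetic_log`, `relevance_score`) are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def make_synthetic_log(n=5000, n_positions=4, seed=0):
    """Simulate a click log in which higher positions attract more clicks.

    Assumption for illustration: positions are assigned at random, so
    position and relevance are independent. Real logs are not like this.
    """
    rng = np.random.default_rng(seed)
    features = rng.normal(size=(n, 2))
    positions = rng.integers(0, n_positions, size=n)
    true_w = np.array([2.0, -1.0])                # hidden relevance weights
    true_b = np.linspace(1.0, -2.0, n_positions)  # hidden position bias
    p_click = sigmoid(features @ true_w + true_b[positions])
    clicks = (rng.random(n) < p_click).astype(float)
    return features, positions, clicks


def train_two_tower(features, positions, clicks, n_positions,
                    lr=0.5, epochs=500):
    """Fit click ~ sigmoid(features @ w + b[position]) by gradient descent."""
    n, d = features.shape
    w = np.zeros(d)            # relevance tower (linear here)
    b = np.zeros(n_positions)  # bias tower: one logit per display position
    for _ in range(epochs):
        logits = features @ w + b[positions]
        err = sigmoid(logits) - clicks  # gradient of log loss w.r.t. logits
        w -= lr * features.T @ err / n
        b -= lr * np.bincount(positions, weights=err,
                              minlength=n_positions) / n
    return w, b


def relevance_score(features, w):
    """Rank at serving time with the relevance tower only; bias is dropped."""
    return features @ w


if __name__ == "__main__":
    feats, pos, clicks = make_synthetic_log()
    w, b = train_two_tower(feats, pos, clicks, n_positions=4)
    print("relevance weights:", w)
    print("position bias:", b)
```

Because the bias term absorbs the position effect during training, the relevance weights are fitted to what remains, and only `relevance_score` is used to order suggestions at serving time. In real logs position correlates with relevance, so separating the two is harder than in this randomized toy setting.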
Authors
ZHOU Mingxing (周明星)
YAN Xiangzhou (闫湘洲)
YU Jing (于敬)
GAO Changju (高昌举)
CHEN Yunwen (陈运文)
JI Daqi (纪达麒)
JIN Ke (金克)
Datagrand Co., Ltd., Shanghai 200120, China
Source
Computer Science (《计算机科学》)
CSCD
PKU Core Journal (北大核心)
2023, Issue S02, pp. 681-685 (5 pages)
Funding
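Supported by the Young Science and Technology Rising Star Program of the "Science and Technology Innovation Action Plan", Science and Technology Commission of Shanghai Municipality (21QB1400100).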
Keywords
Position bias
Deep learning
Learning to rank (LTR)
Suggestion
Auto-completion
Context-based semantics