Abstract
Learning to rank, a field at the intersection of information retrieval and machine learning, is drawing increasing attention, and many models have been designed to optimize ranking functions. However, few learning-to-rank methods take differences among queries into account. In this paper, each query is modeled as a multivariate Gaussian distribution, the Kullback-Leibler (KL) divergence is used to measure the distance between queries, spectral clustering is applied to group the queries into clusters, and a ranking function is learned for each cluster. Experimental results show that the ranking functions learned with clustering require fewer training examples, yet perform comparably to, or even better than, those learned without clustering.
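The pipeline described in the abstract (Gaussian query models, KL divergence as the distance, spectral clustering) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`fit_gaussian`, `kl_gaussian`, `spectral_bipartition`) are hypothetical. The ridge term on the covariance, the symmetrised KL divergence, the exponential affinity with a median scale, and the restriction to a two-way split via the Fiedler vector are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np


def fit_gaussian(X, ridge=1e-3):
    """Fit a multivariate Gaussian to one query's document feature vectors.

    X has shape (n_documents, n_features). The ridge term is an assumption
    added here to keep the covariance matrix invertible.
    """
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + ridge * np.eye(X.shape[1])
    return mu, cov


def kl_gaussian(p, q):
    """Closed-form KL(p || q) for two multivariate Gaussians given as (mu, cov)."""
    mu_p, cov_p = p
    mu_q, cov_q = q
    d = mu_p.shape[0]
    inv_q = np.linalg.inv(cov_q)
    diff = mu_q - mu_p
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (np.trace(inv_q @ cov_p) + diff @ inv_q @ diff
                  - d + logdet_q - logdet_p)


def spectral_bipartition(queries):
    """Split queries into two clusters with a normalised-Laplacian spectral cut.

    Assumes at least two distinct queries. Returns an array of 0/1 labels;
    which group receives label 0 is arbitrary.
    """
    gaussians = [fit_gaussian(X) for X in queries]
    n = len(gaussians)

    # Symmetrised KL divergence as the pairwise distance (assumption).
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = (kl_gaussian(gaussians[i], gaussians[j])
                                 + kl_gaussian(gaussians[j], gaussians[i]))

    # Turn distances into affinities; the median scale is an assumption.
    scale = np.median(D[D > 0])
    W = np.exp(-D / scale)
    np.fill_diagonal(W, 0.0)

    # Symmetric normalised Laplacian: I - D^{-1/2} W D^{-1/2}.
    deg = W.sum(axis=1)
    L = np.eye(n) - W / np.sqrt(np.outer(deg, deg))

    # eigh returns eigenvalues in ascending order; the eigenvector of the
    # second-smallest eigenvalue (Fiedler vector) gives the two-way split.
    _, eigvecs = np.linalg.eigh(L)
    return (eigvecs[:, 1] > 0).astype(int)
```

Given the labels, one ranking function would then be trained on the queries of each cluster with whatever learning-to-rank method is in use; that step is omitted here. For more than two clusters, the usual approach is to run k-means on the first k eigenvectors instead of thresholding a single one.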
Source
《模式识别与人工智能》
EI
CSCD
Peking University Core Journals (北大核心)
2012, No. 1, pp. 118-123 (6 pages)
Pattern Recognition and Artificial Intelligence
Funding
National Natural Science Foundation of China (Nos. 60736044, 60903107, 61073071)
Specialized Research Fund for the Doctoral Program of Higher Education (No. 20090002120005)
Keywords
Learning to Rank
Ranking Function
Spectral Clustering