Abstract
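Learning to rank is a research area that combines machine learning and information retrieval. It uses machine learning methods to tune parameters automatically, combine multiple ranking features, and avoid overfitting, yielding a new ranking model for ordering retrieved documents. Among learning-to-rank methods, the listwise approach ranks comparatively well, but existing algorithms of this kind also have several shortcomings: because training is based on all permutations of a list, the time complexity is too high, and their loss functions do not make full use of the extremely important ranking position information. To address this, this paper proposes a new learning algorithm that introduces a position-information loss factor, constructs a new loss function, and uses a more efficient training method. Experiments on the LETOR 4.0 dataset show that the new algorithm improves ranking performance fairly clearly.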
Learning to rank has become a popular research area at the intersection of machine learning and information retrieval. It uses machine learning methods to automatically tune parameters, combine ranking features, and avoid over-fitting. Listwise ranking algorithms generally perform better than other ranking algorithms, but the listwise approach still has room for improvement. For example, the training complexity of some listwise algorithms is high because their loss functions are evaluated over permutations. Moreover, position information has not been fully utilized. In this paper, we not only choose a more efficient learning algorithm but also improve the loss function by introducing position discount factors. Experimental results on the publicly available LETOR data show that the new algorithm is competitive with state-of-the-art algorithms.
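The abstract does not give the form of the new loss function, so the following is only an illustrative sketch of what a position-discounted listwise loss could look like, not the paper's method. It assumes a ListNet-style top-one cross entropy (which avoids enumerating permutations) and a DCG-style discount of 1/log2(1 + rank); the function names and the choice of discount are hypothetical.

```python
import math


def softmax(values):
    """Top-one probabilities: softmax over a list of scores."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def position_discounts(labels):
    """Assumed discount: 1 / log2(1 + rank), where rank is the document's
    position in the ideal ordering (labels sorted in descending order)."""
    order = sorted(range(len(labels)), key=lambda i: -labels[i])
    discounts = [0.0] * len(labels)
    for rank, idx in enumerate(order, start=1):
        discounts[idx] = 1.0 / math.log2(1 + rank)
    return discounts


def discounted_listwise_loss(scores, labels):
    """Cross entropy between label-based and score-based top-one
    distributions, with each term weighted by its position discount.
    Cost is O(n log n) per query rather than permutation-based."""
    p_true = softmax(labels)
    p_pred = softmax(scores)
    weights = position_discounts(labels)
    return -sum(w * pt * math.log(pp)
                for w, pt, pp in zip(weights, p_true, p_pred))


# Hypothetical query with three documents and graded relevance labels.
labels = [2.0, 1.0, 0.0]
loss_good = discounted_listwise_loss([3.0, 2.0, 1.0], labels)  # agrees with labels
loss_bad = discounted_listwise_loss([1.0, 2.0, 3.0], labels)   # reversed order
```

Under these assumptions, scores that agree with the label ordering give a lower loss than reversed scores, and mistakes on documents that belong near the top of the list are penalized more heavily than mistakes near the bottom.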
Source
《小型微型计算机系统》
CSCD
PKU Core (Peking University core journal list)
2017, Issue 1, pp. 20-23 (4 pages)
Journal of Chinese Computer Systems