Cost-Sensitive Listwise Ranking Approach (代价敏感的列表排序算法)

Cited by: 3

Abstract: Learning to rank is an active research area in information retrieval (IR) and machine learning. In IR, the order of the documents at the top of the ranked list matters most. However, the listwise approach, a classical family of learning-to-rank methods, cannot emphasize the order at the top of the ranked list. To address this problem, the idea of cost-sensitive learning is incorporated into the listwise approach, and a framework for cost-sensitive listwise ranking is proposed. The framework introduces document weights into the listwise loss functions, and the weights are computed from the evaluation measure Normalized Discounted Cumulative Gain (NDCG). It is further proven that the loss functions of cost-sensitive listwise approaches are upper bounds of the NDCG loss. To verify the effectiveness of the framework, a cost-sensitive ListMLE algorithm is proposed within it, and its order preservation and generalization are analyzed theoretically; the loss function of cost-sensitive ListMLE is proven to be order preserving. Experimental results on benchmark datasets indicate that cost-sensitive ListMLE outperforms the baseline learning-to-rank algorithms at predicting the top of the ranked list.
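The abstract describes the framework only at a high level, so the following is a minimal sketch of the general idea rather than the authors' formulation. It computes NDCG@k and a ListMLE-style negative log-likelihood in which each top-k term is multiplied by a document weight. The weighting scheme (each document's share of the ideal DCG@k) and the function names `dcg_at_k`, `ndcg_at_k`, and `weighted_listmle_loss` are assumptions made for illustration; the paper's actual weights are not given in this record.

```python
import math


def dcg_at_k(labels, k):
    """DCG@k for relevance labels given in ranked order (position 0 is the top)."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(labels[:k]))


def ndcg_at_k(labels_in_ranked_order, k):
    """NDCG@k: DCG@k divided by the DCG@k of the ideal (label-sorted) ranking."""
    ideal = dcg_at_k(sorted(labels_in_ranked_order, reverse=True), k)
    return dcg_at_k(labels_in_ranked_order, k) / ideal if ideal > 0 else 0.0


def weighted_listmle_loss(scores, labels, k):
    """ListMLE-style loss over the top k positions with per-document weights.

    ASSUMPTION: the weight of a document is its share of the ideal DCG@k.
    This is an illustrative choice, not the weighting defined in the paper.
    """
    # Ground-truth permutation: documents sorted by descending relevance label.
    order = sorted(range(len(labels)), key=lambda i: labels[i], reverse=True)
    ideal = dcg_at_k([labels[i] for i in order], k)
    loss = 0.0
    for pos, doc in enumerate(order[:k]):
        # Numerically stable log-sum-exp over the documents not yet placed.
        remaining = [scores[j] for j in order[pos:]]
        top = max(remaining)
        log_z = top + math.log(sum(math.exp(s - top) for s in remaining))
        # Hypothetical NDCG-derived weight: larger for relevant, top-ranked documents.
        gain = (2 ** labels[doc] - 1) / math.log2(pos + 2)
        weight = gain / ideal if ideal > 0 else 0.0
        # Plackett-Luce negative log-likelihood term, scaled by the weight.
        loss += weight * (log_z - scores[doc])
    return loss


if __name__ == "__main__":
    labels = [2, 1, 0]
    print(ndcg_at_k(labels, 3))
    print(weighted_listmle_loss([3.0, 2.0, 1.0], labels, 3))  # scores agree with labels
    print(weighted_listmle_loss([1.0, 2.0, 3.0], labels, 3))  # scores reversed
```

Under this assumed weighting, scores that agree with the relevance labels give a smaller loss than reversed scores, and mistakes on highly relevant documents near the top are penalized most, which is the behavior the cost-sensitive framework is meant to encourage.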

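Authors: Lu Min (卢敏), Huang Yalou (黄亚楼), Xie Maoqiang (谢茂强), Wang Yang (王扬), Liu Jie (刘杰), Liao Zhen (廖振)

Affiliations: College of Information Technical Science, Nankai University; College of Software, Nankai University

Source: Journal of Computer Research and Development (计算机研究与发展), indexed in EI, CSCD, and the Peking University Core Journals list; 2012, No. 8, pp. 1738-1746 (9 pages)

Funding: Specialized Research Fund for the Doctoral Program of Higher Education (20100031110096); Fundamental Research Funds for the Central Universities (65010571); National Natural Science Foundation of China (61105049)

Keywords: learning to rank; listwise approach; cost-sensitive; order preservation; generalization

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]; TP18 [Automation and Computer Technology: Control Theory and Control Engineering]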
