A Feature Processing Method Based on Ranking Algorithm ListNet (original title: 基于ListNet排序学习的特征处理方法)
Cited by: 2
Abstract: Learning to rank lies at the intersection of machine learning and information retrieval and automatically learns a ranking model from large labeled training sets. The choice of features strongly affects a ranking model's predictions, yet learning-to-rank research has paid little attention to the feature space. To address this, the paper proposes a feature processing method: the data set is first extended with new features produced by feature recombination based on principal component analysis (PCA), and the feature selection implicit in the ranking algorithm is then carried out on the extended data set. The ListNet ranking algorithm is evaluated with ranking evaluation measures on the LETOR 4.0 data sets (MQ2007, MQ2008). The experiments compare ranking performance before and after feature processing and examine how the number of added features affects the ranking results. They show that ranking functions learned on the processed features generally outperform those learned on the original features.
Authors: LI Wei-ning (李伟宁); WANG Lei (王磊) (School of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; School of Electronic Science and Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China)
Source: Computer Technology and Development (《计算机技术与发展》), 2018, No. 9, pp. 30-33, 37 (5 pages)
Funding: National High Technology Research and Development Program of China ("863" Program), grant 2006AA01Z201
Keywords: information retrieval; learning to rank; feature processing (feature selection); ListNet
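
A minimal Python sketch of the pipeline the abstract outlines, under stated assumptions: NumPy only, a linear scoring function, and ListNet's top-one cross-entropy loss. The abstract does not say how the PCA-based recombination is carried out, so `pca_augment` (a hypothetical helper) simply appends the projections onto the first `k` principal components as new columns. The toy data, step size, and all function names are illustrative and not taken from the paper.

```python
import numpy as np

def pca_augment(X, k):
    """Append projections onto the first k principal components as new columns.

    Assumption: this is one plausible reading of "PCA-based feature
    recombination"; the paper's exact construction is not given in the abstract.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]
    return np.hstack([X, Xc @ components.T]), mean, components

def log_top_one_prob(scores):
    """Numerically stable log of the top-one (softmax) distribution."""
    z = scores - scores.max()
    return z - np.log(np.exp(z).sum())

def top_one_prob(scores):
    return np.exp(log_top_one_prob(scores))

def listnet_loss(scores, labels):
    """Cross entropy between label and score top-one distributions (one query)."""
    p_true = top_one_prob(labels.astype(float))
    return -np.sum(p_true * log_top_one_prob(scores))

def listnet_grad(X, w, labels):
    """Gradient of listnet_loss with respect to w for a linear scorer X @ w."""
    p_true = top_one_prob(labels.astype(float))
    p_pred = top_one_prob(X @ w)
    return X.T @ (p_pred - p_true)

# Toy data: 8 documents with 5 features for a single hypothetical query.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))
labels = np.array([2, 0, 1, 0, 0, 1, 2, 0])  # graded relevance, made up

X_aug, mean, components = pca_augment(X, k=2)  # 5 original + 2 new features

w = np.zeros(X_aug.shape[1])
loss_before = listnet_loss(X_aug @ w, labels)
for _ in range(500):
    w -= 0.01 * listnet_grad(X_aug, w, labels)  # plain gradient descent
loss_after = listnet_loss(X_aug @ w, labels)

print(f"loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

With a linear scorer, the magnitude of each learned weight indicates how much an original or added feature contributes, which is one way to read the implicit feature selection the abstract mentions; the paper's actual procedure may differ. The returned `mean` and `components` allow the same projection to be applied to held-out queries.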