
基于随机游走模型的排序学习方法 (Cited by: 2)

Ranking Learning Method Based on Random Walk Model
Abstract: [Objective] This paper introduces a random walk model to address the difficulty of obtaining labeled training data for supervised ranking learning. [Methods] We propose a ranking learning method based on the random walk with restart model: the walk automatically tags the training data, which reduces the dependence of ranking learning on labeled samples, and the method is evaluated on the OHSUMED data set. [Results] With only 50% of the samples labeled, the method completes the ranking learning task effectively; compared with algorithms trained on fully labeled data, its ranking performance is clearly better than that of RankNet and slightly below that of ListNet. [Limitations] The method requires a separate random walk for each query, so tagging documents for the diverse queries of a real application still demands considerable effort. [Conclusions] The proposed method achieves good ranking performance and effectively solves the problem of labeling training data for ranking learning.
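The pipeline sketched in the abstract is: for each query, spread the relevance labels of the small labeled portion of the training data to the unlabeled documents by a random walk with restart over a document graph, then train a listwise ranker such as ListNet on the automatically tagged set. The sketch below illustrates only the propagation step and is not the paper's implementation; the cosine-similarity graph, the 0.15 restart probability, the grade-weighted seed distribution, and the function name rwr_label_scores are assumptions made for this example.

import numpy as np

def rwr_label_scores(features, labeled_idx, labels,
                     restart=0.15, tol=1e-6, max_iter=1000):
    # Illustrative sketch: propagate relevance labels to the unlabeled
    # documents of ONE query with a random walk with restart (RWR) on a
    # document-similarity graph.
    #   features    : (n_docs, n_feats) feature matrix of the query's documents
    #   labeled_idx : indices of documents whose relevance grade is known
    #   labels      : grades for labeled_idx (e.g. 0/1/2 as in OHSUMED)
    #   restart     : probability of jumping back to the labeled seed documents
    # Returns one score per document; unlabeled documents can then be tagged
    # by thresholding or binning these scores before training the ranker.
    n = features.shape[0]

    # Cosine-similarity graph with non-negative edge weights (assumed here).
    unit = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-12)
    sim = np.clip(unit @ unit.T, 0.0, None)
    np.fill_diagonal(sim, 0.0)

    # Column-normalize so column j is the transition distribution from doc j.
    P = sim / (sim.sum(axis=0, keepdims=True) + 1e-12)

    # Restart distribution: mass only on labeled documents, weighted by grade.
    e = np.zeros(n)
    e[labeled_idx] = np.asarray(labels, dtype=float) + 1.0  # +1 so grade-0 seeds keep some weight
    e /= e.sum()

    # Power iteration:  p <- (1 - c) * P p + c * e
    p = e.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (P @ p) + restart * e
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy usage: 20 documents with 30-dimensional feature vectors, 5 of them labeled.
rng = np.random.default_rng(0)
X = rng.random((20, 30))
scores = rwr_label_scores(X, labeled_idx=[0, 3, 7, 12, 18], labels=[2, 0, 1, 2, 0])

Under these assumptions, the highest-scoring unlabeled documents of each query would receive relevance tags and be merged with the originally labeled half of the data, which the abstract reports is enough for the learned ranker to outperform RankNet trained on fully labeled data.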
Source: Data Analysis and Knowledge Discovery (《数据分析与知识发现》, CSSCI, CSCD), 2017, No. 12, pp. 41-48 (8 pages).
Funding: This work is one of the outputs of the National Social Science Fund of China project "Research on Cognitive Models of Social Information Seeking" (No. 14BTQ049) and the Jiangsu Provincial Social Science Fund major project "Research on General Secretary Xi Jinping's Major Propositions on Constructing Philosophy and Social Sciences with Chinese Characteristics" (No. 16ZD004).
Keywords: Ranking Learning; Random Walk Model; Semi-supervised Learning; ListNet

References: 4

Secondary references: 83


Co-citing documents: 21

Co-cited documents: 11

Citing documents: 2

Secondary citing documents: 1
