Applications of Embedded Space Ranking SVM to Document Retrieval
Abstract To address feature extraction and classification in text retrieval, this paper proposes a feature selection and learning-to-rank method based on an embedded space support vector machine. Unlike the combination approaches commonly used for multi-class feature selection, the proposed method converts an ordinal classification problem into a single binary classification problem and selects the most effective features over the problem as a whole. Compared with the existing Ranking SVM, the number of training samples grows only linearly during this conversion, which greatly increases retrieval speed. Experimental results on synthetic data sets and standard text classification data sets show that the proposed method handles feature selection and ranking in text retrieval well.
Source Information and Control (《信息与控制》), CSCD, Peking University Core Journal, 2010, No. 5, pp. 629-634 (6 pages)
Funding Natural Science Foundation of Fujian Province (2009J05153)
Keywords learning to rank; support vector machine; text retrieval; feature selection
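The abstract contrasts two ways of turning ranked data into a binary classification problem but does not spell out the paper's construction. The sketch below is only an illustration of that contrast, not the authors' method. `pairwise_transform` follows the usual Ranking SVM idea of one difference vector per pair of differently ranked samples, which grows quadratically with the number of samples. `threshold_embedding` is a hypothetical embedding, assumed here for illustration: each sample is copied once per rank threshold with a one-hot threshold indicator appended, so the sample count grows linearly. Both function names, the indicator encoding, and the 1..K rank convention are assumptions.

```python
import numpy as np


def pairwise_transform(X, y):
    """Ranking SVM style reduction: one difference vector per pair of
    samples with different ranks, labeled by which one ranks higher."""
    diffs, labels = [], []
    n = len(y)
    for i in range(n):
        for j in range(i + 1, n):
            if y[i] == y[j]:
                continue  # equal ranks carry no ordering information
            diffs.append(X[i] - X[j])
            labels.append(1 if y[i] > y[j] else -1)
    return np.array(diffs), np.array(labels)


def threshold_embedding(X, y, num_ranks):
    """Hypothetical linear-growth reduction (an assumption, not taken from
    the paper): ranks are 1..num_ranks; each sample is copied once per
    threshold k = 1..num_ranks-1 with a one-hot indicator of k appended,
    and labeled +1 if its rank exceeds k, otherwise -1."""
    rows, labels = [], []
    for x, rank in zip(X, y):
        for k in range(1, num_ranks):
            indicator = np.zeros(num_ranks - 1)
            indicator[k - 1] = 1.0
            rows.append(np.concatenate([x, indicator]))
            labels.append(1 if rank > k else -1)
    return np.array(rows), np.array(labels)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = rng.integers(1, 4, size=200)  # ranks drawn from {1, 2, 3}
    Xp, _ = pairwise_transform(X, y)
    Xe, _ = threshold_embedding(X, y, num_ranks=3)
    print("pairwise samples: ", len(Xp))
    print("embedded samples: ", len(Xe))
```

With n samples and K ranks, the embedding always yields n(K-1) rows, while the pairwise reduction can yield on the order of n² rows. Either output can then be passed to any binary linear classifier, whose weights on the original feature columns could serve as a basis for feature selection.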