Abstract
To address feature extraction and classification in text retrieval, this paper proposes a feature selection and learning-to-rank method based on an embedded-space support vector machine. Unlike the combination approaches commonly used for feature selection in multi-class problems, the proposed method transforms an ordinal classification problem into a single binary classification problem, so that the most effective features are selected globally. Compared with the existing Ranking SVM, the number of training samples grows only linearly during the transformation, which greatly improves retrieval speed. Experimental results on synthetic data sets and standard text classification data sets show that the proposed method handles feature selection and ranking in text retrieval well.
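The abstract does not spell out the transformation, so the following is only a rough sketch of why sample growth can be linear rather than quadratic. It contrasts a pairwise reduction in the style of Ranking SVM with a threshold-based reduction that copies each sample once per rank boundary and appends a one-hot boundary tag. The threshold construction is an assumption drawn from the general ordinal regression literature, not the paper's stated method, and the function names `pairwise_transform` and `threshold_transform` are hypothetical.

```python
import numpy as np


def pairwise_transform(X, y):
    """Ranking-SVM-style reduction: one difference vector per pair of
    samples whose ranks differ, so the count grows quadratically in n."""
    i, j = np.triu_indices(len(y), k=1)      # all index pairs with i < j
    keep = y[i] != y[j]                      # pairs with equal rank carry no order
    i, j = i[keep], j[keep]
    return X[i] - X[j], np.sign(y[i] - y[j])


def threshold_transform(X, y, n_ranks):
    """Assumed threshold-style reduction (not taken from the paper):
    each sample is copied once per rank boundary q = 1..n_ranks-1,
    extended with a one-hot tag for q, and labelled +1 if y > q else -1.
    The count is n * (n_ranks - 1), i.e. linear in n."""
    n = X.shape[0]
    blocks, labels = [], []
    for q in range(1, n_ranks):
        tag = np.zeros((n, n_ranks - 1))
        tag[:, q - 1] = 1.0                  # marks which boundary this copy tests
        blocks.append(np.hstack([X, tag]))
        labels.append(np.where(y > q, 1, -1))
    return np.vstack(blocks), np.concatenate(labels)


# Toy data: 300 samples, 5 features, ranks 1..3 in equal proportion.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = np.arange(300) % 3 + 1

X_pair, y_pair = pairwise_transform(X, y)
X_thr, y_thr = threshold_transform(X, y, n_ranks=3)
print("pairwise samples: ", X_pair.shape[0])
print("threshold samples:", X_thr.shape[0])
```

Either output can be passed to any binary linear classifier. In the threshold version the learned weights on the original feature columns are shared across all boundaries, which is one way a single binary problem can score features globally.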
Source
Information and Control (《信息与控制》)
CSCD
PKU Core Journal (北大核心)
2010, No. 5, pp. 629-634 (6 pages)
Funding
Supported by the Natural Science Foundation of Fujian Province (2009J05153)
Keywords
learning to rank
support vector machine
text retrieval
feature selection