Abstract
For the two-class target feature selection problem, this paper first examines the linear separability of the feature space and gives a condition under which the classes can be separated by a linear classification hypersurface. Second, drawing on the principle of support vector machines (SVMs), it analyzes the basic properties of the feature separability criterion. Finally, it defines the efficiency rate of each feature according to that feature's contribution to the classification margin, and uses this rate for feature selection and for reducing the dimensionality of the feature space. Experiments on measured data and on the public UCI (University of California, Irvine) database show that, compared with the classical Relief feature selection algorithm, the proposed algorithm clearly improves both recognition performance and generalization ability.
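The abstract does not state the formula for the efficiency rate, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's algorithm. It assumes a linear SVM with weight vector w, for which the margin is 2/||w||, and it assumes that the efficiency rate of feature j can be taken as that feature's share of the squared weight norm, w_j^2/||w||^2. The function names, the Pegasos-style trainer, the omission of a bias term (features are assumed centered), and the toy data are all hypothetical choices made for illustration.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style stochastic subgradient descent for a linear SVM.

    Assumption: features are centered, so no bias term is fitted.
    Labels in y must be +1 or -1.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    t = 0
    for _ in range(epochs):
        for i in rng.sample(range(n), n):  # one shuffled pass over the data
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrink step from the L2 regularizer.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge-loss step, taken only when the sample violates the margin.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def feature_efficiency(w):
    """Assumed efficiency rate: each feature's share of ||w||^2."""
    total = sum(wj * wj for wj in w)
    if total == 0.0:
        return [0.0] * len(w)
    return [wj * wj / total for wj in w]


def select_features(rates, k):
    """Return the indices of the k features with the highest rate."""
    ranked = sorted(range(len(rates)), key=lambda j: rates[j], reverse=True)
    return sorted(ranked[:k])


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[2.0, 0.1], [3.0, -0.2], [2.5, 0.3], [3.5, -0.1],
     [-2.0, 0.2], [-3.0, -0.3], [-2.5, -0.1], [-3.5, 0.1]]
y = [1, 1, 1, 1, -1, -1, -1, -1]

w = train_linear_svm(X, y)
rates = feature_efficiency(w)
print("efficiency rates:", rates)
print("selected features:", select_features(rates, 1))
```

Under this assumed definition the rates sum to one, so they can be read as a ranking of how much each feature contributes to the separating direction; dropping the lowest-ranked features is then one way to reduce the dimensionality of the feature space.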
Source
Journal of Software (《软件学报》)
EI
CSCD
PKU Core Journals
2008, Issue 4, pp. 842-850 (9 pages)
Funding
National Natural Science Foundation of China, No. 60402032
Keywords
feature selection
efficiency rate
class margin
support vector machine (SVM)