Research on a Common Feature Selection Method for Multiple Supervised Models (cited by 6)
Abstract Feature selection, a basic pre-processing step for compressing data, is one of the most important problems in pattern recognition, machine learning, and data mining. It has become a research hotspot in recent years, and a large number of feature selection algorithms have emerged. Most existing algorithms were proposed separately for one particular domain, which limits their applicability. In particular, different applications often fall under different supervision models: supervised, semi-supervised, or unsupervised. A concrete feature selection algorithm is usually designed for a given setting; when the setting changes, an algorithm that previously ran smoothly and efficiently becomes inefficient or even useless, and a new algorithm must be sought. This paper uses a kernel method based on the Hilbert-Schmidt Independence Criterion (HSIC) to measure the correlation between a feature subset and the target concept, and on that basis proposes a more broadly applicable feature selection method, FSM-HSIC. The method exploits intrinsic properties of feature selection and unifies the selection process under the supervised, semi-supervised, and unsupervised models in a single formulation. The whole process can be described abstractly from the viewpoint of kernel-based methods, which also gives a deeper understanding of some existing algorithms. Building on this method, a novel algorithm, FSI, is designed for a challenging problem known as interactive feature selection. Theoretical analysis and extensive experiments on real and simulated data show that, compared with several other feature selection algorithms, the proposed algorithm has good efficiency and stability, and suggest that FSM-HSIC can give considerable guidance for producing new feature selection algorithms.
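The abstract does not give the FSM-HSIC or FSI procedures, so the following is only a minimal sketch of the underlying idea for the supervised case: score a feature subset by the biased empirical HSIC between a kernel on the selected features and a kernel on the labels, and grow the subset greedily. The Gaussian kernel, the bandwidth `sigma`, the greedy forward search, and the function names `rbf_kernel`, `center`, `hsic`, and `forward_select` are all illustrative assumptions, not the paper's algorithm.

```python
import math


def rbf_kernel(rows, sigma=1.0):
    """Gaussian kernel matrix for a list of equal-length vectors (assumed kernel choice)."""
    n = len(rows)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
            K[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return K


def center(K):
    """Double-center a kernel matrix: H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row_mean = [sum(r) / n for r in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - col_mean[j] + total for j in range(n)]
            for i in range(n)]


def hsic(K, L):
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2."""
    n = len(K)
    Kc, Lc = center(K), center(L)
    # For symmetric matrices, tr(Kc Lc) equals the sum of elementwise products.
    return sum(Kc[i][j] * Lc[i][j] for i in range(n) for j in range(n)) / (n - 1) ** 2


def forward_select(X, y, k, sigma=1.0):
    """Greedily add the feature that most increases HSIC with the labels.

    Hypothetical search strategy for illustration; not FSM-HSIC or FSI.
    """
    L = rbf_kernel([[v] for v in y], sigma)
    selected, remaining = [], list(range(len(X[0])))
    while remaining and len(selected) < k:
        def score(f):
            cols = selected + [f]
            return hsic(rbf_kernel([[row[c] for c in cols] for row in X], sigma), L)
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: feature 0 equals the label, feature 1 is balanced across both classes.
X_demo = [[0, 0], [0, 1], [0, 0], [0, 1], [1, 0], [1, 1], [1, 0], [1, 1]]
y_demo = [0, 0, 0, 0, 1, 1, 1, 1]
print(forward_select(X_demo, y_demo, 1))
```

Only the kernel on the target side encodes supervision here, which is one way to read the abstract's claim that a single formulation can cover several supervision models; how the paper actually builds that kernel in the semi-supervised and unsupervised cases is not stated in this record.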
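Source: Journal of Computer Research and Development (《计算机研究与发展》; indexed in EI, CSCD, and the Peking University Core Journals list), 2010, No. 9, pp. 1548-1557 (10 pages)
Funding: National High Technology Research and Development Program of China ("863" Program; 2006AA01Z451, 2007AA01Z474, 2007AA010502); National Natural Science Foundation of China (60873204)
Keywords: data mining; pattern recognition; feature selection; kernel-based method; interactive feature; stability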