
Robust feature selection and classification algorithm based on partial least squares regression

Cited: 9
Abstract: A Robust Feature Selection and Classification algorithm based on Partial Least Squares Regression (RFSC-PLSR) was proposed to address redundancy and multi-collinearity among features in feature selection. Firstly, a sample class consistency coefficient based on neighborhood estimation was defined. Then, k-Nearest-Neighbor (kNN) operations with different values of k were used to screen out conservative samples whose local class distribution structure is stable, and a partial least squares regression model was built on these samples to perform robust feature selection. Finally, from a global structure perspective, a partial least squares classification model was built using the class consistency coefficients and the selected feature subset over all samples. Numerical experiments were carried out on five UCI data sets of different dimensionality. The results show that, compared with four typical classifiers (Support Vector Machine (SVM), Naive Bayes (NB), Back-Propagation Neural Network (BPNN) and Logistic Regression (LR)), RFSC-PLSR is strongly competitive in classification accuracy, robustness and computational efficiency across low-, medium- and high-dimensional settings.
Source: Journal of Computer Applications (计算机应用; CSCD, Peking University core journal), 2017, No. 3, pp. 871-875 (5 pages).
Funding: National Natural Science Foundation of China (U1304602, 61473266, 61305080); Key Scientific Research Project of Henan Higher Education Institutions (15A120016).
Keywords: Partial Least Squares Regression (PLSR); k Nearest Neighbor (kNN); noise sample; feature selection; robustness

