Unsupervised feature selection method based on regularized mutual representation

Cited by: 6
Abstract: The redundant features contained in high-dimensional data degrade the training efficiency and generalization ability of machine learning. To improve pattern recognition accuracy and reduce computational complexity, an unsupervised feature selection method based on the Regularized Mutual Representation (RMR) property was proposed. Firstly, the correlations between features were used to build a mathematical model for unsupervised feature selection constrained by the Frobenius norm. Then, a divide-and-conquer ridge regression optimization algorithm was designed to optimize the model quickly. Finally, the importance of each feature was jointly evaluated according to the optimal solution of the model, and a representative feature subset was selected from the original data. On clustering accuracy, the RMR method improves on the Laplacian method by 7 percentage points, on the Nonnegative Discriminative Feature Selection (NDFS) method by 7 percentage points, on the Regularized Self-Representation (RSR) method by 6 percentage points, and on the Self-Representation Feature Selection (SR_FS) method by 3 percentage points. On data redundancy rate, the RMR method is 10 percentage points lower than the Laplacian method, 7 percentage points lower than NDFS, 3 percentage points lower than RSR, and 2 percentage points lower than SR_FS. The experimental results show that the RMR method can effectively select important features, reduce the data redundancy rate, and improve the clustering accuracy of samples.
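The pipeline the abstract describes (features mutually representing one another under a Frobenius-norm-regularized model, solved by ridge regression, with features ranked by the resulting coefficients) can be sketched roughly as follows. This is a minimal illustration in the spirit of the RSR/RMR family, not the paper's exact model: the function names, the closed-form ridge solve, and the row-norm scoring rule are all assumptions made for the sketch.

```python
import numpy as np

def rmr_feature_scores(X, lam=1.0):
    """Score each feature of X (n samples x d features) by how strongly
    it participates in representing the other features."""
    d = X.shape[1]
    G = X.T @ X
    # Closed-form ridge solution of  min_W ||X - X W||_F^2 + lam ||W||_F^2,
    # i.e. W = (X^T X + lam I)^{-1} X^T X.  Each column of W is an
    # independent ridge regression, so the problem splits naturally into
    # per-feature subproblems (a simple divide-and-conquer opportunity).
    W = np.linalg.solve(G + lam * np.eye(d), G)
    # Row i of W holds feature i's coefficients in the representation of
    # every feature; its l2 norm serves as the importance score.
    return np.linalg.norm(W, axis=1)

def select_features(X, k, lam=1.0):
    """Return the indices of the k highest-scoring features."""
    scores = rmr_feature_scores(X, lam)
    return np.argsort(scores)[::-1][:k]
```

A typical use would be `idx = select_features(X, k)` followed by clustering on the reduced matrix `X[:, idx]`, which is how the comparisons in the abstract (clustering accuracy, redundancy rate) are evaluated.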
Authors: WANG Zhiyuan; JIANG Ailian; Osman MUHAMMAD (College of Information and Computer, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China)
Source: Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core, 2020, No. 7, pp. 1896-1900 (5 pages)
Funding: Scientific Research Project for Returned Overseas Scholars of Shanxi Province (2017-051).
Keywords: feature selection; unsupervised learning; divide-and-conquer algorithm; ridge regression; regularization
Related literature

References: 3

Secondary references: 15

1. DASH M. Dimensionality reduction of unsupervised data [C]// Proceedings of the Ninth IEEE International Conference on Tools with Artificial Intelligence. Washington, DC: IEEE Computer Society, 1997: 532-539.
2. DY J G, BRODLEY C E. Feature subset selection and order identification for unsupervised learning [C]// Proceedings of the Seventeenth International Conference on Machine Learning. San Francisco: Morgan Kaufmann Publishers, 2000: 247-254.
3. RODRIGUEZ-LUJAN I, HUERTA R. Quadratic programming feature selection [J]. Journal of Machine Learning Research, 2010, 11: 1491-1516.
4. SHI J, MALIK J. Normalized cuts and image segmentation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(8): 888-905.
5. BELABBAS M A, WOLFE P J. Spectral methods in machine learning and new strategies for very large datasets [J]. Proceedings of the National Academy of Sciences, 2009, 106(2): 369-374.
6. von LUXBURG U. A tutorial on spectral clustering [J]. Statistics and Computing, 2007, 17(4): 395-416.
7. FOWLKES C, BELONGIE S, CHUNG F, et al. Spectral grouping using the Nystrom method [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(2): 214-225.
8. TSAI C Y, CHIU C C. An efficient feature selection approach for clustering: using a Gaussian mixture model of data dissimilarity [C]// 2007 International Conference on Computational Science and Its Applications. Berlin: Springer-Verlag, 2007: 1107-1118.
9. HE X, CAI D, NIYOGI P. Laplacian score for feature selection [C]// Advances in Neural Information Processing Systems 18. Cambridge, MA: MIT Press, 2006: 507-514.
10. MITRA P. Unsupervised feature selection using feature similarity [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(3): 301-312.

Co-citation documents: 34

Co-cited documents: 71

Citing documents: 6

Secondary citing documents: 8
