
New Regularized Estimation in High-dimensional Data Mining (高维数据挖掘中的正则化估计新方法)
Abstract: For high-dimensional data under a linear regression model, the paper proposes a new regularized estimation method for variable selection (also called feature extraction), using variable selection as the dimension-reduction technique. The method explicitly accounts for the effect of the noise level (variance) on the regularized estimate, so the variance is estimated while the regularization path is being computed. The implementation details of the algorithm are derived from the KKT conditions of the underlying convex optimization problem and from the coordinate-wise (coordinate descent) idea. Simulation results indicate that the method improves the accuracy of both estimation and variable selection on high-dimensional data sets, making it an effective feature extraction method for high-dimensional data mining.
Source: Journal of Ningxia University (Natural Science Edition) (《宁夏大学学报(自然科学版)》), CAS, 2012, No. 4, pp. 342-345, 349 (5 pages in total)
Funding: Natural Science Foundation of Jiangsu Province (SBK200920379); Natural Science Foundation of Nantong University (10Z008)
Keywords: data mining; high-dimensional data; variable selection; regularized estimation; LASSO (least absolute shrinkage and selection operator); coordinate-wise algorithm
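
The abstract names three ingredients: an L1-type regularized estimate, the KKT conditions of a convex problem, and a coordinate-wise algorithm. The record does not reproduce the paper's own estimator, so the sketch below swaps in the standard LASSO solved by cyclic coordinate descent, and it is not the authors' method. The function names, the simulated data, the value of `lam`, and the residual-based variance estimate are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_sweeps=2000, tol=1e-10):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float).copy()          # residual y - X @ beta (beta starts at 0)
    col_sq = (X ** 2).sum(axis=0) / n       # (1/n) * ||X_j||^2 for every column
    for _ in range(n_sweeps):
        max_step = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (j's own effect added back).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            step = new - beta[j]
            if step != 0.0:
                resid -= X[:, j] * step     # keep the residual in sync with beta
                beta[j] = new
                max_step = max(max_step, abs(step))
        if max_step < tol:                  # stop once a full sweep barely moves beta
            break
    return beta

def kkt_violation(X, y, beta, lam):
    """Largest violation of the LASSO KKT conditions.

    Active coordinates need (1/n) X_j^T r = lam * sign(beta_j);
    inactive coordinates need |(1/n) X_j^T r| <= lam.
    """
    n = X.shape[0]
    grad = X.T @ (y - X @ beta) / n
    active = beta != 0
    v_active = np.abs(grad[active] - lam * np.sign(beta[active]))
    v_inactive = np.maximum(np.abs(grad[~active]) - lam, 0.0)
    return float(max(v_active.max(initial=0.0), v_inactive.max(initial=0.0)))

# Hypothetical simulated data: n = 40 observations, p = 100 variables, 3 true signals.
rng = np.random.default_rng(0)
n, p = 40, 100
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.5 * rng.standard_normal(n)

lam = 0.3                                   # illustrative penalty level, not tuned
beta_hat = lasso_cd(X, y, lam)

# Simple residual-based noise estimate (an assumption, not the paper's estimator):
# residual sum of squares divided by n minus the number of selected variables.
df = int(np.count_nonzero(beta_hat))
rss = float(np.sum((y - X @ beta_hat) ** 2))
sigma2_hat = rss / max(n - df, 1)

print("selected variables:", np.flatnonzero(beta_hat))
print("KKT violation:", kkt_violation(X, y, beta_hat, lam))
print("noise variance estimate:", sigma2_hat)
```

Because the L1 penalty shrinks the coefficients, a plug-in variance estimate of this kind is generally biased, which is one reason a method that treats the noise level jointly with the regularization path, as the abstract describes, can be attractive.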

References (7)

  • 1 HASTIE T, TIBSHIRANI R, FRIEDMAN J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction (统计学习基础: 数据挖掘、推理与预测) [M]. Translated by FAN Ming, CHAI Yumei, ZAN Hongying. Beijing: Publishing House of Electronics Industry, 2004.
  • 2 BÜHLMANN P, VAN DE GEER S. Statistics for High-Dimensional Data: Methods, Theory and Applications [M]. Berlin: Springer-Verlag Berlin Heidelberg, 2011.
  • 3 LI Xin, QIAN Xu, WANG Ziqiang. An efficient high-dimensional outlier data mining algorithm (一种高效的高维异常数据挖掘算法) [J]. Computer Engineering (计算机工程), 2010, 36(21): 34-36.
  • 4 TIBSHIRANI R. Regression shrinkage and selection via the LASSO [J]. Journal of the Royal Statistical Society, Series B: Statistical Methodology, 1996, (01): 267-288.
  • 5 EFRON B, HASTIE T, JOHNSTONE I. Least angle regression [J]. Annals of Statistics, 2004: 407-489.
  • 6 ZOU Hui. The adaptive LASSO and its oracle properties [J]. Journal of the American Statistical Association, 2006, (476): 1418-1429. doi:10.1198/016214506000000735.
  • 7 FRIEDMAN J, HASTIE T, HÖFLING H. Pathwise coordinate optimization [J]. The Annals of Applied Statistics, 2007, (02): 302-332.

