Abstract
Variable selection is a key step in statistical modeling: a well-chosen set of variables yields a robust model that is simple in structure, clear in interpretation, and accurate in prediction. In practice, some predictors have a group structure. This paper reviews three classes of penalized group variable selection methods: methods for handling highly correlated variables, methods that select only whole groups (group-level selection), and methods that select both groups and individual variables within them (bi-level selection). We compare their statistical properties, advantages, and disadvantages, and summarize the associated algorithms and approaches to tuning parameter selection. Finally, we survey applications and discuss recent developments and open challenges.
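The difference between group-level and bi-level selection can be illustrated with the thresholding operators commonly associated with group-lasso-type and sparse-group-lasso-type penalties. The sketch below is only an illustration under stated assumptions: the abstract does not name specific penalties, and the function names, threshold values, and coefficients are hypothetical.

```python
import math

def soft_threshold(z, t):
    """Elementwise soft-thresholding: the proximal operator of t * |z|."""
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in z]

def group_soft_threshold(z, t):
    """Shrink a whole group at once: the proximal operator of t * ||z||_2."""
    norm = math.sqrt(sum(v * v for v in z))
    if norm <= t:
        # The entire group is removed from the model.
        return [0.0] * len(z)
    scale = 1.0 - t / norm
    return [scale * v for v in z]

def bi_level_threshold(z, t_within, t_group):
    """Elementwise shrinkage followed by group shrinkage (sparse-group-lasso style)."""
    return group_soft_threshold(soft_threshold(z, t_within), t_group)

# Hypothetical coefficients split into two groups (illustrative values only).
groups = {"g1": [3.0, 4.0], "g2": [0.3, -0.4]}

# Group-level selection: a group is kept or dropped as a whole.
group_level = {name: group_soft_threshold(z, 1.0) for name, z in groups.items()}

# Bi-level selection: individual coefficients inside a kept group can also be zeroed.
bi_level = {name: bi_level_threshold(z, 3.5, 0.1) for name, z in groups.items()}
```

With these assumed values, group-level thresholding keeps both coefficients of `g1` and drops `g2` entirely, while bi-level thresholding additionally zeroes the first coefficient inside `g1`.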
Source
Journal of Applied Statistics and Management (《数理统计与管理》)
Indexed in: CSSCI; Peking University Core Journals (北大核心)
2015, No. 6, pp. 978-988 (11 pages)
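Funding
National Social Science Fund of China (13&ZD148, 13CTJ001)
National Natural Science Foundation of China (71471152)
National Bureau of Statistics of China project (2013LZ53, 2012LD001)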
Keywords
group variables
variable selection
high-dimensional data
penalty function