Abstract
To address variable selection for microarray data in high-dimensional linear models, especially when the number of predictors far exceeds the sample size, a multi-stage variable selection algorithm is proposed. The algorithm builds on the thresholded elastic net regularization method and combines it with step-by-step multiple hypothesis testing, so that variables are selected over several stages while model sparsity and prediction accuracy are preserved. Results on simulated data and an empirical study show that the algorithm has good finite-sample performance: it recovers the true model while maintaining prediction accuracy, and it significantly reduces the number of false-positive variables.
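The abstract names the ingredients (an elastic net fit, thresholding, and stepwise multiple testing) but not their exact form, so the following is only a minimal sketch of how such a pipeline might look, not the authors' algorithm. Everything specific here is an assumption made for illustration: the coordinate-descent solver, the penalty parametrization (`lam`, `alpha`), the hard threshold of 0.5, the normal-approximation p-values, the Bonferroni-style backward elimination, and the simulated data.

```python
import math
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    return math.copysign(max(abs(z) - t, 0.0), z)

def elastic_net_cd(X, y, lam, alpha, n_sweeps=200):
    """Coordinate descent for (assumed objective, no intercept)
    (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1-alpha)/2 * ||b||_2^2)."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float).copy()          # resid = y - X @ beta
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of column j with the partial residual
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            if new != beta[j]:
                resid -= X[:, j] * (new - beta[j])
                beta[j] = new
    return beta

def stepwise_test(X, y, keep, level=0.05):
    """Hypothetical testing stage: refit OLS on the kept columns and drop the
    least significant one until all pass a Bonferroni-adjusted level.
    Assumes fewer kept columns than samples; p-values use a normal approximation."""
    keep = [int(j) for j in keep]
    n = X.shape[0]
    while keep:
        Xs = X[:, keep]
        coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        sigma2 = np.sum((y - Xs @ coef) ** 2) / (n - len(keep))
        se = np.sqrt(sigma2 * np.diag(np.linalg.inv(Xs.T @ Xs)))
        pvals = np.array([math.erfc(abs(z) / math.sqrt(2)) for z in coef / se])
        worst = int(np.argmax(pvals))
        if pvals[worst] <= level / len(keep):
            break
        keep.pop(worst)
    return keep

# Simulated p >> n example (illustrative data, not from the paper)
rng = np.random.default_rng(0)
n, p, k = 60, 200, 5
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:k] = 3.0
y = X @ beta_true + 0.5 * rng.standard_normal(n)

beta_en = elastic_net_cd(X, y, lam=0.3, alpha=0.9)   # stage 1: elastic net
candidates = np.flatnonzero(np.abs(beta_en) > 0.5)    # stage 2: hard threshold
selected = stepwise_test(X, y, candidates)            # stage 3: stepwise testing
print(selected)
```

In this sketch the threshold removes small coefficients that the elastic net leaves nonzero, and the testing stage then screens the survivors on an unpenalized refit; both tuning values would normally be chosen by cross-validation or a criterion the paper itself specifies.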
Authors
CHEN Mei-qi (陈美岐)
ZHANG Qi (张齐)
School of Mathematics and Statistics, Qingdao University, Qingdao 266071, China
Source
Journal of Qingdao University (Natural Science Edition) (《青岛大学学报(自然科学版)》)
CAS
2024, No. 3, pp. 9-14 (6 pages)
Funding
Supported by the General Program of the National Social Science Fund of China (Grant No. 21BTJ045).
Keywords
high-dimensional regression
variable selection
sparse regression
thresholding elastic net
multiple hypothesis testing