Abstract
This paper studies sure independence screening for ultra-high dimensional linear models when the response variable is missing at random. First, a logistic regression model is fitted to the missingness indicator of the response and the corresponding covariates to estimate the missing probability of the response. Then, a utility function based on inverse probability weighted least squares is constructed and used to reduce the covariate dimension to a lower level. Finally, inverse probability weighted least squares with a LASSO penalty is applied to the retained covariates for a more refined selection, completing the reduction from ultra-high dimension. Numerical simulations and a real data example show that the proposed variable screening method performs well in finite samples.
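The three steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names (`fit_logistic`, `ipw_screen`, `ipw_lasso`), the simulated data, the screening size n/log(n), the penalty level `lam=0.3`, and the assumption that missingness depends on a single known covariate are all hypothetical choices made for illustration only.

```python
import numpy as np

def fit_logistic(Z, delta, n_iter=25, ridge=1e-6):
    """Newton-Raphson fit of P(delta=1 | Z); returns fitted observation probabilities."""
    A = np.column_stack([np.ones(len(Z)), Z])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ beta))
        grad = A.T @ (delta - p)
        # tiny ridge term keeps the Hessian invertible
        hess = (A * (p * (1 - p))[:, None]).T @ A + ridge * np.eye(A.shape[1])
        beta += np.linalg.solve(hess, grad)
    return np.clip(1.0 / (1.0 + np.exp(-A @ beta)), 1e-3, 1.0)

def ipw_screen(X, y, delta, pi_hat, d):
    """Keep the d covariates with the largest |IPW marginal least-squares slope|."""
    w = delta / pi_hat                      # weight 0 for missing responses
    y0 = np.where(delta == 1, y, 0.0)       # missing y values never contribute
    ybar = (w * y0).sum() / w.sum()
    Xc = X - (w[:, None] * X).sum(0) / w.sum()
    slope = (Xc.T @ (w * (y0 - ybar))) / (w[:, None] * Xc**2).sum(0)
    return np.argsort(-np.abs(slope))[:d]

def ipw_lasso(X, y, delta, pi_hat, lam, n_iter=200):
    """Coordinate descent for (1/2n) * sum w_i (y_i - b0 - x_i'b)^2 + lam * ||b||_1."""
    w = delta / pi_hat
    y0 = np.where(delta == 1, y, 0.0)
    n, p = X.shape
    b0, b = 0.0, np.zeros(p)
    r = y0 - b0 - X @ b                     # current residuals
    for _ in range(n_iter):
        b0_new = (w * (r + b0)).sum() / w.sum()
        r += b0 - b0_new
        b0 = b0_new
        for j in range(p):
            xj = X[:, j]
            rho = (w * xj * (r + xj * b[j])).sum() / n
            z = (w * xj**2).sum() / n
            bj = np.sign(rho) * max(abs(rho) - lam, 0.0) / z   # soft-thresholding
            r += xj * (b[j] - bj)
            b[j] = bj
    return b0, b

# Simulated example (all settings are illustrative assumptions).
rng = np.random.default_rng(0)
n, p = 300, 500
X = rng.standard_normal((n, p))
X = (X - X.mean(0)) / X.std(0)              # standardized covariates
y = 3 * X[:, 0] + 3 * X[:, 1] - 3 * X[:, 2] + rng.standard_normal(n)
pi_true = 1.0 / (1.0 + np.exp(-(1.0 + 0.5 * X[:, 0])))
delta = (rng.random(n) < pi_true).astype(float)   # 1 = response observed
y_obs = np.where(delta == 1, y, np.nan)

pi_hat = fit_logistic(X[:, [0]], delta)                         # step 1
keep = ipw_screen(X, y_obs, delta, pi_hat, int(n / np.log(n)))  # step 2
b0, b = ipw_lasso(X[:, keep], y_obs, delta, pi_hat, lam=0.3)    # step 3
selected = keep[b != 0]
print(sorted(selected.tolist()))
```

In this sketch the weight for each subject is the missingness indicator divided by the estimated probability of observing the response, so subjects with missing responses drop out of both the screening utility and the penalized fit, while observed subjects are up-weighted to compensate.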
Authors
贺佳钰
李建波
周庆燕
姚军娥
王秀平
He Jiayu; Li Jianbo; Zhou Qingyan; Yao Jun'e; Wang Xiuping (School of Mathematics & Statistics, Jiangsu Normal University, Xuzhou 221116, Jiangsu, China; School of Statistics, East China Normal University, Shanghai 200062, China; Department of Gynaecology & Obstetrics, People's Hospital of Dongchangfu District, Liaocheng 252002, Shandong, China; School of Pharmaceutical Sciences, Liaocheng University, Liaocheng 252059, Shandong, China)
Source
《江苏师范大学学报(自然科学版)》
CAS
2020, No. 1, pp. 52-56 (5 pages)
Journal of Jiangsu Normal University:Natural Science Edition
Funding
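National Natural Science Foundation of China, General Program (11571148)
Priority Academic Program Development of Jiangsu Higher Education Institutions; Jiangsu Province "Six Talent Peaks" High-Level Talent Project (RJFW-038)
Jiangsu Province "Qing Lan Project" Support Program for Young and Middle-Aged Academic Leaders; Project of the Key Laboratory of Advanced Theory and Application in Statistics and Data Science (East China Normal University), Ministry of Education
Jiangsu Normal University Undergraduate Education and Teaching Research Project (JYKTZ201907).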
Keywords
variable screening
ultra-high dimensional linear model
missing data
inverse probability weighting
LASSO