
Variable screening for ultra-high dimensional linear models with missing data
Abstract We study sure independence screening for ultra-high dimensional linear models when the response variable is missing at random. First, a logistic regression model is fitted to the missingness indicator of the response and the corresponding covariates to estimate the missingness probability of the response. Second, a utility function based on inverse probability weighted least squares is constructed and used to reduce the covariate dimension to a lower level. Finally, inverse probability weighted least squares with a LASSO penalty is applied to the retained covariates for a more refined selection, completing the dimension reduction. Numerical simulations and a real-data example indicate that the proposed screening method performs well in finite samples.
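The three steps in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact utility function, so the sketch ranks covariates by the magnitude of the marginal inverse-probability-weighted least-squares slope. It also assumes that missingness depends only on the first two covariates, that the screening step keeps n / log(n) covariates, and that the LASSO penalty level is 0.3. All function names (`observation_prob`, `ipw_screen`, `ipw_lasso`) and the simulated data are hypothetical.

```python
import numpy as np

def observation_prob(Z, delta, n_iter=25):
    """Step 1: logistic regression of the observation indicator on Z (Newton steps)."""
    Z1 = np.column_stack([np.ones(len(Z)), Z])            # add an intercept
    g = np.zeros(Z1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z1 @ g))
        H = Z1.T @ (Z1 * (p * (1 - p))[:, None]) + 1e-8 * np.eye(Z1.shape[1])
        g += np.linalg.solve(H, Z1.T @ (delta - p))
    p = 1.0 / (1.0 + np.exp(-Z1 @ g))
    return np.clip(p, 0.05, 1.0)                          # guard against huge weights

def ipw_screen(X, y, w, d):
    """Step 2: rank covariates by |marginal weighted LS slope| and keep the top d."""
    slope = (w[:, None] * X * y[:, None]).sum(0) / (w[:, None] * X**2).sum(0)
    return np.argsort(-np.abs(slope))[:d]

def ipw_lasso(X, y, w, lam, n_iter=200):
    """Step 3: coordinate descent for (1/2n) sum w_i (y_i - x_i'b)^2 + lam * ||b||_1."""
    n, d = X.shape
    b = np.zeros(d)
    r = y.copy()                                          # current residual y - X b
    scale = (w[:, None] * X**2).sum(0) / n
    for _ in range(n_iter):
        for j in range(d):
            r += X[:, j] * b[j]                           # drop coordinate j from the fit
            rho = (w * X[:, j] * r).sum() / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / scale[j]
            r -= X[:, j] * b[j]
    return b

# Simulated data (assumption): 3 active covariates out of 1000, n = 200.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -3.0, 3.0]
y_full = X @ beta + rng.standard_normal(n)
true_pi = 1.0 / (1.0 + np.exp(-(1.0 + 0.5 * X[:, 0] - 0.5 * X[:, 1])))
delta = (rng.random(n) < true_pi).astype(float)           # 1 = response observed
y = np.where(delta == 1, y_full, 0.0)                     # missing responses get weight 0

w = delta / observation_prob(X[:, :2], delta)             # inverse probability weights
Xs = (X - X.mean(0)) / X.std(0)                           # standardize covariates
yc = np.where(delta == 1, y - (w * y).sum() / w.sum(), 0.0)  # weighted centering

kept = ipw_screen(Xs, yc, w, d=int(n / np.log(n)))        # coarse screening
coef = ipw_lasso(Xs[:, kept], yc, w, lam=0.3)             # refined selection
selected = kept[coef != 0]
print(len(kept), sorted(selected.tolist()))
```

Rows with a missing response receive weight zero, so they drop out of every weighted sum, while observed rows are up-weighted by the inverse of their estimated observation probability. In practice the penalty level would normally be chosen by cross-validation or an information criterion rather than fixed in advance.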
Authors He Jiayu; Li Jianbo; Zhou Qingyan; Yao Jun'e; Wang Xiuping (School of Mathematics & Statistics, Jiangsu Normal University, Xuzhou 221116, Jiangsu, China; School of Statistics, East China Normal University, Shanghai 200062, China; Department of Gynaecology & Obstetrics, People's Hospital of Dongchangfu District, Liaocheng 252002, Shandong, China; School of Pharmaceutical Sciences, Liaocheng University, Liaocheng 252059, Shandong, China)
Source Journal of Jiangsu Normal University: Natural Science Edition (CAS), 2020, No. 1, pp. 52-56 (5 pages)
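Funding National Natural Science Foundation of China, General Program (11571148); Priority Academic Program Development of Jiangsu Higher Education Institutions; Jiangsu Province "Six Talent Peaks" High-Level Talent Project (RJFW-038); Jiangsu Province "Qinglan Project" support program for young and middle-aged academic leaders; project of the Key Laboratory of Advanced Theory and Application in Statistics and Data Science (East China Normal University), Ministry of Education; Jiangsu Normal University undergraduate education and teaching research project (JYKTZ201907).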
Keywords variable screening; ultra-high dimensional linear model; missing data; inverse probability weighting; LASSO