Abstract
This paper proposes a new quantile feature screening method for ultra-high dimensional longitudinal data, based on the modified Cholesky decomposition. First, optimal quantile estimating equations are constructed to cope with potential outliers and heavy-tailed distributions. Second, the covariance matrix in these estimating equations is modeled through the modified Cholesky decomposition, which leads to an iterative feature screening algorithm. Under some regularity conditions, asymptotic properties of the screening method are established, including screening consistency and ranking consistency. Simulation studies and an analysis of the yeast cell-cycle gene expression dataset show that the proposed method not only screens out the important covariates quickly but also achieves higher screening accuracy.
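The abstract's key modeling device, the modified Cholesky decomposition, can be illustrated numerically. The sketch below is not the authors' algorithm; it only shows the standard factorization T Σ Tᵀ = D, with T unit lower triangular and D diagonal, for a within-subject covariance matrix Σ. The function name `modified_cholesky` and the AR(1) example matrix are assumptions made for illustration.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, D) with T unit lower triangular and D diagonal
    such that T @ sigma @ T.T equals D (illustrative helper)."""
    sigma = np.asarray(sigma, dtype=float)
    L = np.linalg.cholesky(sigma)   # sigma = L @ L.T, L lower triangular
    d = np.diag(L)                  # positive diagonal entries of L
    C = L / d                       # divide column j by d[j]: unit lower triangular
    T = np.linalg.inv(C)            # T = C^{-1}, also unit lower triangular
    D = np.diag(d ** 2)             # innovation variances on the diagonal
    return T, D

# Assumed example: AR(1) covariance with correlation 0.5 over 4 time points.
rho, m = 0.5, 4
idx = np.arange(m)
sigma = rho ** np.abs(idx[:, None] - idx[None, :])

T, D = modified_cholesky(sigma)
print(np.round(T, 3))
print(np.round(np.diag(D), 3))
```

In the usual reading of this factorization, the below-diagonal entries of T are the negatives of the coefficients obtained by regressing each measurement error on its predecessors, and the diagonal of D holds the corresponding innovation variances. Both sets of parameters are unconstrained (after taking logs of the variances), which is what makes the decomposition convenient for covariance modeling.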
Authors
Xin Yue CHEN (陈欣悦)
Jing LV (吕晶)
School of Mathematics and Statistics, Southwest University, Chongqing 400715, P. R. China
Source
Acta Mathematica Sinica, Chinese Series (《数学学报(中文版)》)
CSCD
PKU Core Journals (北大核心)
2024, No. 6, pp. 1091-1118 (28 pages)
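Funding
Natural Science Foundation of Chongqing (cstc2021jcyj-msxmX0502, CSTB2022NSCQ-MSX0852)
Fundamental Research Funds for the Central Universities (SWU-KU24002)
National Statistical Science Research Project (2022LY019)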
Keywords
longitudinal data
ultra-high dimensionality
feature screening
modified Cholesky decomposition
quantile regression