Abstract
The analysis of censored survival data is an important part of high-dimensional sparse regression. Much of the existing theoretical and applied work, however, assumes clean data, whereas in practice the data are often corrupted by missing values or measurement error, so methods for such data are of greater practical use. In the literature on high-dimensional survival analysis, variable selection in the presence of measurement error has received relatively little attention. Against this background, we propose a method for variable selection in the high-dimensional additive hazards model with error-in-variable data that combines the pseudo-score function with the nearest positive semi-definite projection. Simulation studies and a real data analysis show that the method performs well and successfully identifies the nonzero coefficients.
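The nearest positive semi-definite projection named in the abstract can be illustrated with a small sketch. The abstract does not say which norm the projection uses or which matrix it is applied to, so everything below is an assumption made for illustration: the projection is taken in the Frobenius norm (by clipping negative eigenvalues), the corrected matrix is a plain Gram matrix rather than the additive hazards quantity, and the measurement-error covariance `sigma_u` is treated as known. The function name `nearest_psd` is hypothetical.

```python
import numpy as np


def nearest_psd(matrix, floor=0.0):
    """Frobenius-norm projection of a symmetric matrix onto the PSD cone.

    Assumption: the paper may use a different norm; this is only a sketch.
    """
    sym = (matrix + matrix.T) / 2.0          # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(sym)   # eigen-decomposition of a symmetric matrix
    clipped = np.clip(eigvals, floor, None)  # raise negative eigenvalues to the floor
    return (eigvecs * clipped) @ eigvecs.T   # rebuild the matrix from clipped eigenvalues


# Toy setting (all values hypothetical): p > n, additive error W = X + U.
rng = np.random.default_rng(0)
n, p = 50, 100
x_true = rng.standard_normal((n, p))
sigma_u = 0.25 * np.eye(p)                            # assumed known error covariance
w_obs = x_true + 0.5 * rng.standard_normal((n, p))    # error with covariance 0.25 * I

# Subtracting sigma_u removes the bias from measurement error,
# but the corrected matrix is no longer guaranteed to be PSD.
gram_corrected = w_obs.T @ w_obs / n - sigma_u
gram_psd = nearest_psd(gram_corrected)

min_before = np.linalg.eigvalsh(gram_corrected).min()
min_after = np.linalg.eigvalsh(gram_psd).min()
print(min_before < 0, min_after > -1e-8)
```

Because `p > n` here, the uncorrected Gram matrix is rank-deficient, so subtracting `sigma_u` necessarily produces negative eigenvalues; the projection restores a matrix that keeps a penalized quadratic objective convex.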
Authors
张家睿
吴耀华
ZHANG Jiarui; WU Yaohua (School of Management, University of Science and Technology of China, Hefei 230026, China; Zhejiang Institute of Research and Innovation, University of Hong Kong, Hangzhou 310000, China)
Source
《中国科学院大学学报(中英文)》
CSCD
Peking University Core Journals (北大核心)
2023, Issue 1, pp. 12-20 (9 pages)
Journal of University of Chinese Academy of Sciences
Funding
Supported by the National Natural Science Foundation of China (72071187, 11671374, 71731010, 71921001).
Keywords
variable selection
high-dimensional
additive hazards model
measurement error (error-in-variable data)