Funding: Supported by the National Natural Science Foundation of China (No. 11801567).
Abstract: In this paper, we study sure independence screening for ultrahigh-dimensional censored data under a varying coefficient single-index model. This general model framework covers a large number of commonly used survival models. That the proposed method is not derived for a specific model is appealing in ultrahigh-dimensional regression, because it is difficult to specify a correct model for ultrahigh-dimensional predictors. Once the assumed data-generating process does not match the actual one, a screening method based on that model becomes problematic. We establish the sure screening property and the ranking consistency property of the proposed method. Simulations are conducted to study the finite-sample performance, and the results demonstrate that the proposed method is competitive with existing methods. We also illustrate the results through an analysis of data from the National Alzheimer's Coordinating Center (NACC).
Funding: Supported by the Research Grant Council of Hong Kong (Grant Nos. 509413 and 14311916), Direct Grants for Research of The Chinese University of Hong Kong (Grant Nos. 3132754 and 4053235), the Natural Science Foundation of Jiangxi Province (Grant No. 20161BAB201024), the Key Science Fund Project of Jiangxi Province Education Department (Grant No. GJJ150439), the National Natural Science Foundation of China (Grant Nos. 11461029, 11601197 and 61562030), and the Canadian Institutes of Health Research (Grant No. 145546).
Abstract: With the rapid growth in the size of scientific data across disciplines, feature screening plays an important role in reducing high dimensionality to a moderate scale. In this paper, we introduce a unified and robust model-free feature screening approach for high-dimensional survival data with censoring, which has several advantages. It is a model-free approach under a general model framework, and hence avoids the complication of specifying an actual model form with a huge number of candidate variables. Under mild conditions that do not require the existence of any moment of the response, it enjoys the ranking consistency and sure screening properties in ultrahigh dimensions. In particular, we impose a conditional independence assumption between the response and the censoring variable given each covariate, instead of assuming that the censoring variable is independent of the response and the covariates. Moreover, we propose a more robust variant of the new procedure, which possesses desirable theoretical properties without any finite moment condition on the predictors or the response. The proposed methods do not require any complicated numerical optimization and are fast and easy to implement. Extensive numerical studies demonstrate that the proposed methods perform competitively under various configurations. An application is illustrated with an analysis of a genetic data set.
Funding: Supported by the Graduate Innovation Foundation of Shanghai University of Finance and Economics of China (Grant Nos. CXJJ-2014-459 and CXJJ-2015-430), the National Natural Science Foundation of China (Grant No. 71271128), the State Key Program of National Natural Science Foundation of China (Grant No. 71331006), the State Key Program in the Major Research Plan of National Natural Science Foundation of China (Grant No. 91546202), the National Center for Mathematics and Interdisciplinary Sciences, Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences (Grant No. 2008DP173182), and the Innovative Research Team in Shanghai University of Finance and Economics (Grant No. IRT13077).
Abstract: The curse of high dimensionality arises more and more frequently in statistics. Many techniques have been developed to address this challenge for classification problems. We propose a novel feature screening procedure for dichotomous response data. The new method can be implemented as easily as the t-test marginal screening approach, and it is free of any subexponential tail probability conditions and moment requirements and is not restricted to a specific model structure. We prove that our method possesses the sure screening property, illustrate its screening performance by Monte Carlo simulation, and apply it to a real data example.
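For orientation, the t-test marginal screening baseline that this abstract uses as its point of comparison can be sketched as follows. This is not the procedure proposed in the paper, whose statistic the abstract does not specify. The function name `t_screen`, the use of the Welch form of the two-sample statistic, the simulated data, and the cutoff `d` are all assumptions made for illustration.

```python
import numpy as np

def t_screen(X, y, d):
    """Rank features by |Welch two-sample t-statistic| and keep the top d.

    X: (n, p) array of predictors; y: length-n array of 0/1 labels.
    Returns the indices of the d highest-ranked features and all |t| values.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    g0, g1 = X[y == 0], X[y == 1]
    n0, n1 = g0.shape[0], g1.shape[0]
    # Per-feature difference in class means
    mean_diff = g1.mean(axis=0) - g0.mean(axis=0)
    # Per-feature standard error with unequal class variances (Welch)
    se = np.sqrt(g1.var(axis=0, ddof=1) / n1 + g0.var(axis=0, ddof=1) / n0)
    t_abs = np.abs(mean_diff / se)
    order = np.argsort(-t_abs)  # indices sorted by decreasing |t|
    return order[:d], t_abs

# Hypothetical simulated data: p far exceeds n, only three features carry signal.
rng = np.random.default_rng(0)
n, p = 200, 1000
y = rng.integers(0, 2, size=n)
X = rng.standard_normal((n, p))
X[:, :3] += 1.5 * y[:, None]  # shift features 0, 1, 2 in class 1

selected, t_abs = t_screen(X, y, d=10)
```

Each feature is scored on its own, so the cost is a single pass over the columns with no model fitting. The t-statistic relies on finite second moments, which is the kind of requirement the abstract says the proposed procedure avoids.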