Funding: Supported by the National Natural Science Foundation of China (Nos. 10901162, 10926073), the China Postdoctoral Science Foundation, the President Fund of GUCAS, and the foundation of the Key Laboratory of Random Complex Structures and Data Science, CAS; also supported by a research grant from the Research Committee, The Hong Kong Polytechnic University.
Abstract: In this paper, we investigate the model checking problem for a general linear model with nonignorable missing covariates. We show that, without any parametric model assumption for the response probability, the least squares method yields consistent estimators for the linear model even if only the complete cases are used. This makes it feasible to propose two testing procedures for the corresponding model checking problem: a score-type lack-of-fit test and a test based on the empirical process. The asymptotic properties of the test statistics are investigated. Both tests are shown to have asymptotic power 1 for local alternatives converging to the null at the rate n^{-r}, 0 ≤ r < 1/2. Simulation results show that both tests perform satisfactorily.
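The complete-case consistency claim can be illustrated with a small simulation. This is only a sketch under assumptions not stated in the abstract: a single covariate, a logistic response probability that depends on the possibly missing covariate itself (hence nonignorable) but not on the response given the covariate, and illustrative coefficient values. The function name `complete_case_ols` is hypothetical.

```python
import numpy as np

def complete_case_ols(n=20000, beta=(1.0, 2.0), seed=0):
    """Fit least squares on complete cases only, under assumed nonignorable
    covariate missingness. Returns (coefficients, observed fraction)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = beta[0] + beta[1] * x + rng.normal(size=n)
    # Assumption: the probability of observing x depends on x itself
    # (nonignorable), but not on y once x is given.
    p_obs = 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * x)))
    observed = rng.random(n) < p_obs
    # Least squares using only the rows where x is observed.
    design = np.column_stack([np.ones(observed.sum()), x[observed]])
    coef, *_ = np.linalg.lstsq(design, y[observed], rcond=None)
    return coef, observed.mean()

coef, frac_observed = complete_case_ols()
print("complete-case estimate:", coef, "observed fraction:", frac_observed)
```

In this setup the conditional mean of the response given the covariate is unchanged among the complete cases, which is why the estimates stay close to the assumed true values (1, 2) even though the observed covariates are a selected sample.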
Abstract: Feature screening with missing data is a critical problem that has not been well addressed in the literature. In this discussion, we propose a new screening index based on "information value" and apply it to feature screening with missing covariates.
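The abstract does not define its screening index, so the following is only a sketch of the classical information value for a binary response, IV = Σ_j (p_j − q_j) ln(p_j / q_j), where p_j and q_j are the shares of events and non-events falling in bin j. Treating missing values as a bin of their own, the quantile binning, and the smoothing constant `eps` are all assumptions made for illustration and may differ from the index proposed in the discussion.

```python
import numpy as np

def information_value(x, y, n_bins=5, eps=0.5):
    """Classical information value of feature x for binary y,
    with missing values (NaN) placed in a separate bin (an assumption)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=int)
    missing = np.isnan(x)
    # Interior quantile edges computed from the observed values only.
    edges = np.quantile(x[~missing], np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.full(x.shape, n_bins, dtype=int)  # index n_bins = missing bin
    bins[~missing] = np.searchsorted(edges, x[~missing], side="right")
    # Smoothed counts per bin so that empty bins do not produce log(0).
    events = np.bincount(bins[y == 1], minlength=n_bins + 1) + eps
    non_events = np.bincount(bins[y == 0], minlength=n_bins + 1) + eps
    p = events / events.sum()
    q = non_events / non_events.sum()
    return float(np.sum((p - q) * np.log(p / q)))

# Illustrative screening: rank two features, one informative and one pure noise.
rng = np.random.default_rng(0)
n = 5000
x_signal = rng.normal(size=n)
x_noise = rng.normal(size=n)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * x_signal))).astype(int)
x_signal[rng.random(n) < 0.3] = np.nan  # 30% of values set to missing
x_noise[rng.random(n) < 0.3] = np.nan
print(information_value(x_signal, y), information_value(x_noise, y))
```

Features would then be ranked by this value and the top-ranked ones retained; the cutoff is a separate choice not covered here.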
Funding: Funded by the National Natural Science Foundation of China (NSFC No. 11771241) and the Natural Science Foundation of Anhui Province (No. 1708085QA14).
Abstract: Relative-risk models are often used to characterize the relationship between survival time and time-dependent covariates. When the covariates are observed, estimation and asymptotic theory for the parameters of interest are available; challenges remain when missingness occurs. A popular approach is to jointly model the survival data and the longitudinal data. This seems efficient, since it makes use of more information, but rigorous theoretical study has long been neglected. For both additive risk models and relative-risk models, we treat the missing data as nonignorable. Under general regularity conditions, we prove asymptotic normality for the nonparametric maximum likelihood estimators.
Funding: This research was supported by the Key Projects of Philosophy and Social Science in Beijing (15ZDA47), the National Natural Science Foundation of China (Grant Nos. 11571340, 11971045), the Beijing Natural Science Foundation (1202001), and the Open Project of the Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences.
Abstract: In recent years, there has been a large amount of literature on missing data. Most of it focuses on situations where missingness occurs only in the response or only in the covariates. In this paper, we consider the adequacy check for the linear regression model with the response and covariates missing simultaneously. We apply model adjustment and inverse probability weighting methods to deal with the missingness of the response and the covariates, respectively. To avoid the curse of dimensionality, we propose an empirical process test with a linear indicator weighting function. The asymptotic properties of the proposed test under the null, local, and global alternative hypothetical models are rigorously investigated. A consistent wild bootstrap method is developed to approximate the critical value. Finally, simulation studies and a real data analysis show that the proposed method performs well.
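The general idea of an empirical process test calibrated by a wild bootstrap can be sketched as follows. This is a simplified illustration and not the procedure of the paper: it assumes fully observed data (no model adjustment or inverse probability weighting), uses the fitted linear index as the argument of the indicator weight, a Cramér-von Mises functional, and Rademacher multipliers. The function names `cvm_stat` and `wild_bootstrap_test` are hypothetical.

```python
import numpy as np

def cvm_stat(resid, index):
    """Cramer-von Mises functional of the residual-marked process
    R_n(t) = n^{-1/2} * sum_i resid_i * 1{index_i <= t}, evaluated at the data."""
    order = np.argsort(index)
    process = np.cumsum(resid[order]) / np.sqrt(len(resid))
    return float(np.mean(process ** 2))

def wild_bootstrap_test(x, y, n_boot=500, seed=0):
    """Check a linear model fit; returns (statistic, bootstrap p-value)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    design = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ beta
    resid = y - fitted
    stat = cvm_stat(resid, fitted)  # indicator weight uses the linear index
    boot = np.empty(n_boot)
    for b in range(n_boot):
        v = rng.choice([-1.0, 1.0], size=n)  # Rademacher multipliers
        y_star = fitted + resid * v          # bootstrap sample under the null
        beta_star, *_ = np.linalg.lstsq(design, y_star, rcond=None)
        boot[b] = cvm_stat(y_star - design @ beta_star, fitted)
    return stat, float(np.mean(boot >= stat))

# Illustrative use: a linear truth and a quadratic departure from it.
rng = np.random.default_rng(42)
x = rng.normal(size=400)
y_null = 1.0 + x + rng.normal(size=400)
y_alt = 1.0 + x + 2.0 * x ** 2 + rng.normal(size=400)
print("linear truth:", wild_bootstrap_test(x, y_null))
print("quadratic truth:", wild_bootstrap_test(x, y_alt))
```

Projecting the covariates onto a single linear index keeps the indicator one-dimensional, which is the usual motivation for this kind of weighting when the covariate dimension is large.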
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11771133 and 11401194) and the Natural Science Foundation of Hunan Province of China (Grant No. 2017JJ3021). Zhao’s work was supported by the National Natural Science Foundation of China (Grant No. 11771366). Zhou’s work was supported by the State Key Program of the National Natural Science Foundation of China (Grant No. 71331006) and the State Key Program in the Major Research Plan of the National Natural Science Foundation of China (Grant No. 91546202).
Abstract: This paper presents a novel class of semiparametric estimating functions for the additive model with right-censored data obtained from general biased sampling. The new estimator can be obtained from a weighted estimating equation for the covariate coefficients by embedding the biased-sampling data into left-truncated and right-censored data. The asymptotic properties (consistency and asymptotic normality) of the proposed estimator are derived via modern empirical process theory. Based on the cumulative residual processes, we also propose graphical and numerical methods to assess the adequacy of the additive risk model. The good finite-sample performance of the proposed estimator is demonstrated by simulation studies and two applications to real datasets.