Covariance matrix estimation is important in many fields, including statistics. In real applications, data are often high-dimensional and contaminated by noise, yet most relevant studies are based on complete data. This paper studies the optimal estimation of high-dimensional covariance matrices from missing and noisy samples under the norm. First, a model with sub-Gaussian additive noise is presented. The generalized sample covariance is then modified to define a hard thresholding estimator, and the minimax upper bound is derived. Next, the minimax lower bound is derived, which shows that the proposed estimator is rate-optimal. Finally, numerical simulations are performed. The results show that, for missing samples with sub-Gaussian noise, the hard thresholding estimator outperforms the traditional estimation method when the true covariance matrix is sparse.
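The idea of thresholding a covariance estimate built from incomplete data can be illustrated with a minimal sketch. It is not the paper's estimator: it assumes missing entries are marked with `None`, uses a pairwise-complete sample covariance as a stand-in for the "generalized sample covariance", keeps the diagonal untouched, and applies no correction for the additive noise. The names `generalized_sample_cov`, `hard_threshold`, and the threshold `lam` are hypothetical.

```python
from typing import List, Optional

Row = List[Optional[float]]


def generalized_sample_cov(data: List[Row]) -> List[List[float]]:
    """Pairwise-complete sample covariance; None marks a missing entry.

    Assumes every coordinate pair is jointly observed at least once.
    """
    p = len(data[0])
    # Mean of each coordinate over its observed entries only.
    means = []
    for j in range(p):
        observed = [row[j] for row in data if row[j] is not None]
        means.append(sum(observed) / len(observed))

    cov = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            # Use only the rows where both coordinates are observed.
            pairs = [(r[i], r[j]) for r in data
                     if r[i] is not None and r[j] is not None]
            s = sum((a - means[i]) * (b - means[j]) for a, b in pairs)
            cov[i][j] = cov[j][i] = s / len(pairs)
    return cov


def hard_threshold(cov: List[List[float]], lam: float) -> List[List[float]]:
    """Zero every off-diagonal entry whose magnitude is below lam."""
    p = len(cov)
    return [[cov[i][j] if i == j or abs(cov[i][j]) >= lam else 0.0
             for j in range(p)]
            for i in range(p)]


# Toy data: 5 observations of 3 variables, with missing entries.
data = [
    [1.0, 2.0, None],
    [2.0, None, 0.5],
    [3.0, 6.0, 0.4],
    [4.0, 8.0, None],
    [None, 4.0, 0.6],
]
sigma_hat = generalized_sample_cov(data)
sigma_thr = hard_threshold(sigma_hat, lam=0.5)  # lam chosen arbitrarily here
```

In this toy example the strong covariance between the first two variables survives thresholding, while the two weak off-diagonal entries are set to zero, which is the behavior that favors sparse true covariance matrices.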
This paper proposes a model averaging method for varying-coefficient models with responses missing at random, using a weight selection criterion based on cross-validation. Under certain regularity conditions, the proposed method is proved to be asymptotically optimal in the sense of achieving the minimum squared error.
Next Generation Sequencing (NGS) provides an effective basis for estimating the survival time of cancer patients, but it also produces data of very high dimension. In addition, some patients drop out of the study, which leaves responses missing. A method is therefore needed for estimating the mean of a response variable with missing values in ultra-high dimensional datasets. In this paper, we propose RF-SIS, a two-stage ultra-high dimensional variable screening method based on random forest regression, which addresses the difficulty of estimating missing values when the data dimension is excessive. After dimension reduction with RF-SIS, mean imputation is applied to the missing responses. Simulation results show that, compared with directly deleting missing observations, RF-SIS-MI has significant advantages in terms of the proportion of intervals covered, the average interval length, and the average absolute deviation.
Objective: To assess the diagnostic opportunities for bacilliferous pulmonary tuberculosis missed by optical microscopy compared with GeneXpert MTB/RIF between 2015 and 2019. Methods: This is a retrospective analysis of the diagnostic results for bacilliferous pulmonary tuberculosis in patients suspected of a first episode of pulmonary tuberculosis during the period. GeneXpert MTB/RIF (GeneXpert) and optical microscopy (OM) of a Ziehl-Neelsen stained smear were performed on each patient’s sputum or gastric tubing fluid sample. Results: Among 341 patients suspected of pulmonary tuberculosis, 229 (67%) were declared to have bacilliferous tuberculosis by the two tests combined: 220 by GeneXpert and 95 by OM, i.e. 64.5% versus 28% (p […]). OM missed 134 patients, i.e. 58.5% of the positive cases detected by the two tests (134/229 patients) and 39.3% of the patients suspected of tuberculosis (134/341 patients). On the other hand, among the 95 patients declared positive by OM, GeneXpert missed 9 (9.5%), i.e. 4% of all the positive cases detected by the two diagnostic tests (9/229 patients) and 3% of the patients suspected of tuberculosis (9/341 patients). The differences observed between the results of the two tests were statistically significant at the 5% threshold (p […]). Conclusion: This study reveals missed diagnostic opportunities for bacilliferous pulmonary mycobacteriosis that are significantly more frequent with optical microscopy than with GeneXpert. Combining GeneXpert and optical microscopy could contribute to strategies for the elimination of pulmonary tuberculosis in sub-Saharan Africa.
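The proportions reported in the Results can be re-derived from the stated counts. The short check below is only an arithmetic sketch: it assumes that the 229 positives are patients flagged by at least one of the two tests, so that the cases missed by each test are obtained by subtraction. The variable names and the helper `pct` are hypothetical.

```python
# Counts taken from the abstract.
suspected = 341           # patients suspected of pulmonary tuberculosis
positive_either = 229     # positive by GeneXpert and/or optical microscopy
positive_genexpert = 220
positive_om = 95

# Assumption: a case missed by one test is a case found only by the other.
missed_by_om = positive_either - positive_om                 # 134
missed_by_genexpert = positive_either - positive_genexpert   # 9


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


rates = {
    "positive, either test": pct(positive_either, suspected),
    "positive, GeneXpert": pct(positive_genexpert, suspected),
    "positive, OM": pct(positive_om, suspected),
    "missed by OM / positives": pct(missed_by_om, positive_either),
    "missed by OM / suspected": pct(missed_by_om, suspected),
    "missed by GeneXpert / OM positives": pct(missed_by_genexpert, positive_om),
    "missed by GeneXpert / positives": pct(missed_by_genexpert, positive_either),
    "missed by GeneXpert / suspected": pct(missed_by_genexpert, suspected),
}

for label, value in rates.items():
    print(f"{label}: {value}%")
```

Under that assumption the computed values agree with the abstract once rounded to the precision it reports (for example 27.9% is given as 28%, and 3.9% as 4%).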
In this paper, three smoothed empirical log-likelihood ratio functions are proposed for the parameters of nonlinear models with missing responses. Under some regularity conditions, the corresponding Wilks phenomena are established, so confidence regions for the parameters can be constructed easily.