Funding: Funding of Jiangsu Innovation Program for Graduate Education (CXZZ11_0193); NUAA Research Funding (NJ2010009)
Abstract: An improved method using kernel density estimation (KDE) and confidence level is presented for model validation with small samples. Decision making is challenging because of input uncertainty and because only small samples are available, owing to the high cost of experimental measurements. Model validation, however, gives decision makers more confidence while also improving prediction accuracy. The confidence level method is introduced, and the optimum sample variance is determined using a new method in kernel density estimation to increase the credibility of model validation. As a numerical example, the static frame model validation challenge problem presented by Sandia National Laboratories has been chosen. The optimum bandwidth is selected in kernel density estimation in order to build the probability model from the calibration data. Based on this probability model, the model is assessed using the validation and accreditation experimental data, respectively. Finally, the target structure prediction is performed using the validated model, and the results are consistent with those obtained by other researchers. The results demonstrate that the method using the improved confidence level and kernel density estimation is an effective approach to the model validation problem with small samples.
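As an illustration of how a bandwidth choice turns a small calibration sample into a probability model, the following is a minimal sketch of a Gaussian KDE. It does not reproduce the bandwidth selection method of the paper: Silverman's rule of thumb is swapped in as a common stand-in, and the `calibration` values and function names are hypothetical.

```python
import numpy as np

def silverman_bandwidth(samples):
    """Silverman's rule-of-thumb bandwidth (a stand-in, not the paper's selector)."""
    x = np.asarray(samples, dtype=float)
    sigma = x.std(ddof=1)
    q75, q25 = np.percentile(x, [75, 25])
    iqr = q75 - q25
    # Use the more robust of the two spread measures when the IQR is usable.
    spread = min(sigma, iqr / 1.34) if iqr > 0 else sigma
    return 0.9 * spread * x.size ** (-0.2)

def gaussian_kde(samples, grid, bandwidth):
    """Evaluate f(g) = 1/(n*h) * sum_i K((g - x_i)/h) with a Gaussian kernel K."""
    x = np.asarray(samples, dtype=float)
    g = np.asarray(grid, dtype=float)
    u = (g[:, None] - x[None, :]) / bandwidth
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (x.size * bandwidth)

# Hypothetical small calibration sample (illustrative values only).
calibration = np.array([9.8, 10.4, 10.1, 9.5, 10.9, 10.2, 9.9, 10.6])
h = silverman_bandwidth(calibration)
grid = np.linspace(calibration.min() - 5 * h, calibration.max() + 5 * h, 2001)
density = gaussian_kde(calibration, grid, h)

# Trapezoidal check that the estimated density integrates to roughly one.
area = np.sum((density[1:] + density[:-1]) * np.diff(grid)) / 2.0
```

With so few samples the estimate is sensitive to `h`, which is why the choice of bandwidth matters for small-sample validation.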
Funding: The GIZ-Forclime project, a bilateral project between the Indonesian and German governments, funded the field measurements.
Abstract: Background: Currently, the common and feasible way to obtain the most accurate forest biomass estimates requires ground measurements and allometric models. Previous studies have developed allometric equations for estimating tree aboveground biomass (AGB) of tropical dipterocarp forests (TDFs) in Kalimantan (Indonesian Borneo). However, before existing equations are used, the best allometric equation must be selected through a validation that assesses model bias and precision. This study aims to evaluate the validity of local and pantropical equations; develop new allometric equations for estimating tree AGB in the TDFs of Kalimantan; and validate the new equations using independent datasets. Methods: We used 108 tree samples from destructive sampling to develop the allometric equations, with a maximum tree diameter of 175 cm, and another 109 samples from previous studies to validate our equations. We performed ordinary least squares linear regression to explore the relationship between AGB and the predictor variables in natural logarithmic form. Results: Most of the existing local equations tended to be biased and imprecise, with mean relative error and mean absolute relative error greater than 0.1 and 0.3, respectively. We developed new allometric equations for tree AGB estimation in the TDFs of Kalimantan. Validation with an independent dataset showed that our equations were reliable in estimating tree AGB in TDFs. The pantropical equation, which includes tree diameter, wood density and total height as predictor variables, performed only slightly worse than our new models. Conclusions: Our equations improve the precision and reduce the bias of AGB estimates of TDFs. Local models developed from small samples tend to be systematically biased. Validation of existing AGB models is essential before the models are used.
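The fit-then-validate workflow described in the abstract can be sketched as follows. This is an illustration only: it assumes a single predictor (diameter) in the form ln(AGB) = a + b ln(D), uses synthetic trees with arbitrary coefficients rather than the study's data, and takes mean relative error and mean absolute relative error as the mean of (predicted - observed) / observed and of its absolute value, which is a common definition that the abstract does not spell out.

```python
import numpy as np

def fit_log_log(diameter_cm, agb_kg):
    """OLS fit of ln(AGB) = a + b * ln(D); returns (a, b)."""
    b, a = np.polyfit(np.log(diameter_cm), np.log(agb_kg), 1)
    return a, b

def predict_agb(diameter_cm, a, b):
    """Back-transform the log-log model to the original AGB scale."""
    return np.exp(a + b * np.log(diameter_cm))

def relative_errors(predicted, observed):
    """Return (mean relative error, mean absolute relative error)."""
    rel = (predicted - observed) / observed
    return rel.mean(), np.abs(rel).mean()

# Synthetic trees: coefficients and noise level are arbitrary assumptions.
rng = np.random.default_rng(0)
true_a, true_b = -2.0, 2.5

def simulate(n):
    d = rng.uniform(5.0, 175.0, size=n)
    agb = np.exp(true_a + true_b * np.log(d) + rng.normal(0.0, 0.3, size=n))
    return d, agb

d_fit, agb_fit = simulate(108)   # development sample
d_val, agb_val = simulate(109)   # independent validation sample

a_hat, b_hat = fit_log_log(d_fit, agb_fit)
mre, mare = relative_errors(predict_agb(d_val, a_hat, b_hat), agb_val)
```

Computing the errors on trees that were not used for fitting is what makes the check a validation rather than a goodness-of-fit summary.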
Funding: Supported by the National Natural Science Foundation of China (No. 10871072, 11171112 and 11101114), the Scientific Research Fund of Zhejiang Provincial Education Department (Grant No. Y201121276), and the Doctoral Fund of Ministry of Education of China (200900076110001)
Abstract: In this paper, we investigate the estimation of semi-varying coefficient models when the nonlinear covariates are prone to measurement error. With the help of validation sampling, we propose two estimators of the parameter and the coefficient functions by combining dimension reduction and profile likelihood methods, without specifying any error structure equation or assuming any error distribution. We establish the asymptotic normality of the proposed estimators for both the parametric and nonparametric parts and show that the proposed estimators achieve the best convergence rate. Data-driven bandwidth selection methods are also discussed. Simulations are conducted to evaluate the finite-sample properties of the proposed estimation methods.
Funding: Supported by the National Natural Science Foundation of China (10771163)
Abstract: For multivariate failure times with auxiliary covariate information, an estimated pseudo-partial-likelihood estimator under the marginal hazard model with distinguishable baseline hazards has been proposed. However, the asymptotic properties of the corresponding estimated cumulative hazard function have not been studied. In this paper, based on counting process martingales, we use the continuous mapping theorem and the Lenglart inequality to prove the consistency of the estimated cumulative hazard function in the estimated pseudo-partial-likelihood approach.
Funding: This research is supported by the National Natural Science Foundation of the US under Grant No. DMS-0906482.
Abstract: This paper studies nonparametric estimation of the regression function with surrogate outcome data under double-sampling designs, where a proxy response is observed for the full sample and the true response is observed on a validation set. A new approach is proposed for estimating the regression function. The authors first estimate the regression function with a kernel smoother based on the validation subsample, and then improve the estimate by using the information in the incomplete observations from the non-validation subsample and the surrogate response from the full sample. Asymptotic normality of the proposed estimator is derived. The effectiveness of the proposed method is demonstrated via simulations.
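The first step described above, a kernel smoother fitted on the validation subsample alone, can be sketched with a Nadaraya-Watson estimator. This is only an illustration of that initial step: the Gaussian kernel, the fixed bandwidth, the synthetic data and the function name are all assumptions, and the subsequent improvement using the surrogate response is not reproduced.

```python
import numpy as np

def nadaraya_watson(x_eval, x_val, y_val, bandwidth):
    """Kernel-weighted average of validation responses at each evaluation point."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    x_val = np.asarray(x_val, dtype=float)
    y_val = np.asarray(y_val, dtype=float)
    u = (x_eval[:, None] - x_val[None, :]) / bandwidth
    weights = np.exp(-0.5 * u**2)          # Gaussian kernel, unnormalized
    return (weights @ y_val) / weights.sum(axis=1)

# Synthetic validation subsample where the true response is observed.
rng = np.random.default_rng(1)
x_val = rng.uniform(0.0, 1.0, size=60)
y_val = np.sin(2.0 * np.pi * x_val) + rng.normal(0.0, 0.2, size=60)

x_grid = np.linspace(0.0, 1.0, 21)
m_hat = nadaraya_watson(x_grid, x_val, y_val, bandwidth=0.1)
```

Because each estimate is a weighted average of the observed responses, it always lies between the smallest and largest validation response; the bandwidth controls how local that average is.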