Let X be a d-dimensional random vector with unknown density function f(z) = f(z_1, ..., z_d), and let f_n be the nearest neighbor estimator of f proposed by Loftsgaarden and Quesenberry (1965). In this paper, we establish the law of the iterated logarithm for f_n in the general case d ≥ 1, which gives the exact pointwise strong convergence rate of f_n.
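As a rough illustration of the nearest neighbor density estimator discussed in the abstract above, the sketch below uses the common form f_n(x) = k / (n V_d R_k(x)^d), where R_k(x) is the distance from x to its k-th nearest sample point and V_d is the volume of the unit ball in d dimensions. The function names `unit_ball_volume` and `knn_density` are hypothetical, and the normalization is an assumption: the original 1965 definition may differ slightly (for example by using k - 1 in place of k).

```python
import math

import numpy as np


def unit_ball_volume(d):
    """Volume of the unit Euclidean ball in d dimensions."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def knn_density(x, sample, k):
    """k-nearest-neighbor density estimate at the point x.

    Assumed form: k / (n * V_d * R_k(x)**d), with R_k(x) the Euclidean
    distance from x to its k-th nearest sample point. If at least k
    sample points coincide with x, R_k(x) is zero and the result is inf.
    """
    sample = np.asarray(sample, dtype=float)
    if sample.ndim == 1:
        sample = sample[:, None]  # treat a flat array as n points in 1-D
    x = np.asarray(x, dtype=float).reshape(-1)
    n, d = sample.shape
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    dists = np.linalg.norm(sample - x, axis=1)
    r_k = np.partition(dists, k - 1)[k - 1]  # k-th smallest distance
    return k / (n * unit_ball_volume(d) * r_k ** d)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal(2000)
    # The standard normal density at 0 is about 0.399.
    print(knn_density(0.0, data, k=50))
```

The choice of k controls the usual bias-variance trade-off; in asymptotic results of this kind k grows with n but more slowly than n.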
In this paper, Edgeworth expansions for the nearest neighbor-kernel estimate and the random weighting approximation of the conditional density are given, and consistency and the convergence rate are proved.
Background: Tree species recognition is the main bottleneck in remote sensing based inventories aiming to produce an input for species-specific growth and yield models. We hypothesized that a stratification of the target data according to the dominant species could improve the subsequent predictions of species-specific attributes, in particular in study areas strongly dominated by certain species. Methods: We tested this hypothesis and the operational potential to improve the predictions of timber volumes, stratified to Scots pine, Norway spruce and deciduous trees, in a conifer forest dominated by the pine species. We derived predictor features from airborne laser scanning (ALS) data and used Most Similar Neighbor (MSN) and Seemingly Unrelated Regression (SUR) as examples of non-parametric and parametric prediction methods, respectively. Results: The relationships between the ALS features and the volumes of the aforementioned species were considerably different depending on the dominant species. Incorporating the observed dominant species in the predictions improved the root mean squared errors by 13.3-16.4% and 12.6-28.9% based on MSN and SUR, respectively, depending on the species. Predicting the dominant species based on a linear discriminant analysis had an overall accuracy of only 76% at best, which degraded the accuracies of the predicted volumes. Consequently, the predictions that did not consider the dominant species were more accurate than those refined with the predicted species. The MSN method gave slightly better results than models fitted with SUR. Conclusions: According to our results, incorporating information on the dominant species has a clear potential to improve the subsequent predictions of species-specific forest attributes. Determining the dominant species based solely on ALS data is deemed challenging, but important, in particular in areas where the species composition is otherwise seemingly homogeneous except for being dominated by certain species.
This paper deals with estimation in the nonparametric regression model. Since the conditional mean is sensitive to the tail behavior of the conditional distribution of the model, the conditional median is considered instead. For estimation of the conditional median, the sequence of nearest neighbor estimators is shown to be asymptotically normal and consistent.
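To make the idea of a nearest neighbor estimate of the conditional median concrete, here is a minimal sketch. It assumes uniform weights on the k nearest covariate points, which may not match the weighting used in the paper; the function name `knn_conditional_median` and the toy data are hypothetical.

```python
import numpy as np


def knn_conditional_median(x, X, Y, k):
    """Estimate the conditional median of Y given X = x.

    Takes the sample median of the responses whose covariates are the
    k nearest (in Euclidean distance) to x. Uniform weights are an
    assumption made for this sketch.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a flat array as n covariates in 1-D
    Y = np.asarray(Y, dtype=float)
    x = np.asarray(x, dtype=float).reshape(-1)
    if not 1 <= k <= len(Y):
        raise ValueError("k must satisfy 1 <= k <= n")
    dists = np.linalg.norm(X - x, axis=1)
    nearest = np.argsort(dists, kind="stable")[:k]  # ties keep sample order
    return float(np.median(Y[nearest]))


if __name__ == "__main__":
    X = [0.0, 1.0, 2.0, 3.0, 10.0]
    Y = [1.0, 2.0, 100.0, 4.0, 5.0]  # one heavy-tailed response at X = 2
    print(knn_conditional_median(1.0, X, Y, k=3))
```

In the toy data the outlying response 100 falls among the three nearest neighbors of x = 1; the local median is unaffected by it, whereas a local mean over the same neighbors would be pulled far upward. This is the tail sensitivity the abstract refers to.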
The nearest neighbor (n.n.) method and its related methods are widely used in density and hazard function estimation. Even though the asymptotic normality of the n.n. density estimate is well known (see [1]), similar results for the n.n. hazard estimate have not been shown in the literature. In this paper, we develop a different approach to deal with the n.n. type estimator. For a mixed censorship-truncation model, we show that, under mild conditions, the n.n. estimate can be approximated by an estimate formed with a proper fixed bandwidth sequence, and we derive the asymptotic normality as a consequence.
In many medical studies, the prevalence of interval-censored data is increasing due to periodic monitoring of the progression status of a disease. In the nonparametric regression model, when the response variable is subject to interval censoring, the regression function cannot be estimated directly by traditional methods. With the censored data, we construct a new response variable which has the same conditional expectation as the original one. Based on the new variable, we obtain a nearest neighbor estimator of the regression function. It is established that the estimator has strong consistency and asymptotic normality. Simulation results are reported.
Funding (law of the iterated logarithm for the nearest neighbor density estimator): Research supported by the National Natural Science Foundation of China.
Funding (species-specific timber volume prediction from ALS data): Financed by the Finnish Funding Agency for Innovation (Tekes) and its business and research partners.
Funding (nearest neighbor regression with interval-censored data): Supported by the National Natural Science Foundation of China (Grant Nos. 71001046, 11171112, 11101114, 11201190), the National Statistical Science Research Major Program of China (Grant No. 2011LZ051), and the Science Foundation of the Education Department of Jiangxi Province (Grant No. Gjj11389).