Dear Editor, The main components of multi-view geometry and computer vision are robust pose estimation and feature matching. This letter discusses how to recover two-view geometry and match features between a pair of images, and presents MCNet (a multiscale clustering network) as an algorithm for extracting multiscale features. It can identify the true inliers among the established putative correspondences, where outliers may degrade the geometry estimation. In particular, the proposed MCNet is based on graph clustering.
The decomposition-based vector autoregressive (DVAR) model provides a new framework for scrutinizing the efficiency of technical analysis in forecasting stock returns. However, its relationships with other technical indicators remain unknown. This paper investigates the relationships between the DVAR model and the Japanese candlestick indicators using simulations, theoretical explanations and empirical studies. The main finding is that both the lower and the upper shadows of Japanese candlesticks contribute, in the Granger sense, to the explanatory power of the DVAR model, and thus provide useful information for improving DVAR forecasts. This means that the information contained in the lower and upper shadows should be used when modeling stock returns with DVAR. Empirical studies on the China SSEC stock index demonstrate that the DVAR model with upper and lower shadows as exogenous variables yields informative and valuable out-of-sample forecasts.
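As a small illustration of the candlestick quantities mentioned in the abstract above, the sketch below computes the upper and lower shadows of a single bar using the common textbook definition. The function name `candlestick_shadows` is hypothetical, and the abstract does not say whether the paper defines the shadows on raw or log prices, so raw prices are an assumption here.

```python
def candlestick_shadows(open_, high, low, close):
    """Return (upper_shadow, lower_shadow) for one price bar.

    Assumed textbook definition on raw prices (not taken from the paper):
      upper shadow = high - max(open, close)
      lower shadow = min(open, close) - low
    """
    body_top = max(open_, close)
    body_bottom = min(open_, close)
    if high < body_top or low > body_bottom:
        raise ValueError("high/low must bracket the open and close prices")
    return high - body_top, body_bottom - low
```

In a DVAR-style regression these two series would enter as exogenous variables; how they are scaled or lagged is left open here because the abstract does not specify it.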
Let {X(t), t ≥ 0} be a centered stationary Gaussian process with correlation function r(t) such that 1 - r(t) is asymptotic to a regularly varying function. With T being a nonnegative random variable independent of X(t), the exact asymptotics of P(sup_{t∈[0,T]} X(t) > x) are considered as x → ∞.
The principal contradiction facing Chinese society has evolved to be that between imbalanced and inadequate development and the people’s ever-growing needs for a better life. Given China’s vision for achieving moderate prosperity, it is relevant to conduct theoretical and empirical studies on the nation’s development imbalances. As a quantitative index, the Tsinghua China Balanced Development Index measures the extent to which development is uneven and insufficient across regions, reflecting the progress and shortfalls in China’s efforts to promote balanced development. Our findings provide implications for how policymakers may help people’s expectations for a better life materialize by spurring balanced economic, social, environmental and livelihood development across regions.
In this paper, we propose a new estimation method for a nonparametric hidden Markov model (HMM), in which both the emission model and the transition matrix are nonparametric, and a semiparametric HMM, in which the transition matrix is parametric while the emission models are nonparametric. The estimation is based on a novel composite likelihood method, where the pairs of consecutive observations are treated as independent bivariate random variables. The model is thereby transformed into a mixture model, and a modified expectation-maximization (EM) algorithm is developed to compute the maximum composite likelihood estimator. We systematically study asymptotic properties for both the nonparametric HMM and the semiparametric HMM. We also propose a generalized likelihood ratio test to choose between the nonparametric HMM and the semiparametric HMM. We derive the asymptotic distribution and prove the Wilks phenomenon for the proposed test statistic. Simulation studies and an application to volatility clustering analysis of the volatility index of the Chicago Board Options Exchange (CBOE) are conducted to demonstrate the effectiveness of the proposed methods.
This paper studies estimation of a partially specified spatial autoregressive model with a heteroskedastic error term. Under the assumption of exogenous regressors and an exogenous spatial weighting matrix, the unknown parameter is estimated by instrumental variable estimation. Under certain sufficient conditions, the proposed estimator for the finite-dimensional parameters is shown to be root-n consistent and asymptotically normally distributed; the proposed estimator for the unknown function is shown to be consistent and asymptotically normally distributed as well, though at a rate slower than root-n. Consistent estimators for the asymptotic variance-covariance matrices of both estimators are provided. Monte Carlo simulations suggest that the proposed procedure has some practical value.
In this article, we consider a class of seemingly unrelated single-index regression models. By taking the contemporaneous correlation among equations into account, we construct weighted estimators (WEs) for the unknown coefficient parameters and improved local polynomial estimators for the unknown functions. We establish the asymptotic normality of these estimators, and show that both are asymptotically more efficient than those ignoring the contemporaneous correlation. The performance of the proposed procedures is evaluated through simulation studies.
Multivariate longitudinal data arise frequently in a variety of applications, where multiple outcomes are measured repeatedly from the same subject. In this paper, we first propose a two-stage weighted least squares estimation procedure for the regression coefficients when the random error follows an irregular autoregressive (AR) process, and establish asymptotic normality for the resulting estimators. We then apply the smoothly clipped absolute deviation (SCAD) variable selection approach to determine the order of the AR error process. We further propose a test statistic to check whether multiple responses are correlated at the same observation time, and derive its asymptotic distribution. Several simulated examples and a real data analysis are presented to illustrate the finite-sample performance of the proposed method.
The outbreak of COVID-19 on the Diamond Princess cruise ship has attracted much attention. Motivated by the PCR testing data on the Diamond Princess, we propose a novel cure mixture nonparametric model to investigate the detection pattern. It combines a logistic regression for the probability of susceptible subjects with a nonparametric distribution for the detection of infected individuals. Maximum likelihood estimators are proposed. The resulting estimators are shown to be consistent and asymptotically normal. Simulation studies demonstrate that the proposed approach is appropriate for practical use. Finally, we apply the proposed method to PCR testing data on the Diamond Princess to show its practical utility.
This article proposes a simple nonparametric estimator of the quantile residual lifetime function under left-truncated and right-censored data. The consistency and asymptotic normality of this estimator are proved and the variance expression is calculated. Two bootstrap procedures are employed in the simulation study: a naive bootstrap and the bootstrap of Zeng and Lin (2008), the latter being 4000 times faster than the former. The numerical results from both methods show that our estimating approach works well. A real data example is used to illustrate its application.
This paper considers the monotonic transformation model with an unspecified transformation function and an unknown error function, and gives its monotone rank estimation with length-biased and right-censored data. The estimator is shown to be √n-consistent and asymptotically normal. Numerical simulation studies reveal good finite-sample performance, and the estimator is illustrated with the Oscar data set. The variance can be estimated by a resampling method that repeatedly perturbs the U-statistic objective function.
Aims: In ecology and conservation biology, the number of species counted in a biodiversity study is a key metric but is usually a biased underestimate of total species richness because many rare species are not detected. Moreover, comparing species richness among sites or samples is a statistical challenge because the observed number of species is sensitive to the number of individuals counted or the area sampled. For individual-based data, we treat a single, empirical sample of species abundances from an investigator-defined species assemblage or community as a reference point for two estimation objectives under two sampling models: estimating the expected number of species (and its unconditional variance) in a random sample of (i) a smaller number of individuals (multinomial model) or a smaller area sampled (Poisson model) and (ii) a larger number of individuals or a larger area sampled. For sample-based incidence (presence–absence) data, under a Bernoulli product model, we treat a single set of species incidence frequencies as the reference point to estimate richness for smaller and larger numbers of sampling units. Methods: The first objective is a problem in interpolation that we address with classical rarefaction (multinomial model) and Coleman rarefaction (Poisson model) for individual-based data and with sample-based rarefaction (Bernoulli product model) for incidence frequencies. The second is a problem in extrapolation that we address with sampling-theoretic predictors for the number of species in a larger sample (multinomial model), a larger area (Poisson model) or a larger number of sampling units (Bernoulli product model), based on an estimate of asymptotic species richness. Although published methods exist for many of these objectives, we bring them together here with some new estimators under a unified statistical and notational framework. This novel integration of mathematically distinct approaches allowed us to link interpolated (rarefaction) curves and extrapolated curves to plot a unified species accumulation curve for empirical examples. We provide new, unconditional variance estimators for classical, individual-based rarefaction and for Coleman rarefaction, long missing from the toolkit of biodiversity measurement. We illustrate these methods with datasets for tropical beetles, tropical trees and tropical ants. Important Findings: Surprisingly, for all datasets we examined, the interpolation (rarefaction) curve and the extrapolation curve meet smoothly at the reference sample, yielding a single curve. Moreover, curves representing 95% confidence intervals for interpolated and extrapolated richness estimates also meet smoothly, allowing rigorous statistical comparison of samples not only for rarefaction but also for extrapolated richness values. The confidence intervals widen as the extrapolation moves further beyond the reference sample, but the method gives reasonable results for extrapolations up to about double or triple the original abundance or area of the reference sample. We found that the multinomial and Poisson models produced indistinguishable results, in units of estimated species, for all estimators and datasets. For sample-based abundance data, which allows the comparison of all three models, the Bernoulli product model generally yields lower richness estimates for rarefied data than either the multinomial or the Poisson models because of the ubiquity of non-random spatial distributions in nature.
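The interpolation step described in the abstract above (classical, individual-based rarefaction) can be sketched in a few lines. This is a minimal illustration using the standard hypergeometric rarefaction formula, in which a species with x individuals in a reference sample of n individuals is absent from a random subsample of m individuals with probability C(n - x, m) / C(n, m). The function name `rarefied_richness` and the example abundances are hypothetical; the extrapolation and variance estimators of the paper are not reproduced.

```python
from math import comb


def rarefied_richness(abundances, m):
    """Expected number of species in a random subsample of m individuals
    drawn without replacement from the reference sample (classical
    individual-based rarefaction)."""
    counts = [x for x in abundances if x > 0]
    n = sum(counts)  # reference sample size
    if not 0 <= m <= n:
        raise ValueError("m must lie between 0 and the reference sample size")
    total = comb(n, m)
    # Each species contributes its probability of appearing at least once.
    return sum(1 - comb(n - x, m) / total for x in counts)


# Hypothetical reference sample: four species, ten individuals.
reference = [5, 3, 1, 1]
curve = [rarefied_richness(reference, m) for m in range(0, 11)]
```

Evaluating the function for m = 0, 1, ..., n traces the rarefaction curve, which ends at the observed richness when m equals the reference sample size.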
In this paper, the authors generalize the concept of asymptotically almost negatively associated random variables from the classical probability space to the upper expectation space. Within this framework, the authors prove several types of Rosenthal's inequalities for sub-additive expectations. Finally, the authors prove a strong law of large numbers as an application of Rosenthal's inequalities.
We are concerned with robust estimation procedures for the parameters in partially linear models with large-dimensional covariates. To enhance interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n^{1/2}), where n is the sample size. We show that the robust estimate of the linear component performs asymptotically as well as its oracle counterpart, which assumes that the baseline function and the unimportant covariates are known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of the nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures.
This paper considers the estimation of a Box-Cox transformation model with varying coefficients. A two-step approach is proposed in which the first step estimates the varying coefficients nonparametrically for any given parameter α in the transformation function. A one-dimensional search over α is then carried out based on a least absolute deviation criterion function. The validity of our estimator does not require an independence assumption and is thus robust to conditional heteroscedasticity. A simulation study shows reasonably good finite-sample performance. Additionally, a comprehensive empirical study is presented.
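For readers unfamiliar with the transformation indexed by α, here is a minimal sketch of the standard Box-Cox transform. The abstract does not state the exact form used in the paper, so the textbook definition (with the logarithm as the α = 0 limit) is an assumption, and the function name `box_cox` is hypothetical. The two-step estimator itself is not reproduced.

```python
import math


def box_cox(y, alpha):
    """Standard Box-Cox transform of a positive value y.

    (y**alpha - 1) / alpha for alpha != 0, and log(y) for alpha == 0,
    which is the limit of the first expression as alpha -> 0.
    """
    if y <= 0:
        raise ValueError("the Box-Cox transform requires y > 0")
    if alpha == 0:
        return math.log(y)
    return (y ** alpha - 1) / alpha
```

A one-dimensional search of the kind the abstract mentions would evaluate a criterion function on a grid of α values, applying this transform to the response at each grid point.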
We study the properties of the Lasso in the high-dimensional partially linear model where the number of variables in the linear part can be greater than the sample size. We use a truncated series expansion based on polynomial splines to approximate the nonparametric component in this model. Under a sparsity assumption on the regression coefficients of the linear component and some regularity conditions, we derive oracle inequalities for the prediction risk and the estimation error. We also provide sufficient conditions under which the Lasso estimator is selection consistent for the variables in the linear part of the model. In addition, we derive the rate of convergence of the estimator of the nonparametric function. We conduct simulation studies to evaluate the finite-sample performance of variable selection and nonparametric function estimation.
Length-biased data arise in many important fields, including epidemiological cohort studies, cancer screening trials and labor economics. Analysis of such data has attracted much attention in the literature. In this paper we propose a quantile regression approach for analyzing right-censored and length-biased data. We derive an inverse probability weighted estimating equation corresponding to the quantile regression to correct the bias due to length-biased sampling and informative censoring. The method can easily handle informative censoring induced by length-biased sampling, which is an appealing feature since it is generally difficult to obtain unbiased estimates of risk factors in the presence of length bias and informative censoring. We establish the consistency and asymptotic distribution of the proposed estimator using empirical process techniques. A resampling method is adopted to estimate the variance of the estimator. We conduct simulation studies to evaluate its finite-sample performance and use a real data set to illustrate the application of the proposed method.
Funding: Supported by the National Natural Science Foundation of China (61703260, 62173252).
Funding: Supported by the National Natural Science Foundation of China under Grant No. 71401033.
Funding: Supported by the Scientific Research Fund of Sichuan Provincial Education Department (12ZB082), the Scientific Research Cultivation Project of Sichuan University of Science & Engineering (2013PY07), the Scientific Research Fund of Shanghai University of Finance and Economics (2017110080), and the Opening Project of Sichuan Province University Key Laboratory of Bridge Non-destruction Detecting and Engineering Computing (2018QZJ01).
Funding: This work is the final result of the “Tsinghua China Balanced Development Index” Project of the China Data Center, Tsinghua University. Sponsored by the Minshan Public-Interest Fund of the China Siyuan Foundation for Poverty Alleviation (CSFPA), with special sponsorship from the China Post-Doctoral Science Foundation (2018T110079) and general sponsorship from the China Post-Doctoral Science Foundation (2017M620719).
Funding: This work was funded by the General Programme of the National Natural Science Foundation of China (71273136) and the 2013 Philosophy and Social Science Foundation of the Jiangsu Provincial Department of Education of China (2013SJB6300087), and was also sponsored by the Qing Lan Project of the Jiangsu Provincial Department of Education of China. We are thankful to Prof. Isabel de Felipe and Prof. Julian Briz of UPM for their valuable discussions.
Funding: This article is funded by the National Natural Science Foundation of China (11161052), the Guangxi Natural Science Foundation of China (201 ljjA10044) and the Guangxi Education Hall Project (201012MS183).
Funding: Supported by the Shanghai Young Talent Development Program and the Innovative Research Team of Shanghai University of Finance and Economics (Grant No. 2020110930); supported by the Department of Energy of the USA (Grant No. DE-EE0008574).
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 71371118, 71471117, 11101442, 11471086), the Foundation for Distinguished Young Talents in Higher Education of Guangdong (Grant No. LYM09011), the Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT, IRT13077), the State Key Program of National Natural Science of China (Grant No. 71331006), and the Graduate Innovation Fund Project of Shanghai University of Finance and Economics (Grant No. CXJJ-2011-444).
Funding: Supported by the National Natural Science Foundation of China (No. 11471140).
Funding: Supported by the Fundamental Research Funds of Shandong University (Grant No. 2018GN050), the Academic Prosperity Program provided by the School of Economics, Shandong University, and the Taishan Scholar Program of Shandong Province; supported by the National Natural Science Foundation of China (Grant No. 11871323), the State Key Program in the Major Research Plan of the National Natural Science Foundation of China (Grant No. 91546202), and the Program for Innovative Research Team of Shanghai University of Finance and Economics.
Funding: The National Natural Science Foundation of China [grant numbers 71931004, 11901200, 71971083, and 11971170] and the National Key R&D Program of China [grant numbers 2021YFA1000100, 2021YFA1000101].
Funding: Supported by the National Natural Science Foundation of China (Grant No. 71271128), the State Key Program of the National Natural Science Foundation of China (Grant No. 71331006), NCMIS and Shanghai University of Finance and Economics through Project 211 Phase IV, Shanghai First-class Discipline A, the Outstanding PhD Dissertation Cultivation Funds of Shanghai University of Finance and Economics, and the Graduate Education Innovation Funds of Shanghai University of Finance and Economics (Grant No. CXJJ-2011-438).
Funding: Supported by the Graduate Innovation Foundation of Shanghai University of Finance and Economics (Grant No. CXJJ2013-451), the Cultivation Foundation of Excellent Doctor Degree Dissertation of Shanghai University of Finance and Economics (Grant No. YBPY201504), the Program of the Educational Department of Fujian Province (Grant Nos. JA14079 and JA12060), the Natural Science Foundation of Fujian Province (Grant Nos. 2014J01001 and 2012J01028), the National Natural Science Foundation of China (Grant No. 71271128), the State Key Program of the National Natural Science Foundation of China (Grant No. 71331006), the National Center for Mathematics and Interdisciplinary Sciences, the Key Laboratory of Random Complex Structures and Data Science, Chinese Academy of Sciences, Shanghai First-class Discipline A and the Innovative Research Team of Shanghai University of Finance and Economics, and the Program for Changjiang Scholars Innovative Research Team of the Ministry of Education (Grant No. IRT13077).
Funding: US National Science Foundation (DEB 0639979 and DBI 0851245 to R.K.C.; DEB-0541936 to N.J.G.; DEB-0424767 and DEB-0639393 to R.L.C.; DEB-0640015 to J.T.L.); the US Department of Energy (022821 to N.J.G.); the Taiwan National Science Council (97-2118-M007-MY3 to A.C.); and the University of Connecticut Research Foundation (to R.L.C.).
Abstract: Aims: In ecology and conservation biology, the number of species counted in a biodiversity study is a key metric but is usually a biased underestimate of total species richness because many rare species are not detected. Moreover, comparing species richness among sites or samples is a statistical challenge because the observed number of species is sensitive to the number of individuals counted or the area sampled. For individual-based data, we treat a single, empirical sample of species abundances from an investigator-defined species assemblage or community as a reference point for two estimation objectives under two sampling models: estimating the expected number of species (and its unconditional variance) in a random sample of (i) a smaller number of individuals (multinomial model) or a smaller area sampled (Poisson model) and (ii) a larger number of individuals or a larger area sampled. For sample-based incidence (presence–absence) data, under a Bernoulli product model, we treat a single set of species incidence frequencies as the reference point to estimate richness for smaller and larger numbers of sampling units. Methods: The first objective is a problem in interpolation that we address with classical rarefaction (multinomial model) and Coleman rarefaction (Poisson model) for individual-based data and with sample-based rarefaction (Bernoulli product model) for incidence frequencies. The second is a problem in extrapolation that we address with sampling-theoretic predictors for the number of species in a larger sample (multinomial model), a larger area (Poisson model) or a larger number of sampling units (Bernoulli product model), based on an estimate of asymptotic species richness. Although published methods exist for many of these objectives, we bring them together here with some new estimators under a unified statistical and notational framework. This novel integration of mathematically distinct approaches allowed us to link interpolated (rarefaction) curves and extrapolated curves to plot a unified species accumulation curve for empirical examples. We provide new, unconditional variance estimators for classical, individual-based rarefaction and for Coleman rarefaction, long missing from the toolkit of biodiversity measurement. We illustrate these methods with datasets for tropical beetles, tropical trees and tropical ants. Important Findings: Surprisingly, for all datasets we examined, the interpolation (rarefaction) curve and the extrapolation curve meet smoothly at the reference sample, yielding a single curve. Moreover, curves representing 95% confidence intervals for interpolated and extrapolated richness estimates also meet smoothly, allowing rigorous statistical comparison of samples not only for rarefaction but also for extrapolated richness values. The confidence intervals widen as the extrapolation moves further beyond the reference sample, but the method gives reasonable results for extrapolations up to about double or triple the original abundance or area of the reference sample. We found that the multinomial and Poisson models produced indistinguishable results, in units of estimated species, for all estimators and datasets. For sample-based abundance data, which allows the comparison of all three models, the Bernoulli product model generally yields lower richness estimates for rarefied data than either the multinomial or the Poisson models because of the ubiquity of non-random spatial distributions in nature.
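The interpolation step named in this abstract, classical individual-based rarefaction, can be illustrated with a minimal sketch. It uses the standard textbook rarefaction formula: a species with abundance x in a reference sample of n individuals is absent from a random subsample of m individuals with probability C(n - x, m) / C(n, m). The function name `rarefied_richness` and the toy abundances are assumptions made for illustration; the abstract does not give the paper's own estimators or variance formulas, and none are reproduced here.

```python
from math import comb


def rarefied_richness(abundances, m):
    """Expected species count in a random subsample of m individuals.

    `abundances` holds the per-species counts of the reference sample
    (hypothetical input format). Individuals are drawn without replacement.
    """
    counts = [x for x in abundances if x > 0]
    n = sum(counts)
    if not 0 <= m <= n:
        raise ValueError("m must lie between 0 and the reference sample size")
    total = comb(n, m)
    # A species with x individuals is missed with probability
    # C(n - x, m) / C(n, m); summing the complements gives expected richness.
    return sum(1 - comb(n - x, m) / total for x in counts)


if __name__ == "__main__":
    reference = [5, 3, 1, 1]  # toy data, not from the paper
    for m in (1, 5, 10):
        print(m, rarefied_richness(reference, m))
```

At m equal to the reference sample size the function returns the observed richness, which is the point where an interpolated curve would join any extrapolated curve.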
Funding: Supported by the National Natural Science Foundation of China (No. 11601280) and the Innovative Research Team of Shanghai University of Finance and Economics (No. 13122402).
Abstract: In this paper, the authors generalize the concept of asymptotically almost negatively associated random variables from the classical probability space to the upper expectation space. Within this framework, the authors prove several types of Rosenthal's inequalities for sub-additive expectations. Finally, the authors prove a strong law of large numbers as an application of Rosenthal's inequalities.
Funding: Supported by the National Institute on Drug Abuse (Grant Nos. R21-DA024260 and P50-DA10075), National Natural Science Foundation of China (Grant Nos. 11071077, 11371236, 11028103, 11071022 and 11028103), Innovation Program of Shanghai Municipal Education Commission, Pujiang Project of Science and Technology Commission of Shanghai Municipality (Grant No. 12PJ1403200), and Program for New Century Excellent Talents, Ministry of Education of China (Grant No. NCET-12-0901).
Abstract: We are concerned with robust procedures for estimating the parameters in partially linear models with large-dimensional covariates. To enhance interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n^(1/2)), where n is the sample size. We show that the robust estimate of the linear component performs asymptotically as well as its oracle counterpart, which assumes the baseline function and the unimportant covariates were known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of the nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 71171127, 71471108 and 71601105), the Open Project Program in the Key Laboratory of Mathematical Economics (SUFE) (Grant No. 201309KF02), Ministry of Education of the People's Republic of China, the Program for Changjiang Scholars and Innovative Research Team in Shanghai University of Finance and Economics, and the Innovative Research Team of Econometrics in Shanghai Academy of Social Sciences.
Abstract: This paper considers the estimation of a Box-Cox transformation model with varying coefficients. A two-step approach is proposed in which the first step estimates the varying coefficients nonparametrically for any given parameter α in the transformation function. A one-dimensional search over α is then carried out based on a least absolute deviation criterion function. The validity of the estimator does not require an independence assumption, so it is robust to conditional heteroscedasticity. A simulation study shows reasonably good finite-sample performance. Additionally, a comprehensive empirical study is carried out.
Abstract: We study the properties of the Lasso in the high-dimensional partially linear model where the number of variables in the linear part can be greater than the sample size. We use truncated series expansion based on polynomial splines to approximate the nonparametric component in this model. Under a sparsity assumption on the regression coefficients of the linear component and some regularity conditions, we derive the oracle inequalities for the prediction risk and the estimation error. We also provide sufficient conditions under which the Lasso estimator is selection consistent for the variables in the linear part of the model. In addition, we derive the rate of convergence of the estimator of the nonparametric function. We conduct simulation studies to evaluate the finite-sample performance of variable selection and nonparametric function estimation.
Funding: National Natural Science Funds for Distinguished Young Scholar (No. 70825004), Creative Research Groups of China (No. 10721101), Shanghai University of Finance and Economics Project 211 Phase III, and Shanghai Leading Academic Discipline Project (No. B803).
Abstract: Length-biased data arise in many important fields, including epidemiological cohort studies, cancer screening trials and labor economics. Analysis of such data has attracted much attention in the literature. In this paper we propose a quantile regression approach for analyzing right-censored and length-biased data. We derive an inverse probability weighted estimating equation corresponding to the quantile regression to correct the bias due to length-biased sampling and informative censoring. This method can easily handle informative censoring induced by length-biased sampling. This is an appealing feature of our proposed method since it is generally difficult to obtain unbiased estimates of risk factors in the presence of length bias and informative censoring. We establish the consistency and asymptotic distribution of the proposed estimator using empirical process techniques. A resampling method is adopted to estimate the variance of the estimator. We conduct simulation studies to evaluate its finite-sample performance and use a real data set to illustrate the application of the proposed method.
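For readers unfamiliar with the quantile regression criterion that this abstract builds on, the sketch below shows the ordinary, unweighted check (pinball) loss and how minimizing it over a constant recovers a sample quantile. It is only an illustration of the basic criterion: it does not implement the inverse probability weighting, censoring adjustment, or resampling described in the abstract, and the function names and toy data are assumptions.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: tau * r if r >= 0, else (tau - 1) * r."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def quantile_by_check_loss(values, tau):
    """Sample point minimizing the total check loss at level tau.

    A minimizer of the total check loss can always be found among the
    observed values, so a search over the data points suffices.
    """
    return min(
        sorted(values),
        key=lambda q: sum(check_loss(v - q, tau) for v in values),
    )


if __name__ == "__main__":
    data = [1, 2, 3, 4, 100]  # toy data with one large outlier
    print(quantile_by_check_loss(data, 0.5))
```

With tau = 0.5 the loss is half the absolute deviation, so the minimizer is the median and the outlier has little influence on it.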