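Abstract: In big-data application domains, the diversity and complexity of the real world often give rise to large-scale multi-class data mining problems. Traditional multi-class classification methods suffer from unbalanced hyperplane updates on the one hand and low learning efficiency on the other, so they cannot classify complex multi-class data efficiently. To address this problem, this paper proposes an improved dynamical active multiple classification (DYA) method. By introducing the concepts of deadlock and activation into the active multi-class classification process, the method dynamically controls whether each sample takes part in active learning as the classifiers are continually updated. In addition, it adopts quantile counting and rotating learning so that the classifiers for the different classes are learned and updated in a balanced way. Experimental results show that the proposed dynamical active multiple classification method effectively improves the learning efficiency and generalization performance of the model.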
Funding: The Soft Science Research Project of Ministry of Housing and Urban-Rural Development of China (No. 2008-k5-14)
Abstract: In order to provide important parameters for schedule design, a decision-making basis for transit operation management and a reference for passengers traveling by bus, bus transit travel time reliability is analyzed and evaluated based on automatic vehicle location (AVL) data. Based on a statistical analysis of bus transit travel time, six indices are proposed: the coefficient of variation, the width of the travel time distribution, the mean commercial speed, the congestion frequency, the planning time index and the buffer time index. Moreover, a framework for evaluating bus transit travel time reliability is constructed. Finally, a case study on a bus route in Suzhou is conducted. Results show that the proposed evaluation index system is simple and intuitive, and that it can effectively reflect the efficiency and stability of bus operations. A distinguishing feature of bus transit travel time reliability is its temporal pattern: it varies across time periods.
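The abstract above names its indices without defining them. As a rough sketch only, three of them can be computed as follows, assuming the definitions commonly used in travel time reliability work: buffer time index = (95th percentile − mean) / mean, and planning time index = 95th percentile / free-flow travel time. These definitions, the `free_flow_time` parameter and the sample trip times are assumptions for illustration and are not taken from the paper.

```python
import statistics


def percentile(values, p):
    """Percentile with linear interpolation between order statistics, p in [0, 100]."""
    xs = sorted(values)
    k = (len(xs) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)


def reliability_indices(travel_times, free_flow_time):
    """Assumed definitions; the paper may define these indices differently."""
    mean_t = statistics.mean(travel_times)
    sd_t = statistics.stdev(travel_times)  # sample standard deviation
    p95 = percentile(travel_times, 95)
    return {
        "coefficient_of_variation": sd_t / mean_t,
        "buffer_time_index": (p95 - mean_t) / mean_t,
        "planning_time_index": p95 / free_flow_time,
    }


# Hypothetical one-way trip times in minutes (not data from the study).
trips = [30, 32, 31, 35, 40, 33, 36, 31, 45, 34]
indices = reliability_indices(trips, free_flow_time=28)
print(indices)
```

Computing the same indices separately for peak and off-peak trips would be one way to expose the temporal pattern the abstract mentions.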
Abstract: This paper examines the effect on rates of achievement of the interaction of student gender and school socioeconomic status, using ordinary least squares and probit regressions. The data used is school achievement by students taking externally assessed accounting standards in their final year at New Zealand secondary schools, and covers the period 2004 to 2008. The paper concludes that the interaction of gender and school decile has a significant impact on achievement rates for Maori, Pacific Island and Asian girls relative to Maori, Pacific Island and Asian boys in low decile schools. A secondary contribution of this paper is to demonstrate that comparing the achievement of gender or socioeconomic status groups in isolation is insufficient when examining academic performance and evaluating subject curriculum. Interactions between variables need to be considered, whether they be gender and decile as this paper examines, or other variables not examined within this paper.
Funding: supported by the National Natural Science Foundation of China (Grant Nos. 41471316, 41571383, 41671389); the Priority Academic Program Development of Jiangsu Higher Education Institutions-PAPD (Grant No. 164320H101); the Key Project of Natural Science Research of Anhui Provincial Department of Education (Grant No. KJ2015A171)
Abstract: Terrain texture analysis is an important method of digital terrain analysis in quantitative geomorphological research and in the exploration of the spatial heterogeneity and autocorrelation of terrain features. However, a major issue often neglected in previous studies is the calculation unit of the terrain texture, that is, the stability analysis unit. As the test size increases, the derived terrain textures become increasingly similar so that their differences can be ignored. The test size at which this happens is defined as the stability analysis unit. This study randomly selected 48 areas within the Loess Plateau in northern Shaanxi, China, as the study sites and used the gray level co-occurrence matrix to calculate the terrain texture. The stability analysis unit of the terrain texture was then extracted, and its spatial distribution pattern in the Loess Plateau was studied using a spatial interpolation method. Four terrain texture metrics, i.e., homogeneity, energy, correlation, and contrast, were extracted on the basis of the stability analysis unit, and the spatial variation patterns of these parameters were studied. Results showed that the spatial distribution pattern and the terrain texture metrics reflected a trend of high–low–high from north to south, which correlated with the spatial distribution of the landforms on the Loess Plateau. In addition, the terrain texture measures were significantly correlated with the terrain factors of gully density and slope, and this relationship showed that terrain texture measures based on the stability analysis unit could reflect the basic characteristics of terrain morphology. The stability analysis unit provided a reasonable analytical scale for terrain texture analysis and could be used as a measure of the regional topography to accurately describe basic terrain characteristics.
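The four metrics named above can be illustrated with a minimal gray level co-occurrence matrix (GLCM) sketch. It assumes a symmetric, normalized GLCM for a single horizontal offset and takes "energy" to be the sum of squared matrix entries (some libraries use the square root of that sum instead). The 4×4 toy image stands in for a quantized elevation raster and is hypothetical; the study's window sizes, offsets and gray-level quantization are not given in the abstract.

```python
def glcm(image, dx=1, dy=0, levels=4):
    """Symmetric, normalized gray level co-occurrence matrix for offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_metrics(p):
    """Contrast, homogeneity, energy and correlation of a symmetric GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu = sum(i * v for i, _, v in cells)  # row mean equals column mean (symmetry)
    var = sum((i - mu) ** 2 * v for i, _, v in cells)
    cov = sum((i - mu) * (j - mu) * v for i, j, v in cells)
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "correlation": cov / var if var > 0 else float("nan"),
    }


# Hypothetical raster quantized to four gray levels (0..3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image)
metrics = texture_metrics(p)
print(metrics)
```

Repeating this computation over windows of increasing size and watching the metrics level off is one plausible way to look for a stability analysis unit, though the abstract does not spell out the study's procedure.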
Funding: the Natural Science Foundation of Jiangsu Province, China (BK2004132).
Abstract: According to the biased angles provided by the bistatic sensors, the necessary condition of observability and the Cramér-Rao lower bounds for the bistatic system are derived and analyzed, respectively. Additionally, a dual Kalman filter method is presented with the purpose of eliminating the effect of biased angles on the state variable estimation. Finally, Monte Carlo simulations are conducted in the observable scenario. Simulation results show that the proposed theory holds true, and that the dual Kalman filter method can estimate the state variable and the biased angles simultaneously. Furthermore, the estimated results can achieve their Cramér-Rao lower bounds.
Abstract: The procedure of the stratified double quartile ranked set sampling (SDQRSS) method is introduced to estimate the population mean. SDQRSS is compared with simple random sampling (SRS), stratified ranked set sampling (SRSS) and stratified simple random sampling (SSRS). It is shown that the SDQRSS estimator is an unbiased estimator of the population mean and is more efficient than SRS, SRSS and SSRS for symmetric and asymmetric distributions. In addition, SDQRSS can increase the efficiency of the mean estimator for specific values of the sample size.
Funding: supported by the National Basic Research Program of China ("973" Program) (Grant No. 2007CB714104); the National Natural Science Foundation of China (Grant No. 50779013)
Abstract: An existing Bayesian flood frequency analysis method is applied to quantile estimation for the Pearson type three (P-III) probability distribution. The method couples prior and sample information under the framework of the Bayesian formula, and the Markov Chain Monte Carlo (MCMC) sampling approach is used to estimate posterior distributions of parameters. Unlike the original sampling algorithm (i.e. importance sampling) used in the existing approach, this paper uses the adaptive Metropolis (AM) sampling technique to generate a large number of parameter sets from the Bayesian posterior distributions of the parameters. From these, the sampling distributions of quantiles, i.e. the hydrological design values, are constructed. With these sampling distributions, the Bayesian method can provide not only various kinds of point estimators for quantiles, e.g. the expectation estimator, but also a quantitative evaluation of the uncertainties of these point estimators. Therefore, the Bayesian method brings more useful information to hydrological frequency analysis. As an example, the flood extreme sample series at a gauge is used to demonstrate the application procedure.
Funding: supported by the National Natural Science Foundation of China under Grant No. 11165016
Abstract: The structural equation model (SEM) is a multivariate analysis tool that has been widely applied in many fields such as the biomedical and social sciences. In the traditional SEM, it is often assumed that random errors and explanatory latent variables follow the normal distribution, and that the effect of explanatory latent variables on outcomes can be formulated by a mean regression-type structural equation. But this SEM may be inappropriate in cases where random errors or latent variables are highly nonnormal. The authors develop a new SEM, called the quantile SEM (QSEM), by allowing for a quantile regression-type structural equation without distributional assumptions on random errors and latent variables. A Bayesian empirical likelihood (BEL) method is developed to simultaneously estimate parameters and latent variables based on the estimating equation method. A hybrid algorithm combining the Gibbs sampler and the Metropolis-Hastings algorithm is presented to sample the observations required for statistical inference. Latent variables are imputed by the estimated density function and the linear interpolation method. A simulation study and an example are presented to investigate the performance of the proposed methodologies.
Funding: supported by National Natural Science Foundation of China (Grant No. 71271128); the State Key Program of National Natural Science Foundation of China (Grant No. 71331006); NCMIS and Shanghai University of Finance and Economics through Project 211 Phase IV; Shanghai First-class Discipline A, Outstanding PhD Dissertation Cultivation Funds of Shanghai University of Finance and Economics; Graduate Education Innovation Funds of Shanghai University of Finance and Economics (Grant No. CXJJ-2011-438)
Abstract: This article proposes a simple nonparametric estimator of the quantile residual lifetime function under left-truncated and right-censored data. The asymptotic consistency and normality of this estimator are proved and the variance expression is calculated. Two bootstrap procedures are employed in the simulation study, where the latter bootstrap from Zeng and Lin (2008) is 4000 times faster than the former naive one, and the numerical results from both methods show that the estimating approach works well. A real data example is used to illustrate its application.
Funding: supported by National Natural Science Foundation of China (Grant No. 11071120)
Abstract: This paper studies estimation in partial functional linear quantile regression, in which the dependent variable is related to both a vector of finite length and a function-valued random variable as predictor variables. The slope function is estimated using the functional principal component basis. The asymptotic distribution of the estimator of the vector of slope parameters is derived, and the global convergence rate of the quantile estimator of the unknown slope function is established under a suitable norm. It is shown that this rate is optimal in a minimax sense under some smoothness assumptions on the covariance kernel of the covariate and the slope function. The convergence rate of the mean squared prediction error for the proposed estimators is also established. Finite sample properties of the procedures are studied through Monte Carlo simulations. A real data example using the Berkeley growth data illustrates the proposed methodology.
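Quantile regression estimators of the kind described above are generally defined as minimizers of the check (pinball) loss ρ_τ(u) = u(τ − 1{u < 0}). The sketch below is not the paper's functional estimator; it only illustrates, on a hypothetical sample and with hypothetical helper names, that minimizing this loss over a constant recovers a sample τ-quantile.

```python
def check_loss(u, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def total_loss(q, ys, tau):
    """Sum of check losses of the residuals y - q."""
    return sum(check_loss(y - q, tau) for y in ys)


# Hypothetical sample; a minimizer over constants always exists among the data points.
ys = [2.0, 7.0, 1.0, 9.0, 4.0, 6.0, 3.0, 8.0, 5.0]
tau = 0.25
best_q = min(ys, key=lambda q: total_loss(q, ys, tau))
print(best_q, total_loss(best_q, ys, tau))
```

In a regression setting the constant `q` is replaced by a linear (or, as in the abstract, partially functional) predictor, and the same loss is minimized over its coefficients.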
Funding: supported by the National Natural Science Foundation of China under Grant No. 11271368; the Major Program of Beijing Philosophy and Social Science Foundation of China under Grant No. 15ZDA17; the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant No. 20130004110007; the Key Program of National Philosophy and Social Science Foundation under Grant No. 13AZD064; the Fundamental Research Funds for the Central Universities, and the Research Funds of Renmin University of China under Grant No. 15XNL008; the Project of Flying Apsaras Scholar of Lanzhou University of Finance & Economics
Abstract: Heteroscedasticity is a problem that must be faced in regression analysis. When little information is available in advance, detecting heteroscedasticity and estimating statistical models can be even more difficult. To this end, this paper proposes a quantile difference method (QDM) that can effectively estimate the heteroscedastic function. This method, being completely free from the estimation of the mean regression function, is simple, robust and easy to implement. Moreover, the QDM enables the detection of heteroscedasticity without any restrictions on the error terms, and is therefore widely applicable. Notably, estimators of both the mean regression function and the heteroscedastic function can be obtained from the proposed approach. Finally, the authors conduct simulations to examine the performance of the proposed methods and use a real data set as an illustration.
Funding: The research is partially supported by the National Natural Science Foundation of China under Grant No. 10471136; Ph.D. Program Foundation of the Ministry of Education of China; Special Foundations of the Chinese Academy of Sciences and University of Science and Technology of China.
Abstract: This paper is concerned with the estimation of the change-point in a binary response model under the assumption that the conditional median of the error term, given the explanatory variable, is zero. We construct an estimator of the change-point based on the maximum score function and give its exponential convergence rate under some mild conditions.
Funding: supported by National Natural Science Foundation of China (Grant No. 11371354); Key Laboratory of Random Complex Structures and Data Science, Chinese Academy of Sciences (Grant No. 2008DP173182); National Center for Mathematics and Interdisciplinary Sciences, Chinese Academy of Sciences
Abstract: We consider the periodic generalized autoregressive conditional heteroskedasticity (P-GARCH) process and propose a robust estimator based on composite quantile regression. We study some useful properties of the P-GARCH model. Under some mild conditions, we establish the asymptotic results for the proposed estimator. A Monte Carlo simulation is presented to assess the performance of the proposed estimator. Numerical results show that the proposed estimator outperforms other existing methods for heavy-tailed distributions. The proposed methodology is also illustrated by VaR on stock price data.
Abstract: We study a smoothed quantile estimator for a class of stationary processes. We obtain the convergence rates and the Bahadur representation, as well as the asymptotic normality, of this estimator by the method of m-dependent approximation. Our results can be used in the study of the estimation of value-at-risk (VaR) and applied to many time series that have important applications in econometrics.
Funding: supported by the National Natural Science Foundation of China under Grant No. 71271128; the State Key Program of National Natural Science Foundation of China under Grant No. 71331006; NCMIS; Key Laboratory of RCSDS, CAS and IRTSHUFE; PCSIRT (IRT13077); Graduate Innovation Fund of Shanghai University of Finance and Economics under Grant No. CXJJ-2011-429
Abstract: It is of great interest to estimate quantile residual lifetime in medical science and many other fields. In survival analysis, the Kaplan-Meier (K-M) estimator has been widely used to estimate the survival distribution. However, it is well known that the K-M estimator is not continuous, so it cannot always be used to calculate quantile residual lifetime. In this paper, the authors propose a kernel smoothing method to give an estimator of quantile residual lifetime. Using modern empirical process techniques, the consistency and the asymptotic normality of the proposed estimator are established. The authors also present the empirical small-sample performance of the estimator. Deficiency is introduced to compare the performance of the proposed estimator with the naive unsmoothed estimator of the quantile residual lifetime. Further simulation studies indicate that the proposed estimator performs very well.
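For orientation, the naive unsmoothed estimator that the abstract uses as a baseline can be sketched as follows. The sketch assumes right-censored data only and defines the τ-quantile residual lifetime at time t0 as the smallest θ with S(t0 + θ) ≤ (1 − τ)·S(t0), where S is the Kaplan-Meier step function. The function names and the toy data are hypothetical, and the paper's kernel-smoothed estimator is not reproduced here.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct failure time.

    events[i] is 1 for an observed failure and 0 for a right-censored time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv, curve = 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        ties = sum(1 for tt, _ in data if tt == t)
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= ties  # failures and censored subjects both leave the risk set
        i += ties
    return curve


def survival_at(curve, t):
    """Evaluate the right-continuous K-M step function at time t."""
    s = 1.0
    for time, value in curve:
        if time > t:
            break
        s = value
    return s


def quantile_residual_life(curve, t0, tau):
    """Naive estimator: smallest theta with S(t0 + theta) <= (1 - tau) * S(t0)."""
    target = (1.0 - tau) * survival_at(curve, t0)
    for time, value in curve:
        if time > t0 and value <= target:
            return time - t0
    return None  # the quantile is not reached within the observed follow-up


# Hypothetical follow-up times; 0 marks a censored observation.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 0, 1, 1, 0, 1, 1]
curve = kaplan_meier(times, events)
median_residual = quantile_residual_life(curve, t0=2, tau=0.5)
print(curve, median_residual)
```

Because the K-M curve is a step function, this estimate can only take values on the grid of observed failure times, which is the discontinuity that motivates smoothing in the abstract.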