This paper investigates the modified likelihood ratio test (LRT) for homogeneity in normal mixtures of two samples with unknown mixing proportions. It is proved that the limiting distribution of the modified likelihood ratio test statistic is $\chi^2(1)$.
An improved Gaussian mixture model (GMM)-based clustering method is proposed for the difficult case where the true distribution of the data deviates from the assumed GMM. First, an improved model selection criterion, the completed likelihood minimum message length criterion, is derived. It measures both the goodness-of-fit of the candidate GMM to the data and the goodness-of-partition of the data. Second, using the proposed criterion as the clustering objective function, an improved expectation-maximization (EM) algorithm is developed, which can avoid the poor local optima that the standard EM algorithm may reach when estimating the model parameters. The experimental results demonstrate that the proposed method can rectify the over-fitting tendency of representative GMM-based clustering approaches and can robustly provide more accurate clustering results.
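For orientation, the standard EM algorithm that the abstract above uses as its baseline can be sketched as follows. This is a minimal illustration for a one-dimensional, two-component Gaussian mixture. It is not the improved EM algorithm or the completed likelihood minimum message length criterion of the paper, whose details the abstract does not give. The function names, the initialisation at the data extremes, and the variance floor are assumptions made for the example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, n_iter=200, var_floor=1e-6):
    """Standard EM for a 1-D two-component Gaussian mixture (illustrative only).

    Returns (weights, means, variances), each a list of length 2.
    """
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, var_floor)

    # Assumed initialisation: equal weights, means at the data extremes.
    weights = [0.5, 0.5]
    means = [min(data), max(data)]
    variances = [overall_var, overall_var]

    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = p[0] + p[1]
            if total == 0.0:
                resp.append([0.5, 0.5])  # guard against numerical underflow
            else:
                resp.append([p[0] / total, p[1] / total])

        # M-step: responsibility-weighted updates of weights, means and variances.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, var_floor)

    return weights, means, variances


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data from two well-separated normal components.
    sample = [rng.gauss(-2.0, 1.0) for _ in range(300)]
    sample += [rng.gauss(3.0, 1.0) for _ in range(300)]
    print(em_two_gaussians(sample))
```

Because this baseline only climbs the likelihood from its starting point, its result depends on the initialisation, which is the weakness the abstract says the improved algorithm addresses.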
This paper investigates the asymptotic properties of the modified likelihood ratio statistic for testing homogeneity in bivariate normal mixture models with an unknown structural parameter. It is shown that the modified likelihood ratio statistic has a $\chi^2_2$ null limiting distribution.
This paper investigates the asymptotic properties of a modified likelihood ratio statistic for testing homogeneity in bivariate normal mixture models of two samples. The asymptotic null distribution of the modified likelihood ratio statistic is found to be $\chi^2_2$, a chi-squared distribution with 2 degrees of freedom.
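The chi-squared limits reported in the abstracts above (1 and 2 degrees of freedom) make asymptotic p-values easy to compute in closed form. The sketch below shows that conversion only; it does not compute the modified likelihood ratio statistic itself, whose penalty term is not specified in the abstracts. The function name `chi2_sf` and the observed value used at the end are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with 1 or 2 df."""
    if x < 0:
        raise ValueError("statistic must be non-negative")
    if df == 1:
        # A chi-squared(1) variable is the square of a standard normal variable.
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # A chi-squared(2) variable is exponential with mean 2.
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")


# Hypothetical observed value of a modified LRT statistic.
statistic = 4.2
print(chi2_sf(statistic, 1), chi2_sf(statistic, 2))
```

Homogeneity would be rejected at level 0.05 when the returned p-value falls below 0.05, provided the sample is large enough for the limiting distribution to be a good approximation.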
In this paper, we propose a robust mixture regression model based on the skew scale mixtures of normal distributions (RMR-SSMN), which can better accommodate asymmetric, heavy-tailed and contaminated data. For the variable selection problem, we propose a penalized likelihood approach with a new combined penalty function that balances the SCAD and $\ell_2$ penalties. An adjusted EM algorithm is presented to obtain parameter estimates of RMR-SSMN models at a faster convergence rate. As simulations show, our mixture models are more robust than general FMR models, and the new combined penalty function outperforms SCAD for variable selection. Finally, the proposed methodology and algorithm are applied to a real data set and achieve reasonable results.
Joint location and scale models of the skew-normal distribution provide a useful extension of joint mean and variance models of the normal distribution when the data set under consideration involves asymmetric outcomes. This paper focuses on the maximum likelihood estimation of joint location and scale models of the skew-normal distribution. The proposed procedure can simultaneously estimate the parameters in the location model and the scale model. Simulation studies and a real example are used to illustrate the proposed methodologies.
Normal mixture regression models are among the most important statistical tools for analyzing data from a heterogeneous population. Over the last two decades, the skew-normal distribution has been shown to be beneficial for handling asymmetric data in various theoretical and applied problems. In this paper, we propose and study a novel class of models: a skew-normal mixture of joint location, scale and skewness models for analyzing heteroscedastic skew-normal data from a heterogeneous population. The issues of maximum likelihood estimation are addressed. In particular, an Expectation-Maximization (EM) algorithm for estimating the model parameters is developed. Properties of the estimators of the regression coefficients are evaluated through Monte Carlo experiments. Results from the analysis of a real Body Mass Index (BMI) data set are presented.
Generalized linear mixed models (GLMMs) are typically constructed by incorporating random effects into the linear predictor. The random effects are usually assumed to be normally distributed with mean zero and an identity variance-covariance matrix. In this paper, we propose to relax this assumption by allowing non-normal random effects, and we discuss how to model the mean and covariance structures in GLMMs simultaneously. Parameters are estimated with a Quasi-Monte Carlo (QMC) method within an iterative Newton-Raphson (NR) algorithm, which performs very well in terms of accuracy and stability, as demonstrated by an analysis of real binary salamander mating data and by simulation studies.
This paper compares mixture normal distribution, mixed diffusion-jump, and GARCH models of the stock return distribution, based on data from the Chinese stock market. The Schwarz criterion is also used. We find that all of these models partly capture the features of stock returns. The EGARCH model fits daily returns best and is stable across different periods. When weekly and monthly returns are tested, the differences in fit between the models become less clear and unstable.
Medical research data are often skewed and heteroscedastic. It has therefore become common practice to log-transform data in regression analysis in order to stabilize the variance. Regression analysis on log-transformed data estimates the relative effect, whereas it is often the absolute effect of a predictor that is of interest. We propose a maximum likelihood (ML)-based approach to estimate a linear regression model on log-normal, heteroscedastic data. The new method was evaluated in a large simulation study. Log-normal observations were generated according to the simulation models, and parameters were estimated using the new ML method, ordinary least-squares regression (LS) and weighted least-squares regression (WLS). All three methods produced unbiased estimates of the parameters and the expected response, and ML and WLS yielded smaller standard errors than LS. The approximate normality of the Wald statistic, used for tests of the ML estimates, produced correct type I error risk in most situations. Only ML and WLS produced correct confidence intervals for the estimated expected value. ML had the highest power for tests regarding $\beta_1$.
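The distinction between relative and absolute effects in the abstract above can be made concrete with a short worked comparison. The notation below ($a$, $b$, $\sigma^2$, $\beta_0$, a single predictor $x$, constant variance on the log scale) is assumed for illustration and is not taken from the paper, which treats the heteroscedastic case.

```latex
% Regression on the log scale (assumed notation):
\log Y = a + b\,x + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2)
% Mean of the log-normal response on the original scale:
\mathrm{E}[Y \mid x] = \exp\!\left(a + b\,x + \tfrac{1}{2}\sigma^2\right)
% A unit increase in x multiplies the expected response (relative effect):
\frac{\mathrm{E}[Y \mid x + 1]}{\mathrm{E}[Y \mid x]} = e^{b}
% A model that is linear on the original scale gives an additive (absolute) effect:
\mathrm{E}[Y \mid x] = \beta_0 + \beta_1 x
\quad\Longrightarrow\quad
\mathrm{E}[Y \mid x + 1] - \mathrm{E}[Y \mid x] = \beta_1
```

Under these assumptions the log-scale coefficient $b$ describes a multiplicative change, while $\beta_1$ is the change in expected response per unit of the predictor, which is the quantity the abstract's tests concern.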
Funding (modified LRT for homogeneity in normal mixtures of two samples): National Natural Science Foundation of China (10661003); SRF for ROCS, SEM ([2004]527); NSF of Guangxi (0728092).
Funding (improved GMM-based clustering method): National Natural Science Foundation of China (Nos. 61105048, 60972165); Doctoral Fund of Ministry of Education of China (No. 20110092120034); Natural Science Foundation of Jiangsu Province (No. BK2010240); Technology Foundation for Selected Overseas Chinese Scholar, Ministry of Human Resources and Social Security of China (No. 6722000008); Open Fund of Jiangsu Province Key Laboratory for Remote Measuring and Control (No. YCCK201005).
Funding (homogeneity test in bivariate normal mixture models with an unknown structural parameter): National Natural Science Foundation of China (Grant No. 10661003); Natural Science Foundation of Guangxi (Grant No. 0728092); SRF for ROCS, SEM (Grant No. [2004]527).
Funding (homogeneity test in bivariate normal mixture models of two samples): National Natural Science Foundation of China (Grant No. 10661003); SRF for ROCS, SEM (Grant No. [2004]527); Natural Science Foundation of Guangxi (Grant No. 0728092).
Funding (joint location and scale models of the skew-normal distribution): National Natural Science Foundation of China (11261025, 11201412); Natural Science Foundation of Yunnan Province (2011FB016); Program for Middle-aged Backbone Teacher, Yunnan University.
Funding (skew-normal mixture of joint location, scale and skewness models): National Natural Science Foundation of China (11261025, 11561075); Natural Science Foundation of Yunnan Province (2016FB005); Program for Middle-aged Backbone Teacher, Yunnan University.