Abstract: By exponentiating each component of a finite mixture of two exponential components by a positive parameter, several shapes of the hazard rate function are obtained. Maximum likelihood and Bayes methods, based on the squared error loss function and an objective prior, are used to obtain estimators based on the balanced squared error loss function for the parameters, survival function, and hazard rate function of a mixture of two exponentiated exponential components. Approximate interval estimators of the model parameters are also obtained.
Abstract: By taking a subsequence out of the input-output sequence of a system polluted by white noise, an independent observation sequence and its probability density are obtained, and a maximum likelihood estimate of the identification parameters is then given. To decrease the asymptotic error, a corrector of maximum likelihood (CML) estimation is given together with its recursive algorithm. The corrector is proved to have a smaller asymptotic error than the least squares method. A simulation example shows that the CML estimate approximates the true parameters more precisely than the least squares method.
Funding: Supported by the National Natural Science Foundation of China (11501433, 71473187) and the Natural Science Basic Research Plan in Shaanxi Province of China (2016JQ1014).
Abstract: Estimation for a generalized half-normal distribution is discussed for the constant-stress accelerated life test under a log-linear life-stress model. Maximum likelihood estimates of the unknown parameters are presented together with a fixed-point iterative algorithm for computing them, and least squares estimates of the parameters are also proposed. Confidence intervals for the model parameters are constructed using asymptotic theory and the bootstrap. A numerical illustration investigates the performance of the methods.
Abstract: In the presence of multicollinearity in logistic regression, the variance of the Maximum Likelihood Estimator (MLE) becomes inflated. Siray et al. (2015) [1] proposed a restricted Liu estimator for the logistic regression model with exact linear restrictions. However, in some situations the linear restrictions are stochastic. In this paper, we propose a Stochastic Restricted Maximum Likelihood Estimator (SRMLE) for the logistic regression model with stochastic linear restrictions to address this case. A Monte Carlo simulation compares the performance of the MLE, the Restricted Maximum Likelihood Estimator (RMLE), the Ridge Type Logistic Estimator (LRE), the Liu Type Logistic Estimator (LLE), and the SRMLE for the logistic regression model using the Scalar Mean Squared Error (SMSE).
Abstract: In reliability analysis, the stress-strength model is often used to describe the life of a component which has a random strength (X) and is subjected to a random stress (Y). In this paper, we consider the problem of estimating the reliability R = P[Y < X] when stress and strength are independent and both follow the exponentiated Pareto distribution. The maximum likelihood estimator of the stress-strength reliability is calculated under simple random sampling, ranked set sampling, and median ranked set sampling. Four different reliability estimators under median ranked set sampling are derived. Two estimators are obtained when both strength and stress have an odd or an even set size. The other two are obtained when the strength has an odd set size and the stress has an even set size, and vice versa. The performance of the suggested estimators is compared with that of their competitors under simple random sampling via a simulation study. The simulation study revealed that the stress-strength reliability estimates based on ranked set sampling and median ranked set sampling are more efficient than their competitors based on simple random sampling. In general, the stress-strength reliability estimates based on median ranked set sampling are smaller than the corresponding estimates under ranked set sampling and simple random sampling. Keywords: stress-strength model, ranked set sampling, median ranked set sampling, maximum likelihood estimation, mean square error.
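As a worked illustration of the quantity R = P[Y < X], the following sketch assumes the usual exponentiated Pareto distribution function F(x) = [1 - (1 + x)^{-λ}]^θ and, as a simplifying assumption not stated in the abstract, a common λ for stress and strength. The symbols α, β, and λ are introduced here only for illustration.

```latex
% Assumption: X ~ EP(\alpha, \lambda) and Y ~ EP(\beta, \lambda) are independent with a common \lambda.
\[
F_X(x) = \bigl[1 - (1+x)^{-\lambda}\bigr]^{\alpha}, \qquad
F_Y(x) = \bigl[1 - (1+x)^{-\lambda}\bigr]^{\beta}, \qquad x > 0 .
\]
% Substituting u = 1 - (1+x)^{-\lambda}, so that dF_X = \alpha u^{\alpha-1}\,du:
\[
R = P[Y < X] = \int_0^{\infty} F_Y(x)\, dF_X(x)
  = \int_0^1 u^{\beta}\, \alpha u^{\alpha-1}\, du
  = \frac{\alpha}{\alpha+\beta} .
\]
```

Under this assumption R does not depend on λ, and by the invariance property of maximum likelihood, plugging the maximum likelihood estimates of α and β into α/(α + β) gives the maximum likelihood estimate of R.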
Abstract: Estimation of the unknown mean μ and variance σ² of a univariate Gaussian distribution given a single study variable x is considered. We propose an approach that does not require initialization of the unknown distribution parameters. The approach is motivated by linearizing the Gaussian distribution through differential techniques and estimating μ and σ² as regression coefficients using the ordinary least squares method. Two simulated datasets, on hereditary traits and on morphometric analysis of housefly strains, are used to evaluate the proposed method (PM), maximum likelihood estimation (MLE), and the method of moments (MM). The methods are evaluated by re-estimating the required Gaussian parameters on both large and small samples. The root mean squared error (RMSE), mean error (ME), and standard deviation (SD) are used to assess the accuracy of the PM and MLE; confidence intervals (CIs) are also constructed for the ME estimate. The PM compares well with both the MLE and MM approaches, as all three produce estimates whose errors have good asymptotic properties. Small CIs are also observed for the ME using the PM and MLE. The PM can be used together with the MLE to provide initial approximations at the expectation-maximization step.
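One way a Gaussian density can be linearized by differentiation is sketched below. The abstract does not give the authors' exact construction, so this is only an illustration of the general idea; the intercept a and slope b are names introduced here.

```latex
\[
f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}
       \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)
\;\Longrightarrow\;
\frac{d}{dx}\log f(x) = \frac{f'(x)}{f(x)}
  = \frac{\mu}{\sigma^{2}} - \frac{1}{\sigma^{2}}\,x .
\]
% The right-hand side is a straight line a + b x with
\[
a = \frac{\mu}{\sigma^{2}}, \qquad b = -\frac{1}{\sigma^{2}}
\;\Longrightarrow\;
\sigma^{2} = -\frac{1}{b}, \qquad \mu = -\frac{a}{b} .
\]
```

Regressing an estimate of f'(x)/f(x) on x by ordinary least squares would then recover μ and σ² from the fitted coefficients, with no starting values required.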
Funding: The NSF of China (11271155) and the Research Fund for the Doctoral Program of Higher Education (20070183023).
Abstract: In this paper, we propose a log-normal linear model whose errors are first-order correlated, and suggest a two-stage method for efficient estimation of the conditional mean of the response variable on the original scale. We obtain two estimators, which minimize the asymptotic mean squared error (MM) and the asymptotic bias (MB), respectively. Both estimators are easy to implement, and simulation studies show that they perform better.
Funding: Supported by the National High Technology Research and Development Programme of China (No. 2009AA011501), the National Basic Research Program of China (No. 2007CB310608), the Fundamental Research Funds for the Central Universities in China, and a China Postdoctoral Science Foundation funded project.
Abstract: In this paper, estimators of the scale parameter of the exponential distribution obtained by four methods, using complete data, are critically examined and compared. The methods are the Maximum Likelihood Estimator (MLE) and Bayes estimators under the Square-Error Loss Function (BSE), the Entropy Loss Function (BEN), and the Composite LINEX Loss Function (BCL). The performance of the four methods was compared on three criteria: the Mean Square Error (MSE), the Akaike Information Criterion (AIC), and the Bayesian Information Criterion (BIC). Using Monte Carlo simulation based on relevant samples, the comparisons suggest that the Bayesian method is better than the maximum likelihood estimator, in that it yields the smallest values of MSE, AIC, and BIC. Confidence intervals were then assessed by comparing the 95% CIs and their average lengths (AL) for all estimation methods, showing that the Bayesian methods again perform best, producing the smallest ALs.
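The kind of Monte Carlo MSE comparison described above can be sketched as follows. This is a minimal illustration, not the study's code: the abstract does not state the prior, so the inverse-gamma prior and its hyperparameters `a` and `b` are assumptions, as are all function names, the true scale `theta`, and the sample size.

```python
import random


def exponential_scale_mle(sample):
    """MLE of the exponential scale (mean) parameter: the sample mean."""
    return sum(sample) / len(sample)


def bayes_estimate_squared_error(sample, a=2.0, b=2.0):
    """Posterior mean of the scale under an ASSUMED InverseGamma(a, b) prior.

    The posterior is InverseGamma(a + n, b + sum(sample)); its mean is the
    Bayes estimator under squared-error loss.
    """
    return (b + sum(sample)) / (a + len(sample) - 1.0)


def monte_carlo_mse(estimator, theta, n, replications, seed=0):
    """Average squared error of `estimator` over simulated samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replications):
        # random.expovariate takes the rate, i.e. 1 / scale.
        sample = [rng.expovariate(1.0 / theta) for _ in range(n)]
        total += (estimator(sample) - theta) ** 2
    return total / replications


theta, n = 2.0, 25  # hypothetical true scale and sample size
mse_mle = monte_carlo_mse(exponential_scale_mle, theta, n, 20000)
mse_bayes = monte_carlo_mse(bayes_estimate_squared_error, theta, n, 20000)
print(f"MLE   MSE: {mse_mle:.4f} (theory: theta^2 / n = {theta ** 2 / n:.4f})")
print(f"Bayes MSE: {mse_bayes:.4f}")
```

Both estimators are evaluated on the same simulated samples because the same seed is used. How the two compare depends on how well the assumed prior matches the true scale.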
Abstract: Compositional data, which carry relative information, are a crucial aspect of machine learning and related fields. Such data are typically recorded as closed data, that is, data that sum to a constant such as 100%. The statistical linear model is the most widely used technique for identifying hidden relationships between underlying random variables of interest. However, data quality is a significant challenge in machine learning, especially when missing data are present. When estimating linear regression parameters, which are useful for tasks such as future prediction and partial effects analysis of independent variables, maximum likelihood estimation (MLE) is the method of choice. However, many datasets contain missing observations, which can lead to costly and time-consuming data recovery. To address this issue, the expectation-maximization (EM) algorithm has been suggested for situations involving missing data. The EM algorithm iteratively finds maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models that depend on unobserved variables or data. Using the current estimate as input, the expectation (E) step constructs the expected log-likelihood function. The maximization (M) step then finds the parameters that maximize the expected log-likelihood determined in the E step. This study examined how well the EM algorithm performed on a simulated compositional dataset with missing observations, using both robust least squares and ordinary least squares regression. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-Nearest Neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
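The E and M steps described above can be illustrated on a much simpler problem than the compositional setting of the study: a bivariate normal sample in which x is fully observed and some y values are missing at random. This is a toy sketch under those assumptions; the function name, the simulated data, and the 30% missingness rate are all hypothetical.

```python
import random


def em_bivariate_normal(x, y, iterations=200):
    """EM for a bivariate normal where some y values are None (missing)."""
    n = len(x)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    # x is fully observed, so its maximum likelihood estimates are direct.
    mu_x = sum(x) / n
    s_xx = sum((xi - mu_x) ** 2 for xi in x) / n
    # Initialise the y-related parameters from the complete cases.
    mu_y = sum(yi for _, yi in observed) / len(observed)
    s_yy = sum((yi - mu_y) ** 2 for _, yi in observed) / len(observed)
    s_xy = 0.0
    for _ in range(iterations):
        # E step: expected sufficient statistics given the current estimates.
        slope = s_xy / s_xx
        residual_var = s_yy - s_xy ** 2 / s_xx
        sum_y = sum_yy = sum_xy = 0.0
        for xi, yi in zip(x, y):
            if yi is None:
                e_y = mu_y + slope * (xi - mu_x)   # E[y | x]
                e_yy = e_y ** 2 + residual_var     # E[y^2 | x]
            else:
                e_y, e_yy = yi, yi ** 2
            sum_y += e_y
            sum_yy += e_yy
            sum_xy += xi * e_y
        # M step: maximise the expected log-likelihood in closed form.
        mu_y = sum_y / n
        s_yy = sum_yy / n - mu_y ** 2
        s_xy = sum_xy / n - mu_x * mu_y
    return mu_x, mu_y, s_xx, s_xy, s_yy


rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(300)]
y_full = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]
# Delete roughly 30% of the y values completely at random.
y = [yi if rng.random() > 0.3 else None for yi in y_full]

mu_x, mu_y, s_xx, s_xy, s_yy = em_bivariate_normal(x, y)
print(f"mu_y = {mu_y:.3f}, slope of y on x = {s_xy / s_xx:.3f}")
```

For this missingness pattern the EM fixed point has a known form: the implied slope of y on x equals the least squares slope computed from the complete cases, while the mean and variance of x use all observations.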
Funding: The work was supported by the Canadian Forest Service.
Abstract: A Norton-Rice distribution (NRD) is a versatile, flexible distribution for k ordered distances from a random location to the k nearest objects. In a context of plotless density estimation (PDE) with n randomly chosen sample locations, and distances measured to the k = 6 nearest objects, the NRD provided a good fit to distance data from seven populations with a census of forest tree stem locations. More importantly, the three parameters of an NRD followed a simple trend with the order (1, …, 6) of observed distances. The trend is quantified and exploited in a proposed new PDE through joint maximum likelihood estimation of the NRD parameters expressed as functions of distance order. In simulated probability sampling from the seven populations, the proposed PDE had the lowest overall bias, with good performance potential when compared to three alternative PDEs. However, absolute bias increased by 0.8 percentage points when sample size decreased from 20 to 10. In terms of root mean squared error (RMSE), the new proposed estimator was on par with an estimator published in Ecology when this study was wrapping up, but otherwise superior to the remaining two investigated PDEs. Coverage of nominal 95% confidence intervals averaged 0.94 for the new proposed estimators and 0.90, 0.96, and 0.90 for the comparison PDEs. Despite tangible improvements in PDEs over the last decades, a globally least biased PDE remains elusive.
Abstract: Medical research data are often skewed and heteroscedastic. It has therefore become common practice to log-transform data in regression analysis in order to stabilize the variance. Regression analysis on log-transformed data estimates the relative effect, whereas it is often the absolute effect of a predictor that is of interest. We propose a maximum likelihood (ML)-based approach to estimate a linear regression model on log-normal, heteroscedastic data. The new method was evaluated in a large simulation study. Log-normal observations were generated according to the simulation models, and parameters were estimated using the new ML method, ordinary least-squares regression (LS), and weighted least-squares regression (WLS). All three methods produced unbiased estimates of parameters and expected response, and ML and WLS yielded smaller standard errors than LS. The approximate normality of the Wald statistic, used for tests of the ML estimates, produced correct type I error risk in most situations. Only ML and WLS produced correct confidence intervals for the estimated expected value. ML had the highest power for tests regarding β1.
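The distinction between relative and absolute effects can be made concrete with the standard homoscedastic log-linear model. This is a textbook illustration rather than the authors' proposed model, and the coefficients γ0 and γ1 are symbols introduced here.

```latex
% Regression on the log scale (assumed: \varepsilon \sim N(0, \sigma^2)):
\[
\log Y = \gamma_0 + \gamma_1 x + \varepsilon
\;\Longrightarrow\;
E[Y \mid x] = \exp\!\left(\gamma_0 + \gamma_1 x + \tfrac{\sigma^{2}}{2}\right),
\qquad
\frac{E[Y \mid x+1]}{E[Y \mid x]} = e^{\gamma_1} .
\]
% Linear model for the mean on the original scale:
\[
E[Y \mid x] = \beta_0 + \beta_1 x
\;\Longrightarrow\;
E[Y \mid x+1] - E[Y \mid x] = \beta_1 .
\]
```

In the first model a unit change in x multiplies the expected response by a constant factor (a relative effect), while in the second it adds the constant β1 (an absolute effect), which is the quantity the abstract argues is often of primary interest.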