Iteration methods for computing the maximum likelihood estimator, and their convergence, are discussed in this paper. We study the Gauss-Newton method and give a set of sufficient conditions for convergence and asymptotic numerical stability. The modified Gauss-Newton method is also studied, and sufficient conditions for its convergence are presented. Two numerical examples are given to illustrate our results.
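As a hedged illustration (our own sketch, not the paper's exact scheme), the Gauss-Newton iteration below fits a one-parameter exponential model; under i.i.d. Gaussian noise, minimizing the residual sum of squares coincides with maximizing the likelihood, so this is one concrete instance of a Gauss-Newton MLE iteration. The model and data are invented for the example.

```python
import numpy as np

def gauss_newton(theta, x, y, n_iter=100):
    """Gauss-Newton iteration for the nonlinear model y ~ exp(theta * x)."""
    for _ in range(n_iter):
        f = np.exp(theta * x)            # current model values
        r = y - f                        # residuals
        J = (x * f).reshape(-1, 1)       # Jacobian of f with respect to theta
        # Normal-equation step: solve (J^T J) d = J^T r
        d = np.linalg.solve(J.T @ J, J.T @ r)
        theta += float(d[0])
    return theta

x = np.linspace(0.0, 1.0, 20)
y = np.exp(0.7 * x)                      # noise-free data, true theta = 0.7
theta_hat = gauss_newton(0.1, x, y)
```

On noise-free data the iteration converges to the true parameter; with noisy data it converges to the least-squares (ML) estimate instead.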
This paper deals with the consistency and strong consistency of the maximum likelihood estimators of the mean and variance of drift fractional Brownian motions observed at discrete time instants. Both the central limit theorem and Berry-Esséen bounds for these estimators are obtained using Stein's method via Malliavin calculus.
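A minimal sketch of the setting (our own illustration, not the paper's estimator): for X_t = mu*t + B^H_t observed at discrete times with known Hurst index H, the ML estimate of the drift mu is the generalized least squares estimate built from the fBm covariance. All names below are our own choices.

```python
import numpy as np

def fbm_cov(t, H, sigma=1.0):
    """Covariance of fractional Brownian motion B^H at times t:
    Cov(B_s, B_u) = (sigma^2 / 2) * (s^{2H} + u^{2H} - |s - u|^{2H})."""
    t = np.asarray(t, dtype=float)
    s, u = np.meshgrid(t, t)
    return 0.5 * sigma**2 * (s**(2 * H) + u**(2 * H) - np.abs(s - u)**(2 * H))

def drift_mle(t, x, H):
    """ML (generalized least squares) estimate of mu in X_t = mu*t + B^H_t."""
    G = fbm_cov(t, H)
    w = np.linalg.solve(G, t)            # G^{-1} t
    return float(w @ x) / float(w @ t)

t = np.linspace(0.1, 1.0, 10)            # discrete observation times (t > 0)
x_obs = 1.5 * t                          # noise-free path with true drift 1.5
mu_hat = drift_mle(t, x_obs, H=0.7)
```

On a noise-free path the estimator recovers the drift exactly; the paper's asymptotic results concern its behavior under the random fBm component.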
By taking a subsequence of the input-output sequence of a system polluted by white noise, an independent observation sequence and its probability density are obtained, and a maximum likelihood estimate of the identification parameters is then given. To decrease the asymptotic error, a corrector of the maximum likelihood (CML) estimate, together with its recursive algorithm, is given. It is proved that the corrector has a smaller asymptotic error than the least squares method. A simulation example shows that the corrected maximum likelihood estimate approximates the true parameters more precisely than the least squares method.
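A toy illustration of the subsequence idea (our own, not the paper's system model): observations of an autocorrelated sequence taken with a sufficiently large stride are approximately independent, so standard MLE applies to the thinned subsequence. The AR(1) process and stride below are assumptions for the example.

```python
import numpy as np

def lag1_corr(z):
    """Empirical lag-1 autocorrelation of a sequence."""
    return float(np.corrcoef(z[:-1], z[1:])[0, 1])

rng = np.random.default_rng(42)
eps = rng.normal(size=20000)
x = np.empty_like(eps)
x[0] = eps[0]
for i in range(1, len(eps)):
    x[i] = 0.9 * x[i - 1] + eps[i]       # AR(1): lag-1 correlation near 0.9

thinned = x[::10]                        # every 10th value: correlation ~0.9**10
```

The thinned subsequence's lag-1 correlation drops to roughly 0.9**10 = 0.35, much closer to independence than the original series.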
Compositional data, such as relative information, is a crucial aspect of machine learning and other related fields. It is typically recorded as closed data, i.e., data that sums to a constant such as 100%. The linear regression model is the most widely used technique for identifying relationships between underlying random variables of interest, and maximum likelihood estimation (MLE) is the method of choice for estimating its parameters, which are useful for prediction and for analyzing the partial effects of the independent variables. However, data quality is a significant challenge in machine learning, especially when observations are missing, and recovering them can be costly and time-consuming. To address this issue, the expectation-maximization (EM) algorithm has been suggested for situations involving missing data. The EM algorithm iteratively finds maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models that depend on unobserved variables. Using the current parameter estimate, the expectation (E) step constructs the expected log-likelihood function; the maximization (M) step then finds the parameters that maximize the expected log-likelihood determined in the E step. This study examined how well the EM algorithm performed on a simulated compositional dataset with missing observations, using both robust and ordinary least squares regression. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-Nearest Neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
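A minimal EM sketch for linear regression when some responses are missing at random (our own illustration of the algorithm's shape, not the paper's compositional-data setup): the E-step replaces each missing response by its expectation under the current fit, and the M-step refits by ordinary least squares on the completed data.

```python
import numpy as np

def em_regression(X, y, missing, n_iter=200):
    """EM for OLS regression with missing responses (missing at random)."""
    y = y.copy()
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        y[missing] = X[missing] @ beta                  # E-step: impute expectations
        beta = np.linalg.lstsq(X, y, rcond=None)[0]     # M-step: refit by OLS
    return beta

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
y = X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=50)
missing = np.zeros(50, dtype=bool)
missing[:10] = True                      # pretend the first 10 responses are lost
beta_hat = em_regression(X, y, missing)
```

For missing responses this EM converges to the OLS fit on the observed rows; the interesting cases in practice (as in the study) involve missing covariates or compositional constraints, where the E-step is less trivial.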
Following the principle that "failure data is the basis of software reliability analysis," we built a software reliability expert system (SRES) using artificial intelligence technology. By reasoning from the fitting results on the failure data of a software project, the SRES can recommend the most suitable model to users as a software reliability measurement model. We believe the SRES can largely overcome the inconsistency seen in applications of software reliability models. We report investigation results on the singularity and parameter estimation methods of the experimental models in the SRES.
In the constant-stress accelerated life test, estimation issues are discussed for a generalized half-normal distribution under a log-linear life-stress model. The maximum likelihood estimates, with a corresponding fixed-point iterative algorithm for the unknown parameters, are presented, and least squares estimates of the parameters are also proposed. Confidence intervals for the model parameters are constructed using asymptotic theory and the bootstrap technique. A numerical illustration is given to investigate the performance of our methods.
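A hedged sketch of the parametric bootstrap confidence interval recipe. For brevity we use an exponential lifetime model (rate lambda, MLE 1/mean) rather than the paper's generalized half-normal; the bootstrap steps themselves carry over: estimate, resample from the fitted model, re-estimate, and take quantiles.

```python
import numpy as np

def bootstrap_ci(sample, B=2000, alpha=0.05, seed=1):
    """Percentile parametric-bootstrap CI for an exponential rate."""
    rng = np.random.default_rng(seed)
    lam_hat = 1.0 / sample.mean()                  # MLE of the rate
    boot = np.empty(B)
    for b in range(B):
        resample = rng.exponential(1.0 / lam_hat, size=sample.size)
        boot[b] = 1.0 / resample.mean()            # re-estimate on each resample
    lo, hi = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return lam_hat, lo, hi

rng = np.random.default_rng(0)
data = rng.exponential(scale=0.5, size=200)        # true rate = 2.0
lam_hat, lo, hi = bootstrap_ci(data)
```

The percentile interval brackets the point estimate; for the generalized half-normal one would substitute its MLE (via the fixed-point iteration) and its sampler in the two marked lines.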
Estimation of the unknown mean μ and variance σ² of a univariate Gaussian distribution, given a single study variable x, is considered. We propose an approach that does not require initialization of the unknown distribution parameters. The approach is motivated by linearizing the Gaussian distribution through differential techniques and estimating μ and σ² as regression coefficients using the ordinary least squares method. Two simulated datasets, on hereditary traits and on morphometric analysis of housefly strains, are used to evaluate the proposed method (PM), maximum likelihood estimation (MLE), and the method of moments (MM). The methods are evaluated by re-estimating the required Gaussian parameters on both large and small samples. The root mean squared error (RMSE), mean error (ME), and standard deviation (SD) are used to assess the accuracy of the PM and MLE; confidence intervals (CIs) are also constructed for the ME estimate. The PM compares well with both the MLE and MM approaches, as all three produce estimates whose errors have good asymptotic properties, and small CIs are observed for the ME under the PM and MLE. The PM can be used symbiotically with the MLE to provide initial approximations at the expectation-maximization step.
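One way to realize the "linearize, then regress" idea (our own illustration; the paper's exact differential construction may differ): the log of a Gaussian density is quadratic in x, so fitting log f(x) = c0 + c1*x + c2*x² by OLS recovers σ² = -1/(2*c2) and μ = c1*σ². Here we fit exact density values; with data one would use a histogram or smoothed density estimate.

```python
import numpy as np

def gaussian_from_logdensity(x, log_f):
    """Recover (mu, sigma^2) from a quadratic OLS fit to log-density values."""
    A = np.column_stack([np.ones_like(x), x, x**2])
    c0, c1, c2 = np.linalg.lstsq(A, log_f, rcond=None)[0]
    sigma2 = -1.0 / (2.0 * c2)
    mu = c1 * sigma2
    return mu, sigma2

mu_true, sigma2_true = 1.5, 4.0
x = np.linspace(-5.0, 8.0, 60)
log_f = -0.5 * np.log(2 * np.pi * sigma2_true) - (x - mu_true) ** 2 / (2 * sigma2_true)
mu_hat, sigma2_hat = gaussian_from_logdensity(x, log_f)
```

No starting values are needed, which matches the abstract's point that the estimates can seed an EM or MLE routine.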
In this paper, a modified form of the traditional inverse Lomax distribution is proposed and its characteristics are studied. The new distribution, called the modified logarithmic transformed inverse Lomax distribution, is generated by adding a new shape parameter via the logarithmic transformed method. It contains two shape parameters and one scale parameter and exhibits various shapes of the probability density and hazard rate functions. The new shape parameter increases the flexibility of the statistical properties of the traditional inverse Lomax distribution, including the mean, variance, skewness, and kurtosis. The moments, entropies, order statistics, and other properties are discussed. Six estimation methods are considered for the distribution parameters, and a simulation study is performed to compare their performance. To show the flexibility and applicability of the proposed distribution, two real data sets from the engineering and medical fields are analyzed. The simulation results and real data analysis showed that the Anderson-Darling estimates have the smallest mean square errors among all estimates considered. The analysis also showed that the traditional inverse Lomax distribution and some of its generalizations fall short in modeling these engineering and medical data, whereas our proposed distribution overcomes this shortcoming and provides a good fit, making it a suitable choice for modeling such data sets.
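For reference, the baseline inverse Lomax distribution (shape alpha, scale beta) has cdf F(x) = (1 + beta/x)^(-alpha) for x > 0. The sketch below gives its cdf and quantile function, the ingredients needed for the kind of simulation study the paper performs; the modified logarithmic transformed extension adds a further shape parameter and is not reproduced here.

```python
def invlomax_cdf(x, alpha, beta):
    """Inverse Lomax cdf: F(x) = (1 + beta/x)^(-alpha), x > 0."""
    return (1.0 + beta / x) ** (-alpha)

def invlomax_ppf(u, alpha, beta):
    """Quantile function by inverting the cdf (for inverse-transform sampling)."""
    return beta / (u ** (-1.0 / alpha) - 1.0)

# Median of the distribution with alpha = 2, beta = 1.5
median = invlomax_ppf(0.5, 2.0, 1.5)
```

Feeding uniform random numbers through `invlomax_ppf` generates inverse Lomax samples for Monte Carlo comparison of estimators.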
In this paper, parameter estimation for the generalized Rayleigh distribution is investigated under progressively type-I interval censored samples. Estimators of the distribution parameters via maximum likelihood, the method of moments, and probability plotting are derived, and their performance is compared through simulation in terms of mean squared error and bias. A case application to plasma cell myeloma data illustrates the proposed estimation methods.
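A minimal illustration of the likelihood construction (our own sketch, without the progressive-removal terms): under interval censoring only the counts n_i of failures in each inspection interval (t_{i-1}, t_i] are observed, so the log-likelihood is sum_i n_i * log(F(t_i) - F(t_{i-1})), with F the generalized Rayleigh (Burr type X) cdf.

```python
import numpy as np

def gr_cdf(x, alpha, lam):
    """Generalized Rayleigh (Burr X) cdf: F(x) = (1 - exp(-(lam*x)^2))^alpha."""
    return (1.0 - np.exp(-(lam * np.asarray(x, dtype=float)) ** 2)) ** alpha

def interval_loglik(counts, cut_points, alpha, lam):
    """Interval-censored log-likelihood over (0, t_1], ..., (t_k, inf)."""
    F = gr_cdf(cut_points, alpha, lam)
    p = np.diff(np.concatenate([[0.0], F, [1.0]]))   # interval probabilities
    return float(np.sum(np.asarray(counts) * np.log(p)))

cuts = [0.5, 1.0, 1.5]
p_true = np.diff(np.concatenate([[0.0], gr_cdf(cuts, 2.0, 1.0), [1.0]]))
counts = 1000.0 * p_true                 # idealized expected counts at alpha=2, lam=1
```

With idealized expected counts the log-likelihood is maximized at the generating parameters, which is the basis of the MLE the paper derives.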
In this paper, we present a new approach (Kalman filter smoothing) to estimate and forecast survival of diabetic and non-diabetic coronary artery bypass graft surgery (CABG) patients. Survival proportions of the patients are obtained from a lifetime-representing parametric model (Weibull distribution with the Kalman filter approach). Moreover, an approach of constructing the complete population (CP) from its incomplete population (IP) of patients, with 12 years of observations/follow-up, is used for their survival analysis [1]. The survival proportions of the CP obtained from the Kaplan-Meier method are used as observed values y_t at time t (input) for the Kalman filter smoothing process to update time-varying parameters. In the case of the CP, the term representing censored observations may be dropped from the likelihood function of the distribution. The maximum likelihood method, in conjunction with the Davidon-Fletcher-Powell (DFP) optimization method [2] and the cubic interpolation method, is used to estimate the survival proportions. The estimated and forecasted survival proportions of the CP of diabetic and non-diabetic CABG patients from the Kalman filter smoothing approach are presented in terms of statistics, survival curves, discussion, and conclusions.
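The Kaplan-Meier survival proportions that feed the smoother as the observed series y_t can be sketched as follows (our own minimal product-limit implementation on made-up inputs; `events=1` marks a death, `0` a censoring):

```python
import numpy as np

def kaplan_meier(times, events):
    """Product-limit estimate of S(t) at each distinct death time."""
    order = np.argsort(times)
    times = np.asarray(times, dtype=float)[order]
    events = np.asarray(events, dtype=int)[order]
    n_at_risk = len(times)
    S, out_t, out_S = 1.0, [], []
    for t in np.unique(times):
        at_t = times == t
        d = int(events[at_t].sum())       # deaths at this time
        if d > 0:
            S *= 1.0 - d / n_at_risk
            out_t.append(float(t))
            out_S.append(S)
        n_at_risk -= int(at_t.sum())      # everyone observed at t leaves the risk set
    return out_t, out_S

t_km, S_km = kaplan_meier([1, 2, 2, 3], [1, 1, 1, 1])
```

For a complete population (no censoring, as in the CP construction above) the product-limit estimate reduces to simple survival proportions.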
Background: The Poisson and negative binomial distributions are commonly used to model count data. The Poisson is characterized by the equality of mean and variance, whereas the negative binomial has a variance larger than the mean; the latter is therefore appropriate for modeling over-dispersed count data. Objectives: A new two-parameter probability distribution called the quasi-negative binomial distribution (QNBD), generalizing the well-known negative binomial distribution, is studied in this paper. This model turns out to be quite flexible for analyzing count data. Our main objectives are to estimate the parameters of the proposed distribution and to discuss its applicability to genetics data. As an application, we demonstrate how the QNBD regression representation can be used to model genomics data sets. Results: The new distribution is shown to provide a good fit with respect to the Akaike Information Criterion (AIC), a measure of model goodness of fit. The proposed distribution may serve as a viable alternative to other distributions available in the literature for modeling count data exhibiting overdispersion, which arises in various fields of scientific investigation such as genomics and biomedicine.
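A quick overdispersion check plus a method-of-moments fit of the standard negative binomial baseline that the QNBD generalizes (the QNBD itself is the paper's contribution and is not reproduced here). For sample mean m and variance v > m, the NB parameters in the usual (size r, success probability p) convention, with mean r(1-p)/p, are r = m²/(v - m) and p = m/v.

```python
import numpy as np

def nb_moment_fit(counts):
    """Method-of-moments negative binomial fit; raises if not over-dispersed."""
    m = counts.mean()
    v = counts.var(ddof=1)
    if v <= m:
        raise ValueError("no overdispersion: Poisson may be adequate")
    r = m * m / (v - m)                  # size parameter
    p = m / v                            # success probability
    return r, p

counts = np.array([0, 0, 1, 1, 2, 3, 5, 8], dtype=float)
r_hat, p_hat = nb_moment_fit(counts)
```

These moment estimates are common starting values for a full ML fit of the NB (or, by extension, QNBD) regression model.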
Funding (for the paper on drift fractional Brownian motions): supported by the National Science Foundation (DMS0504783, DMS0604207) and the National Science Fund for Distinguished Young Scholars of China (70825005).
Funding (for the paper on the software reliability expert system): supported by the National Natural Science Foundation of China.
Funding (for the paper on the constant-stress accelerated life test): supported by the National Natural Science Foundation of China (11501433, 71473187) and the Natural Science Basic Research Plan in Shaanxi Province of China (2016JQ1014).
Funding (for the paper on the modified logarithmic transformed inverse Lomax distribution): this project was funded by the Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah, under Grant No. RG-14-130-41. The author acknowledges DSR with thanks for technical and financial support.