We study the quasi-likelihood equation in generalized linear models (GLMs) with adaptive design, $\sum_{i=1}^{n} x_i\,(y_i - h(x_i'\beta)) = 0$, where $y_i$ is a $q$-vector and $x_i$ is a $p \times q$ random matrix. Under some assumptions, it is shown that the quasi-likelihood equation for the GLM has a solution that is asymptotically normal.
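As a rough illustration of what solving such an equation involves, the sketch below finds the root of the scalar version of the quasi-likelihood equation by Newton-Raphson. It assumes $p = q = 1$ and the mean function $h = \exp$ (a Poisson-type choice); neither assumption comes from the abstract, and the data and the names `solve_quasi_score` and `quasi_score` are hypothetical.

```python
import math


def quasi_score(x, y, beta):
    """Left-hand side of the equation: sum_i x_i * (y_i - h(x_i * beta)), h = exp."""
    return sum(xi * (yi - math.exp(xi * beta)) for xi, yi in zip(x, y))


def solve_quasi_score(x, y, beta0=0.0, tol=1e-10, max_iter=100):
    """Newton-Raphson root of the scalar quasi-likelihood equation (assumes h = exp)."""
    beta = beta0
    for _ in range(max_iter):
        mu = [math.exp(xi * beta) for xi in x]
        score = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Derivative of the score with respect to beta; negative for h = exp.
        slope = -sum(xi * xi * mi for xi, mi in zip(x, mu))
        step = score / slope
        beta -= step
        if abs(step) < tol:
            break
    return beta


# Hypothetical toy data, for illustration only.
x = [0.5, 1.0, 1.5, 2.0]
y = [1.0, 2.0, 2.0, 4.0]
beta_hat = solve_quasi_score(x, y)
```

The asymptotic normality result in the abstract concerns the behaviour of such a root as $n$ grows; the sketch only shows how a root can be computed for a fixed sample.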
Objective: To analyze longitudinal binary data using generalized linear models, taking into account the correlation between repeated measures, and to give a general method for analyzing such data. Methods: The generalized estimating equations (GEE) proposed by Zeger and Liang were used. For seven covariance structures, one method was given for estimating the regression and correlation parameters. Results: Regression and correlation parameters were estimated simultaneously. A set of programs was completed and an example was presented. Conclusion: Longitudinal data often occur in medical research and clinical trials. To handle the correlation between repeated measures, special methods are needed for this kind of data.
In this paper, the relationship between earthquake frequency and magnitude is modeled with generalized linear models for a set of earthquake data from Nepal. Goodness of fit is assessed for the generalized linear models and, based on the Akaike information criterion and the Bayesian information criterion, the generalized Poisson regression model is selected as a suitable model for the study. The objective of this study is to determine the parameters (a and b values) and to estimate the probability of an earthquake occurrence and its return period using a Poisson regression model, and to compare the results with the Gutenberg-Richter model. The study suggests that the probabilities of earthquake occurrence and the return periods estimated by the two models are relatively close to each other. The return periods from the generalized Poisson regression model are comparatively smaller than those from the Gutenberg-Richter model.
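For orientation, the Gutenberg-Richter baseline mentioned in this abstract can be sketched as follows. The code assumes the usual form $\log_{10} N = a - bM$, where $N$ is the annual number of events of magnitude at least $M$, fits a and b by ordinary least squares, and converts the rate to a return period and to a Poisson occurrence probability. The data are synthetic and the function names are hypothetical; this is not the paper's Poisson regression fit.

```python
import math


def fit_gutenberg_richter(magnitudes, annual_counts):
    """Least-squares fit of log10(N) = a - b * M; returns (a, b)."""
    logs = [math.log10(n) for n in annual_counts]
    m_bar = sum(magnitudes) / len(magnitudes)
    l_bar = sum(logs) / len(logs)
    sxy = sum((m - m_bar) * (l - l_bar) for m, l in zip(magnitudes, logs))
    sxx = sum((m - m_bar) ** 2 for m in magnitudes)
    slope = sxy / sxx
    return l_bar - slope * m_bar, -slope


def annual_rate(a, b, magnitude):
    """Expected number of events per year with magnitude >= `magnitude`."""
    return 10 ** (a - b * magnitude)


def return_period(a, b, magnitude):
    """Mean waiting time in years between such events."""
    return 1.0 / annual_rate(a, b, magnitude)


def occurrence_probability(a, b, magnitude, years):
    """P(at least one event within `years`), assuming a Poisson process."""
    return 1.0 - math.exp(-annual_rate(a, b, magnitude) * years)


# Synthetic data that follow the relation exactly (not the Nepal catalogue).
magnitudes = [4.0, 5.0, 6.0]
annual_counts = [10.0, 1.0, 0.1]
a, b = fit_gutenberg_richter(magnitudes, annual_counts)
```

With these synthetic counts the fit recovers a = 5 and b = 1, so a magnitude 6 event has a return period of about 10 years.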
In this article, we propose a generalized empirical likelihood inference for the parametric component in semiparametric generalized partially linear models with longitudinal data. Based on the extended score vector, a generalized empirical likelihood ratio function is defined, which integrates the within-cluster correlation while avoiding direct estimation of the nuisance parameters in the correlation matrix. We show that the proposed statistics are asymptotically chi-squared under some suitable conditions, and hence they can be used to construct confidence regions for the parameters. In addition, the maximum empirical likelihood estimates of the parameters and the corresponding asymptotic normality are obtained. Simulation studies demonstrate the performance of the proposed method.
Generalized linear mixed models (GLMMs) are typically constructed by incorporating random effects into the linear predictor. The random effects are usually assumed to be normally distributed with mean zero and an identity variance-covariance matrix. In this paper, we propose to relax the random effects to non-normal distributions and discuss how to model the mean and covariance structures in GLMMs simultaneously. Parameter estimation is carried out with the quasi-Monte Carlo (QMC) method within an iterative Newton-Raphson (NR) algorithm, which performs well in terms of accuracy and stability, as demonstrated by an analysis of real binary salamander mating data and by simulation studies.
This article is concerned with a semiparametric generalized partial linear model (GPLM) with Type II censored data. A sieve maximum likelihood estimator (MLE) is proposed to estimate the parametric component, allowing exploration of the nonlinear relationship between a certain covariate and the response function. Asymptotic properties of the proposed sieve MLEs are discussed. Under some mild conditions, the estimators are shown to be strongly consistent. Moreover, the estimators of the unknown parameters are asymptotically normal and efficient, and the estimator of the nonparametric function has an optimal convergence rate.
On the basis of the Kirchhoff-Karman hypotheses for the nonlinear bending of thin plates, three kinds of boundary value problems for the nonlinear analysis of perforated thin plates are presented under different in-plane boundary conditions, and the corresponding generalized variational principles are established. One can see that all mathematical models presented in this paper are completely new and differ from the ordinary von Karman theory. These mathematical models can be applied to the nonlinear analysis and the stability analysis of perforated thin plates under arbitrary plane boundary conditions.
Accurate classification and prediction of future traffic conditions are essential for developing effective strategies for congestion mitigation on highway systems. Speed distribution is one of the traffic stream parameters that has been used to quantify traffic conditions. Previous studies have shown that a multi-modal probability distribution of speeds gives excellent results when simultaneously evaluating congested and free-flow traffic conditions. However, most of these previous analytical studies do not incorporate influencing factors in characterizing these conditions. This study evaluates the impact of traffic occupancy on the multi-state speed distribution using Bayesian Dirichlet Process Mixtures of Generalized Linear Models (DPM-GLM). Further, the study estimates the speed cut-point values that separate traffic states into homogeneous groups, using the Bayesian change-point detection (BCD) technique. The study used one year of archived traffic data from 2015, collected on Florida’s Interstate 295 freeway corridor. Information criteria results revealed three traffic states, which were identified as free-flow, transitional flow (congestion onset/offset), and congested conditions. The findings of the DPM-GLM indicated that, in all estimated states, traffic speed decreases as traffic occupancy increases. A comparison across traffic states showed that traffic occupancy has more impact on the free-flow and congested states than on the transitional flow condition. With respect to estimating the threshold speed value, the results of the BCD model revealed promising findings in characterizing levels of traffic congestion.
In this paper, we extend the generalized likelihood ratio test to varying-coefficient models with censored data. We investigate the asymptotic behavior of the proposed test and demonstrate that its limiting null distribution is a $\chi^2$-distribution, with the scale constant and the number of degrees of freedom being independent of nuisance parameters or functions, which is called the Wilks phenomenon. Both simulated and real data examples are given to illustrate the performance of the testing approach.
Count data that exhibit overdispersion (the variance of the counts is larger than their mean) are commonly analyzed using discrete distributions such as the negative binomial, the Poisson inverse Gaussian, and other models. The Poisson distribution is characterized by the equality of mean and variance, whereas the negative binomial and the Poisson inverse Gaussian have variance larger than the mean and are therefore more appropriate for modeling overdispersed count data. As an alternative to these two models, we use the generalized Poisson distribution for group comparisons in the presence of multiple covariates. This problem is known as ANCOVA and has been solved for continuous data. Our objectives were to develop ANCOVA using the generalized Poisson distribution and to compare its goodness of fit to that of the nonparametric generalized additive models. We used real-life data to show that the model performs quite satisfactorily when compared to the nonparametric generalized additive models.
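The overdispersion check described here can be made concrete with a small sketch. It computes the variance-to-mean ratio of a sample and then method-of-moments estimates for the generalized Poisson distribution, assuming the common parameterization with mean $\theta/(1-\lambda)$ and variance $\theta/(1-\lambda)^3$. The parameterization, the toy counts, and the function names are assumptions for illustration; the paper's covariate-adjusted ANCOVA model is not reproduced.

```python
import math
import statistics


def dispersion_index(counts):
    """Sample variance divided by sample mean; values above 1 suggest overdispersion."""
    return statistics.variance(counts) / statistics.mean(counts)


def generalized_poisson_moments(counts):
    """Method-of-moments (theta, lam) for the generalized Poisson distribution.

    Assumes mean = theta / (1 - lam) and variance = theta / (1 - lam) ** 3,
    so variance / mean = 1 / (1 - lam) ** 2. Overdispersion gives lam > 0.
    """
    mean = statistics.mean(counts)
    var = statistics.variance(counts)
    shrink = math.sqrt(mean / var)  # equals 1 - lam
    return mean * shrink, 1.0 - shrink


# Hypothetical counts: sample mean 4, sample variance 10.
counts = [0, 2, 4, 6, 8]
index = dispersion_index(counts)
theta, lam = generalized_poisson_moments(counts)
```

Plugging the estimates back into the two moment formulas reproduces the sample mean and variance, which is a quick sanity check on the parameterization.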
In a linear regression model, testing for uniformity of the variance of the residuals is an integral part of statistical analysis. This is a crucial assumption that requires confirmation by statistical tests, usually before carrying out the Analysis of Variance (ANOVA) technique. Many researchers have published papers on tests for detecting variance heterogeneity in multiple linear regression models, and many comparisons of these tests have been made using criteria such as bias, error rates, and power. Besides comparisons, modifications of some of these tests have been reported in the literature in recent years. In the multiple linear regression setting, little work has been done on comparing selected tests of the homoscedasticity assumption when linear, quadratic, square-root, and exponential forms of heteroscedasticity are injected into the residuals. The present study therefore works on all these areas of interest with a view to filling the gap. The paper aims to provide a comprehensive comparative analysis of the asymptotic behaviour of selected tests of the homoscedasticity assumption, in order to find the best test for detecting heteroscedasticity in a multiple linear regression scenario with varying variances and levels of significance.
Several tests for homoscedasticity are available in the literature, but only nine were considered for this study: the Breusch-Godfrey test, the studentized Breusch-Pagan test, White’s test, the Nonconstant Variance Score test, the Park test, the Spearman Rank test, the Glejser test, the Goldfeld-Quandt test, and the Harrison-McCabe test. Their asymptotic behaviours were examined by Monte Carlo simulation. Four different forms of heteroscedastic structure, namely exponential (EHS) and linear (LHS, a generalization of the square-root and quadratic structures), were injected into the residual part of the multiple linear regression models at sample sizes of 30, 50, 100, 200, 500 and 1000. The performance evaluations were done within the R environment. Among other findings, our investigations revealed that the Glejser and Park tests were the best tests for detecting heteroscedasticity under EHS and LHS, respectively, while the White and Harrison-McCabe tests were the best tests for checking homoscedasticity under EHS and LHS, respectively, for sample sizes less than 50.
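To make one of the listed tests concrete, the sketch below computes the studentized (Koenker) Breusch-Pagan statistic for a single regressor: fit OLS, regress the squared residuals on the regressor, and take $n R^2$ from that auxiliary regression, which is compared with a $\chi^2_1$ critical value (about 3.84 at the 5% level). The study itself used R; this pure-Python version, its deterministic toy errors, and its function names are illustrative assumptions only.

```python
def simple_ols(x, y):
    """Intercept and slope of the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(x, y):
    """Coefficient of determination of the regression of y on x."""
    intercept, slope = simple_ols(x, y)
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def studentized_bp(x, y):
    """n * R^2 from regressing squared OLS residuals on x (one regressor)."""
    intercept, slope = simple_ols(x, y)
    sq_resid = [(yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)]
    return len(x) * r_squared(x, sq_resid)


# Deterministic toy errors: constant spread versus spread growing with x.
x = [float(i) for i in range(1, 41)]
signs = [1.0 if i % 2 == 0 else -1.0 for i in range(1, 41)]
y_hom = [1.0 + 2.0 * xi + s for xi, s in zip(x, signs)]
y_het = [1.0 + 2.0 * xi + 0.5 * xi * s for xi, s in zip(x, signs)]
lm_hom = studentized_bp(x, y_hom)
lm_het = studentized_bp(x, y_het)
```

On these toy data the statistic stays near zero for the constant-spread errors and far exceeds the critical value for the growing-spread errors. A Monte Carlo comparison like the one in the abstract would repeat this over many random samples and record rejection rates.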
Changes in climate factors such as temperature, rainfall, humidity, and wind speed are natural processes that could significantly affect the incidence of infectious diseases. Dengue is a widespread disease that has often been documented in connection with the impact of climate change. It has become a significant concern, especially for the Malaysian health authorities, because of its rapid spread and serious effects, which can lead to loss of life. Several statistical models have been used to identify climatic factors associated with infectious diseases. However, because of the complex and nonlinear interactions between climate variables and disease components, modelling their relationships has become the main challenge in climate-health studies. Hence, this study proposed a Generalized Linear Model (GLM) with Poisson and Negative Binomial distributions to examine the effects of climate factors on dengue incidence while accounting for collinearity between variables. The study focuses on the dengue hot spots in Malaysia for the year 2014. Because the climate factors are collinear, the analysis was done separately using three different models. The study revealed that rainfall, temperature, humidity, and wind speed were significantly associated with dengue incidence, and most of them showed a negative effect. Of all the variables, wind speed has the most significant impact on dengue incidence. Given these relationships, policymakers should formulate better plans so that precautionary steps can be taken to reduce the spread of dengue.
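Collinearity of the kind mentioned above is often screened with the variance inflation factor (VIF). The sketch below is a minimal version for the two-predictor case, where the VIF reduces to $1/(1 - r^2)$ with $r$ the Pearson correlation between the predictors. The abstract does not say which diagnostic the authors used, so the choice of VIF, the toy values, and the function names are assumptions.

```python
import math


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def vif_two_predictors(u, v):
    """Variance inflation factor when the model has exactly two predictors."""
    r = pearson_r(u, v)
    return 1.0 / (1.0 - r * r)


# Hypothetical standardized readings of two climate variables.
predictor_a = [1.0, 2.0, 3.0, 4.0, 5.0]
predictor_b = [2.0, 4.0, 5.0, 4.0, 5.0]
vif = vif_two_predictors(predictor_a, predictor_b)
```

A large VIF for a pair of climate variables is one reason to fit them in separate models, as the study does.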
In this paper, we define a new class of biased linear estimators of the vector of unknown parameters in the deficient-rank linear model, based on the spectral decomposition expression of the best linear minimum bias estimator. Some important properties are discussed. By appropriate choices of the bias parameters, we construct many interesting and useful biased linear estimators, which extend the ordinary biased linear estimators in the full-rank linear model to the deficient-rank linear model. Finally, we give a numerical example in geodetic adjustment.
In the assessment of car insurance claims, the claim rate presents a highly skewed probability distribution, which is typically modeled using the Tweedie distribution. The traditional approach to obtaining a Tweedie regression model involves training on a centralized dataset; when the data is provided by multiple parties, training a privacy-preserving Tweedie regression model without exchanging raw data becomes a challenge. To address this issue, this study introduces a novel vertical federated learning-based Tweedie regression algorithm for multi-party auto insurance rate setting in data silos. The algorithm keeps sensitive data local and uses privacy-preserving techniques to compute the intersection between the two parties holding the data. After determining which entities are shared, the participants train the model locally on the shared entity data to obtain the intermediate parameters of the local generalized linear model. Homomorphic encryption algorithms are introduced to exchange and update the intermediate model parameters, so that the parties collaboratively complete the joint training of the car insurance rate-setting model. Performance tests on two publicly available datasets show that the proposed federated Tweedie regression algorithm can effectively generate Tweedie regression models that leverage the value of data from both parties without exchanging data. The assessment results of the scheme approach those of the Tweedie regression model learned from centralized data and outperform those of the Tweedie regression model learned independently by a single party.
The nonlinear stability of the three-layer generalized Phillips model, for which the velocity in each layer is constant and the top and bottom surfaces are either rigid or free, is studied by employing Arnol'd's variational principle and an a priori estimate method. The nonlinear stability criteria are established. For comparison, the linear instability criteria are also obtained by using the normal mode method, and the influences of the free parameter, the β parameter, and the curvature in the vertical profile of the horizontal velocity on the linear instability are discussed by use of growth rate curves. The nonlinear stability criterion is compared with the linear one. It is shown that in some cases the two criteria are exactly the same in form, but in other cases they are different. This phenomenon, which reveals the nonlinear property of the linear instability features, is explained by the explosive resonant interaction (ERI). When the ERI exists, i.e., when nonlinear mechanisms play a leading role in the dynamical system, the nonlinear stability criterion differs from the linear one; when the ERI does not exist, the nonlinear stability criterion is the same as the linear one in form.
In regression, although both aim at estimating the Mean Squared Prediction Error (MSPE), Akaike’s Final Prediction Error (FPE) and the Generalized Cross Validation (GCV) selection criteria are usually derived from two quite different perspectives. Here, settling on the most commonly accepted definition of the MSPE as the expectation of the squared prediction error loss, we provide theoretical expressions for it that are valid for any linear model (LM) fitter, under either random or non-random designs. Specializing these MSPE expressions, we derive closed formulas of the MSPE for some of the most popular LM fitters: Ordinary Least Squares (OLS), with or without a full column rank design matrix; and Ordinary and Generalized Ridge regression, the latter embedding smoothing spline fitting. For each of these LM fitters, we then deduce a computable estimate of the MSPE, which turns out to coincide with Akaike’s FPE. Using a slight variation, we similarly obtain a class of MSPE estimates coinciding with the classical GCV formula for those same LM fitters.
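For reference, the classical textbook forms of the two criteria can be written as follows. Here $\hat{y} = Hy$ denotes the fit of a linear fitter with hat matrix $H$, $\mathrm{RSS} = \lVert y - Hy \rVert^2$, $n$ is the sample size, and $p$ is the number of OLS parameters; this notation is an assumption introduced for illustration, and the paper's own MSPE expressions are not reproduced.

```latex
\mathrm{FPE} = \frac{\mathrm{RSS}}{n}\cdot\frac{n+p}{n-p},
\qquad
\mathrm{GCV} = \frac{\mathrm{RSS}/n}{\bigl(1 - \operatorname{tr}(H)/n\bigr)^{2}} .
```

For OLS with a full column rank design matrix, $\operatorname{tr}(H) = p$, so $\mathrm{GCV} = n\,\mathrm{RSS}/(n-p)^2$, and the ratio $\mathrm{GCV}/\mathrm{FPE} = n^2/(n^2 - p^2)$ tends to 1 when $p$ is small relative to $n$.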
Penalized variable selection methods are often used to select the relevant covariates and estimate the unknown regression coefficients simultaneously, but the existing methods may fail to be consistent when the covariates are highly correlated. In this paper, the semi-standard partial covariance (SPAC) method with a Lasso penalty is proposed for the generalized linear model with highly correlated covariates, and the consistency of the estimation and of the variable selection is shown in high-dimensional settings under some regularity conditions. Simulation studies and an analysis of a colon tumor dataset show that the proposed method handles highly correlated covariates better than the traditional penalized variable selection methods.
This paper provides further contributions to the theory of linear sufficiency in the general Gauss-Markov model E(y) = Xβ, Var(y) = V. The notion of linear sufficiency introduced by Baksalary and Kala (1981) and Drygas (1983) is extended to any specific estimable function c′β. Some general results with respect to the extended concept are obtained. An essential result concerning the former notion is a direct consequence of this paper.
The general linear model (GLM) is the most popular method for functional magnetic resonance imaging (fMRI) data analysis. However, its theory is imperfect. The key to this model is how to construct the design matrix so as to better model the effects of interest and better separate out noise. For the purpose of detecting brain function activation, and following the principle of the GLM, a new convolution model is presented in which a new dynamic function is convolved with the design matrix; combined with a t-test, it can be used to detect brain activation signals. The fMRI results of a visual stimulus experiment indicate that brain activity is mainly concentrated in the V1 and V2 areas of the visual cortex, which also verifies the validity of this technique.
In this paper, we propose a test statistic for checking whether the nonparametric functions in two partially linear models are equal. We estimate the nonparametric function under both the null hypothesis and the alternative by the local linear method, ignoring the parametric components, and then estimate the parameters by a two-stage method. The test statistic is derived, and it is shown to be asymptotically normal under the null hypothesis.
Funding: The talent research fund (3004-893325) of Dalian University of Technology; the NNSF (10271049) of China.
文摘This article concerded with a semiparametric generalized partial linear model (GPLM) with the type Ⅱ censored data. A sieve maximum likelihood estimator (MLE) is proposed to estimate the parameter component, allowing exploration of the nonlinear relationship between a certain covariate and the response function. Asymptotic properties of the proposed sieve MLEs are discussed. Under some mild conditions, the estimators are shown to be strongly consistent. Moreover, the estimators of the unknown parameters are asymptotically normal and efficient, and the estimator of the nonparametric function has an optimal convergence rate.
文摘On foe basis of the Kirchoff-Karman hypothses for the nonlinear bending of thin plates, the three kinds of boundary value problems of nonlinear analysis for perforated fhin plates are presented under the differenr in-plane boundary conditions and the corresponding generalized varialional principles are established. One can see that all mathematical models presented in this paper are completely new ones and differ from the ordinary von Karman theory. These mathematical models can be applied to the nonlinear analysis and the Stability analysis of perforaled thin plates in arbitraryplane boundary conditions.
文摘Accurate classification and prediction of future traffic conditions are essential for developing effective strategies for congestion mitigation on the highway systems. Speed distribution is one of the traffic stream parameters, which has been used to quantify the traffic conditions. Previous studies have shown that multi-modal probability distribution of speeds gives excellent results when simultaneously evaluating congested and free-flow traffic conditions. However, most of these previous analytical studies do not incorporate the influencing factors in characterizing these conditions. This study evaluates the impact of traffic occupancy on the multi-state speed distribution using the Bayesian Dirichlet Process Mixtures of Generalized Linear Models (DPM-GLM). Further, the study estimates the speed cut-point values of traffic states, which separate them into homogeneous groups using Bayesian change-point detection (BCD) technique. The study used 2015 archived one-year traffic data collected on Florida’s Interstate 295 freeway corridor. Information criteria results revealed three traffic states, which were identified as free-flow, transitional flow condition (congestion onset/offset), and the congested condition. The findings of the DPM-GLM indicated that in all estimated states, the traffic speed decreases when traffic occupancy increases. Comparison of the influence of traffic occupancy between traffic states showed that traffic occupancy has more impact on the free-flow and the congested state than on the transitional flow condition. With respect to estimating the threshold speed value, the results of the BCD model revealed promising findings in characterizing levels of traffic congestion.
文摘In this paper, we extend the generalized likelihood ratio test to the varying-coefficient models with censored data. We investigate the asymptotic behavior of the proposed test and demonstrate that its limiting null distribution follows a distribution, with the scale constant and the number of degree of freedom being independent of nuisance parameters or functions, which is called the wilks phenomenon. Both simulated and real data examples are given to illustrate the performance of the testing approach.
文摘Count data that exhibit over dispersion (variance of counts is larger than its mean) are commonly analyzed using discrete distributions such as negative binomial, Poisson inverse Gaussian and other models. The Poisson is characterized by the equality of mean and variance whereas the Negative Binomial and the Poisson inverse Gaussian have variance larger than the mean and therefore are more appropriate to model over-dispersed count data. As an alternative to these two models, we shall use the generalized Poisson distribution for group comparisons in the presence of multiple covariates. This problem is known as the ANCOVA and is solved for continuous data. Our objectives were to develop ANCOVA using the generalized Poisson distribution, and compare its goodness of fit to that of the nonparametric Generalized Additive Models. We used real life data to show that the model performs quite satisfactorily when compared to the nonparametric Generalized Additive Models.
Abstract: In a linear regression model, testing for uniformity of the variance of the residuals is an integral part of statistical analysis. This crucial assumption requires confirmation through statistical tests, mostly before carrying out the Analysis of Variance (ANOVA) technique. Many researchers have published papers on tests for detecting variance heterogeneity in multiple linear regression models, and many comparisons of these tests have been made using criteria such as bias, error rate, and power. Aside from comparisons, modifications of some of these tests have been reported in the literature in recent years. In the multiple linear regression setting, little work has been done on comparing selected tests for the homoscedasticity assumption when linear, quadratic, square-root, and exponential forms of heteroscedasticity are injected into the residuals. The present study aims to fill this gap. The paper provides a comprehensive comparative analysis of the asymptotic behaviour of selected statistical tests for the homoscedasticity assumption, in order to find the best test for detecting heteroscedasticity in a multiple linear regression scenario with varying variances and levels of significance. Several tests for homoscedasticity are available in the literature, but only nine were considered for this study: the Breusch-Godfrey test, studentized Breusch-Pagan test, White’s test, Nonconstant Variance Score test, Park test, Spearman Rank test, Glejser test, Goldfeld-Quandt test, and Harrison-McCabe test. Their asymptotic behaviours were examined by Monte Carlo simulation. Four forms of heteroscedastic structure, exponential and linear (including its square-root and quadratic generalizations), were injected into the residual part of the multiple linear regression models at sample sizes of 30, 50, 100, 200, 500 and 1000. Performance was evaluated within the R environment. Among other findings, our investigations revealed that the Glejser and Park tests were the best tests for detecting heteroscedasticity under the exponential (EHS) and linear (LHS) heteroscedastic structures, respectively, while the White and Harrison-McCabe tests were the best tests for checking homoscedasticity under EHS and LHS, respectively, for sample sizes less than 50.
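As a rough illustration of how one of the listed tests works, the sketch below implements a studentized Breusch-Pagan statistic (the Koenker form, LM = n·R² from regressing squared residuals on the regressor) for a single-regressor model. This is not the authors' R code: the simulated data, the seed, the error scale `0.5 * x`, and all function names are assumptions made for the example.

```python
import math
import random

def ols_simple(x, y):
    """Closed-form OLS fit of y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def studentized_bp(x, y):
    """Studentized Breusch-Pagan test with one regressor.

    Returns (LM statistic, p-value). Under homoscedasticity LM is
    asymptotically chi-square with 1 degree of freedom.
    """
    b0, b1 = ols_simple(x, y)
    sq_resid = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]
    # Auxiliary regression of squared residuals on x.
    a0, a1 = ols_simple(x, sq_resid)
    mean_sq = sum(sq_resid) / len(sq_resid)
    ss_tot = sum((e - mean_sq) ** 2 for e in sq_resid)
    ss_res = sum((e - a0 - a1 * xi) ** 2 for xi, e in zip(x, sq_resid))
    r2 = max(0.0, 1.0 - ss_res / ss_tot)
    lm = len(x) * r2
    # Chi-square(1) survival function: P(Z**2 > lm) = erfc(sqrt(lm / 2)).
    return lm, math.erfc(math.sqrt(lm / 2.0))

# Hypothetical simulation: error standard deviation grows linearly with x.
rng = random.Random(0)
n = 500
x = [rng.uniform(1, 10) for _ in range(n)]
y_het = [1 + 2 * xi + rng.gauss(0, 0.5 * xi) for xi in x]
lm_het, p_het = studentized_bp(x, y_het)
print(lm_het, p_het)
```

A Monte Carlo comparison of the kind the abstract describes would repeat such a simulation many times per sample size and record how often each test rejects.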
Abstract: Changes in climate factors such as temperature, rainfall, humidity, and wind speed are natural processes that could significantly impact the incidence of infectious diseases. Dengue is a widespread disease that has often been documented in connection with the impact of climate change. It has become a significant concern, especially for the Malaysian health authorities, due to its rapid spread and serious effects, which lead to loss of life. Several statistical models have been used to identify climatic factors associated with infectious diseases. However, because of the complex and nonlinear interactions between climate variables and disease components, modelling their relationships has become the main challenge in climate-health studies. Hence, this study proposes a Generalized Linear Model (GLM) with Poisson and Negative Binomial distributions to examine the effects of climate factors on dengue incidence while accounting for collinearity between variables. The study focuses on the dengue hot spots in Malaysia for the year 2014. Since collinearity exists between the climate factors, the analysis was done separately using three different models. The study revealed that rainfall, temperature, humidity, and wind speed were significantly associated with dengue incidence, and most of them showed a negative effect. Of all the variables, wind speed has the most significant impact on dengue incidence. Given these relationships, policymakers should formulate better plans so that precautionary steps can be taken to reduce the spread of dengue.
Abstract: In this paper, we define a new class of biased linear estimators of the vector of unknown parameters in the deficient-rank linear model, based on the spectral decomposition expression of the best linear minimum bias estimator. Some important properties are discussed. By appropriate choices of the bias parameters, we construct many interesting and useful biased linear estimators, which extend the ordinary biased linear estimators in the full-rank linear model to the deficient-rank linear model. Finally, we give a numerical example in geodetic adjustment.
Funding: This research was funded by the National Natural Science Foundation of China (No. 62272124), the National Key Research and Development Program of China (No. 2022YFB2701401), the Guizhou Province Science and Technology Plan Project (Grant No. Qiankehe Platform Talent [2020]5017), the Research Project of Guizhou University for Talent Introduction (No. [2020]61), the Cultivation Project of Guizhou University (No. [2019]56), and the Open Fund of Key Laboratory of Advanced Manufacturing Technology, Ministry of Education (GZUAMT2021KF[01]).
Abstract: In the assessment of car insurance claims, the claim rate follows a highly skewed probability distribution, which is typically modeled using the Tweedie distribution. The traditional approach to obtaining a Tweedie regression model involves training on a centralized dataset. When the data are provided by multiple parties, training a privacy-preserving Tweedie regression model without exchanging raw data becomes a challenge. To address this issue, this study introduces a novel vertical federated learning-based Tweedie regression algorithm for multi-party auto insurance rate setting in data silos. The algorithm keeps sensitive data local and uses privacy-preserving techniques to perform intersection operations between the two parties holding the data. After determining which entities are shared, the participants train the model locally on the shared entity data to obtain the intermediate parameters of the local generalized linear model. Homomorphic encryption algorithms are then used to exchange and update the intermediate model parameters, so that the parties collaboratively complete the joint training of the car insurance rate-setting model. Performance tests on two publicly available datasets show that the proposed federated Tweedie regression algorithm can effectively generate Tweedie regression models that leverage the value of data from both parties without exchanging data. The assessment results of the scheme approach those of the Tweedie regression model learned from centralized data, and outperform the Tweedie regression model learned independently by a single party.
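For readers unfamiliar with Tweedie regression, the loss such a model typically minimizes can be sketched as follows. This is the standard unit deviance for a compound Poisson-gamma Tweedie distribution with power parameter between 1 and 2; it is not taken from the paper, and the function name, the default `power=1.5`, and the example values are assumptions. The federated and homomorphic-encryption parts of the algorithm are not shown.

```python
def tweedie_unit_deviance(y, mu, power=1.5):
    """Unit deviance of a Tweedie distribution with 1 < power < 2.

    y  : observed value (y >= 0; exact zeros are allowed, e.g. no claim)
    mu : predicted mean (mu > 0)
    The variance function assumed here is V(mu) = mu**power.
    """
    if not 1.0 < power < 2.0:
        raise ValueError("this sketch only covers 1 < power < 2")
    if y < 0 or mu <= 0:
        raise ValueError("requires y >= 0 and mu > 0")
    p = power
    return 2.0 * (y ** (2 - p) / ((1 - p) * (2 - p))
                  - y * mu ** (1 - p) / (1 - p)
                  + mu ** (2 - p) / (2 - p))

# Hypothetical values: a perfect prediction, a policy with no claim,
# and a claim larger than predicted.
d_equal = tweedie_unit_deviance(3.0, 3.0)
d_zero_claim = tweedie_unit_deviance(0.0, 2.0)
d_large_claim = tweedie_unit_deviance(5.0, 2.0)
print(d_equal, d_zero_claim, d_large_claim)
```

The deviance is zero when the prediction matches the observation and positive otherwise, and it stays finite for exact zeros, which is why this family suits claim data with many zero-claim policies.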
Abstract: The nonlinear stability of the three-layer generalized Phillips model, for which the velocity in each layer is constant and the top and bottom surfaces are either rigid or free, is studied by employing Arnol'd's variational principle and an a priori estimate method. The nonlinear stability criteria are established. For comparison, the linear instability criteria are also obtained by using the normal mode method, and the influences of the free parameter, the β parameter, and the curvature in the vertical profile of the horizontal velocity on the linear instability are discussed by use of the growth rate curves. The nonlinear stability criterion is then compared with the linear one. It is shown that in some cases the two criteria are exactly the same in form, but in other cases they are different. This phenomenon, which reveals the nonlinear property of the linear instability features, is explained by the explosive resonant interaction (ERI). When the ERI exists, i.e., the nonlinear mechanisms play a leading role in the dynamical system, the nonlinear stability criterion is different from the linear one; on the other hand, when the ERI does not exist, the nonlinear stability criterion is the same as the linear one in form.
Abstract: In regression, despite both being aimed at estimating the Mean Squared Prediction Error (MSPE), Akaike’s Final Prediction Error (FPE) and the Generalized Cross Validation (GCV) selection criteria are usually derived from two quite different perspectives. Here, settling on the most commonly accepted definition of the MSPE as the expectation of the squared prediction error loss, we provide theoretical expressions for it, valid for any linear model (LM) fitter, under either random or nonrandom designs. Specializing these MSPE expressions for each of them, we are able to derive closed formulas of the MSPE for some of the most popular LM fitters: Ordinary Least Squares (OLS), with or without a full column rank design matrix; and Ordinary and Generalized Ridge regression, the latter embedding smoothing spline fitting. For each of these LM fitters, we then deduce a computable estimate of the MSPE which turns out to coincide with Akaike’s FPE. Using a slight variation, we similarly get a class of MSPE estimates coinciding with the classical GCV formula for those same LM fitters.
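The two criteria compared in this abstract can be made concrete for the simplest case. The sketch below uses the classical textbook forms of FPE and GCV for an OLS fit with `p` estimated coefficients and `n` observations; the paper's more general expressions for ridge-type fitters are not reproduced, and the numbers `rss=50.0`, `n=100`, `p=5` are hypothetical.

```python
def fpe_ols(rss, n, p):
    """Akaike's Final Prediction Error for OLS: (RSS/n) * (n + p) / (n - p)."""
    if not 0 < p < n:
        raise ValueError("requires 0 < p < n")
    return (rss / n) * (n + p) / (n - p)

def gcv_ols(rss, n, p):
    """Generalized Cross Validation for OLS: (RSS/n) / (1 - p/n)**2.

    For OLS the trace of the hat matrix equals p, which gives this form.
    """
    if not 0 < p < n:
        raise ValueError("requires 0 < p < n")
    return (rss / n) / (1.0 - p / n) ** 2

# Hypothetical fit: residual sum of squares 50 from 100 observations
# and 5 estimated coefficients.
rss, n, p = 50.0, 100, 5
fpe = fpe_ols(rss, n, p)
gcv = gcv_ols(rss, n, p)
print(fpe, gcv)
```

Both quantities inflate the in-sample error `RSS/n` by a factor depending on `p/n`; GCV is never smaller than FPE, and the two are nearly equal when `p` is small relative to `n`.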
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 12001277, 12271046 and 12131006).
Abstract: Penalized variable selection methods are often used to select the relevant covariates and estimate the unknown regression coefficients simultaneously, but the existing methods may fail to be consistent in settings with highly correlated covariates. In this paper, the semi-standard partial covariance (SPAC) method with the Lasso penalty is proposed to study the generalized linear model with highly correlated covariates, and the consistency of the estimation and of the variable selection is shown in high-dimensional settings under some regularity conditions. Simulation studies and an analysis of a colon tumor dataset are carried out to show that the proposed method handles highly correlated covariates better than the traditional penalized variable selection methods.
Funding: The Natural Science Foundation of Guangdong Province (010486).
Abstract: This paper provides further contributions to the theory of linear sufficiency in the general Gauss-Markov model E(y) = Xβ, Var(y) = V. The notion of linear sufficiency introduced by Baksalary and Kala (1981) and Drygas (1983) is extended to any specific estimable function c′β. Some general results with respect to the extended concept are obtained. An essential result concerning the former notion is a direct consequence of this paper.
Funding: Supported by the National Natural Science Foundation of China (Nos. 90208003, 30200059), the 973 Project (No. 2003CB716106), the Doctor Training Fund of MOE, P.R.C., and the Fok Ying Tong Education Foundation (No. 91041).
Abstract: The general linear model (GLM) is the most popular method for functional magnetic resonance imaging (fMRI) data analysis. However, its theory is imperfect. The key to this model is how to construct the design matrix so as to better model the effects of interest and better separate noise. For the purpose of detecting brain function activation, and following the principle of the GLM, a new convolution model is presented in which a new dynamic function is convolved with the design matrix; combined with a t-test, it can be used to detect brain activation signals. The fMRI result of a visual stimulus experiment indicates that brain activity is mainly concentrated in the V1 and V2 areas of the visual cortex, which also verifies the validity of this technique.
Abstract: In this paper, we propose a test statistic to check whether the nonparametric functions in two partially linear models are equal. We estimate the nonparametric function under both the null hypothesis and the alternative by the local linear method, ignoring the parametric components, and then estimate the parameters by a two-stage method. The test statistic is derived and shown to be asymptotically normal under the null hypothesis.