In regression, despite both being aimed at estimating the Mean Squared Prediction Error (MSPE), Akaike's Final Prediction Error (FPE) and the Generalized Cross Validation (GCV) selection criteria are usually derived from two quite different perspectives. Here, settling on the most commonly accepted definition of the MSPE as the expectation of the squared prediction error loss, we provide theoretical expressions for it, valid for any linear model (LM) fitter, be it under random or non-random designs. Specializing these MSPE expressions to each case, we derive closed formulas of the MSPE for some of the most popular LM fitters: Ordinary Least Squares (OLS), with or without a full column rank design matrix; Ordinary and Generalized Ridge regression, the latter embedding smoothing-spline fitting. For each of these LM fitters, we then deduce a computable estimate of the MSPE which turns out to coincide with Akaike's FPE. Using a slight variation, we similarly get a class of MSPE estimates coinciding with the classical GCV formula for those same LM fitters.
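For readers who want the two criteria side by side, here is a minimal numpy sketch of FPE and GCV for an OLS fit, using the standard textbook formulas FPE = (RSS/n)(n + df)/(n - df) and GCV = (RSS/n)/(1 - df/n)^2 with df = tr(H); the simulated data and variable names are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p)
y = X @ beta + rng.standard_normal(n)

# OLS hat matrix; for a full column rank design, tr(H) equals p.
H = X @ np.linalg.solve(X.T @ X, X.T)
resid = y - H @ y
rss = resid @ resid
df = np.trace(H)                        # effective degrees of freedom

fpe = (rss / n) * (n + df) / (n - df)   # Akaike's FPE
gcv = (rss / n) / (1.0 - df / n) ** 2   # classical GCV

print(f"FPE = {fpe:.4f}, GCV = {gcv:.4f}")
```

For OLS the two numbers are close, since (n + p)/(n - p) and (1 - p/n)^{-2} agree to first order in p/n; that proximity is part of what the paper formalizes.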
In classical regression analysis, the error in the independent variable is usually not taken into account. This paper presents two solution methods for the case in which both the independent and the dependent variables have errors. These methods are derived from the condition-adjustment and indirect-adjustment models based on the Total Least Squares principle. The equivalence of these two methods is also proven in theory.
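The paper's adjustment models are not reproduced here, but the core errors-in-both-variables idea has a classic SVD solution in the Total Least Squares literature. A hedged numpy sketch of that textbook construction on simulated data (all names illustrative):

```python
import numpy as np

def tls(X, y):
    """Total least squares: errors in both X and y (classic SVD solution)."""
    C = np.column_stack([X, y])
    _, _, Vt = np.linalg.svd(C)
    v = Vt[-1]                 # right singular vector of the smallest singular value
    return -v[:-1] / v[-1]     # requires the last component to be nonzero

rng = np.random.default_rng(1)
n = 200
x_true = rng.uniform(0, 10, n)
X = (x_true + 0.3 * rng.standard_normal(n)).reshape(-1, 1)  # noisy regressor
y = 2.0 * x_true + 0.3 * rng.standard_normal(n)             # noisy response

print("OLS slope:", np.linalg.lstsq(X, y, rcond=None)[0][0])  # attenuated toward 0
print("TLS slope:", tls(X, y)[0])                             # closer to the true 2.0
```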
In this paper, the authors consider a sparse parameter estimation problem in continuous-time linear stochastic regression models using sampling data. Based on the compressed sensing (CS) method, the authors propose a compressed least squares (LS) algorithm to deal with the challenges of parameter sparsity. At each sampling time instant, the proposed compressed LS algorithm first compresses the original high-dimensional regressor using a sensing matrix and obtains a low-dimensional LS estimate for the compressed unknown parameter. Then, the original high-dimensional sparse unknown parameter is recovered by a reconstruction method. By introducing a compressed excitation assumption and employing stochastic Lyapunov function and martingale estimate methods, the authors establish the performance analysis of the compressed LS algorithm under a condition on the sampling time interval, without using independence or stationarity conditions on the system signals. Finally, a simulation example is provided to verify the theoretical results by comparing the standard and the compressed LS algorithms for estimating a high-dimensional sparse unknown parameter.
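The compress-estimate-reconstruct pattern can be illustrated in a static toy setting; this is a sketch under strong simplifications (isotropic regressors, a sensing matrix with orthonormal rows), not the authors' continuous-time algorithm or their assumptions. With these simplifications the LS estimate on the compressed regressors targets Psi @ theta, from which the sparse theta is recovered by orthogonal matching pursuit:

```python
import numpy as np

rng = np.random.default_rng(2)
p, d, n, k = 200, 40, 2000, 5          # ambient dim, compressed dim, samples, sparsity
theta = np.zeros(p)
support = rng.choice(p, size=k, replace=False)
theta[support] = rng.standard_normal(k)

Psi = np.linalg.qr(rng.standard_normal((p, d)))[0].T   # d x p, orthonormal rows
Phi = rng.standard_normal((n, p))                      # isotropic regressors
y = Phi @ theta + 0.05 * rng.standard_normal(n)

# Step 1: least squares on the compressed regressors Psi @ phi_t; under the
# simplifications above, the estimate targets zeta = Psi @ theta.
Z = Phi @ Psi.T
zeta_hat = np.linalg.lstsq(Z, y, rcond=None)[0]

# Step 2: recover the sparse theta from zeta_hat ~ Psi @ theta with a tiny OMP.
def omp(A, b, k):
    r, idx = b.copy(), []
    for _ in range(k):
        idx.append(int(np.argmax(np.abs(A.T @ r))))    # most correlated column
        coef, *_ = np.linalg.lstsq(A[:, idx], b, rcond=None)
        r = b - A[:, idx] @ coef                       # refit and update residual
    x = np.zeros(A.shape[1])
    x[idx] = coef
    return x

theta_hat = omp(Psi, zeta_hat, k)
print("support recovered:", set(np.flatnonzero(theta_hat)) == set(support))
```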
In this paper, based on the theory of parameter estimation, we give a selection method which, in the sense of a desirable property of the parameter estimation, we consider very reasonable. Moreover, we offer a calculation method for the selection statistic and an applied example.
In this paper, the performance of existing biased estimators (Ridge Estimator (RE), Almost Unbiased Ridge Estimator (AURE), Liu Estimator (LE), Almost Unbiased Liu Estimator (AULE), Principal Component Regression Estimator (PCRE), r-k class estimator and r-d class estimator) and the respective predictors were considered in a misspecified linear regression model when there exists multicollinearity among explanatory variables. A generalized form was used to compare these estimators and predictors in the mean square error sense. Further, theoretical findings were established using the mean square error matrix and the scalar mean square error. Finally, a numerical example and a Monte Carlo simulation study were done to illustrate the theoretical findings. The simulation study revealed that LE and RE outperform the other estimators when weak multicollinearity exists, and that RE, r-k class and r-d class estimators outperform the other estimators when moderate and high multicollinearity exist, for certain values of the shrinkage parameters. The predictors based on LE and RE are always superior to the other predictors for certain values of the shrinkage parameters.
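Two of the estimators compared above have simple closed forms: the ridge estimator (X'X + kI)^{-1}X'y and the Liu estimator (X'X + I)^{-1}(X'y + d b_OLS). A short numpy sketch on a deliberately collinear design; the data and shrinkage values are illustrative, not the paper's simulation settings:

```python
import numpy as np

def ridge(X, y, k):
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def liu(X, y, d):
    p = X.shape[1]
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    return np.linalg.solve(X.T @ X + np.eye(p), X.T @ y + d * b_ols)

# Collinear design: the second column is nearly a copy of the first.
rng = np.random.default_rng(3)
n = 50
x1 = rng.standard_normal(n)
X = np.column_stack([x1, x1 + 0.01 * rng.standard_normal(n), rng.standard_normal(n)])
y = X @ np.array([1.0, 1.0, 0.5]) + rng.standard_normal(n)

print("OLS:", np.linalg.lstsq(X, y, rcond=None)[0])   # unstable under collinearity
print("RE :", ridge(X, y, k=1.0))
print("LE :", liu(X, y, d=0.5))
```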
The objective of modelling from data is not that the model simply fits the training data well. Rather, the goodness of a model is characterized by its generalization capability, interpretability and ease of knowledge extraction. All these desired properties depend crucially on the ability of the modelling process to construct appropriate parsimonious models, and a basic principle in practical nonlinear data modelling is the parsimonious principle of ensuring the smallest possible model that explains the training data. There exists a vast amount of work in the area of sparse modelling, and a widely adopted approach is based on linear-in-the-parameters data modelling, which includes the radial basis function network, the neurofuzzy network and all the sparse kernel modelling techniques. A well-tested strategy for parsimonious modelling from data is the orthogonal least squares (OLS) algorithm for forward selection modelling, which is capable of constructing sparse models that generalise well. This contribution continues this theme and provides a unified framework for sparse modelling from data that includes regression and classification, which belong to supervised learning, and probability density function estimation, which is an unsupervised learning problem. The OLS forward selection method based on the leave-one-out test criteria is presented within this unified data-modelling framework. Examples from regression, classification and density estimation applications are used to illustrate the effectiveness of this generic parsimonious modelling approach from data.
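The flavour of forward selection driven by a leave-one-out criterion can be conveyed with a plain ordinary-least-squares sketch. Note this refits from scratch at every step rather than using the orthogonal decomposition that makes the original OLS algorithm efficient; it relies only on the exact LOO identity e_i/(1 - h_ii). Names and data are illustrative:

```python
import numpy as np

def loo_mse(X, y):
    """Leave-one-out MSE of an OLS fit via the hat-matrix shortcut e_i/(1 - h_ii)."""
    H = X @ np.linalg.solve(X.T @ X, X.T)
    e = y - H @ y
    return np.mean((e / (1.0 - np.diag(H))) ** 2)

def forward_select(X, y):
    remaining, chosen, best = list(range(X.shape[1])), [], np.inf
    while remaining:
        score, j = min((loo_mse(X[:, chosen + [j]], y), j) for j in remaining)
        if score >= best:          # stop when LOO error no longer improves
            break
        best, chosen = score, chosen + [j]
        remaining.remove(j)
    return chosen, best

rng = np.random.default_rng(4)
n = 120
X = rng.standard_normal((n, 10))
y = 2 * X[:, 0] - 1.5 * X[:, 3] + 0.5 * rng.standard_normal(n)
print(forward_select(X, y))        # typically selects columns 0 and 3, then stops
```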
Recursive algorithms are very useful for computing M-estimators of regression coefficients and scatter parameters. In this article, it is shown that for a nondecreasing function ul(t), under some mild conditions the recursive M-estimators of regression coefficients and scatter parameters are strongly consistent, and the recursive M-estimator of the regression coefficients is also asymptotically normally distributed. Furthermore, optimal recursive M-estimators, asymptotic efficiencies of recursive M-estimators and asymptotic relative efficiencies between recursive M-estimators of regression coefficients are studied.
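A stochastic-approximation sketch of the idea, with a Huber influence function standing in for the paper's nondecreasing ul(t): each new observation nudges the coefficient estimate along x_t psi(y_t - x_t'b_t) with a decaying gain. Simulated data; this is an illustrative recursion, not the paper's exact algorithm:

```python
import numpy as np

def huber_psi(r, c=1.345):
    return np.clip(r, -c, c)           # nondecreasing, bounded influence function

rng = np.random.default_rng(5)
beta_true = np.array([2.0, -1.0])
b = np.zeros(2)
for t in range(1, 20001):
    x = rng.standard_normal(2)
    y = x @ beta_true + rng.standard_t(df=2)       # heavy-tailed errors
    b = b + (1.0 / t) * x * huber_psi(y - x @ b)   # recursive M-estimation update
print(b)   # approaches beta_true despite the heavy tails
```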
Multinomial logistic regression (MNL) is an attractive statistical approach to modeling vehicle crash severity, as it does not require the assumptions of normality, linearity, or homoscedasticity, unlike other approaches such as discriminant analysis, which requires these assumptions to be met. Moreover, it produces sound estimates by changing the probability range between 0.0 and 1.0 to log odds ranging from negative infinity to positive infinity, as it applies a transformation of the dependent variable. The estimates are asymptotically consistent with the requirements of the nonlinear regression process. The results of MNL can be interpreted through the regression coefficient estimates and/or the odds ratios (the exponentiated coefficients). In addition, the MNL can be used to improve the fitted model by comparing the full model that includes all predictors to a chosen restricted model that excludes the non-significant predictors. As such, this paper presents a detailed step-by-step overview of incorporating the MNL in crash severity modeling, using vehicle crash data from Interstate I70 in the State of Missouri, USA, for the years 2013-2015.
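A minimal sketch of that workflow on simulated data: fit a multinomial logit, then exponentiate the coefficients to read odds ratios relative to the baseline class. It assumes the statsmodels MNLogit interface; the three-class data-generating values are illustrative stand-ins for severity levels, not the crash data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 2000
X = sm.add_constant(rng.standard_normal((n, 2)))   # intercept + two predictors
B = np.array([[0.0, 0.0, 0.0],                     # class 0 is the baseline
              [0.3, 1.0, -0.5],
              [-0.2, 0.5, 1.5]]).T                 # (3 predictors) x (3 classes)
U = X @ B
P = np.exp(U) / np.exp(U).sum(axis=1, keepdims=True)
severity = np.array([rng.choice(3, p=row) for row in P])

res = sm.MNLogit(severity, X).fit(disp=0)
print(res.summary())
print(np.exp(res.params))   # odds ratios relative to the baseline class
```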
In this paper, we consider the following semiparametric regression model under fixed design: y_i = x_i'β + g(x_i) + e_i. The estimators of β, g(·) and σ² are obtained by using the least squares and the usual nonparametric weight function methods, and their strong consistency is proved under suitable conditions.
Compositional data, such as relative information, is a crucial aspect of machine learning and other related fields. It is typically recorded as closed data, i.e., summing to a constant such as 100%. The statistical linear model is the most widely used technique for identifying hidden relationships between underlying random variables of interest. However, data quality is a significant challenge in machine learning, especially when missing data is present. The linear regression model is a commonly used statistical modeling technique applied in various settings to find relationships between variables of interest. When estimating linear regression parameters, which are useful for things like future prediction and partial effects analysis of independent variables, maximum likelihood estimation (MLE) is the method of choice. However, many datasets contain missing observations, which can lead to costly and time-consuming data recovery. To address this issue, the expectation-maximization (EM) algorithm has been suggested as a solution for situations involving missing data. The EM algorithm iteratively finds the best estimates of parameters in statistical models that depend on unobserved variables or data, in the sense of maximum likelihood or maximum a posteriori (MAP) estimation. Using the current estimate as input, the expectation (E) step constructs the expected log-likelihood function; finding the parameters that maximize this expected log-likelihood, as determined in the E step, is the job of the maximization (M) step. This study examined how well the EM algorithm performed on a simulated compositional dataset with missing observations, using both robust least squares and ordinary least squares regression techniques. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-Nearest Neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
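The E/M alternation described above can be made concrete for the Gaussian case: treat predictors and response jointly as multivariate normal, fill each row's missing entries with their conditional means (E step), update the mean and covariance with the conditional-covariance correction (M step), and read the regression coefficients off the converged moments. A self-contained numpy sketch under that normality assumption; it is not the study's compositional setup:

```python
import numpy as np

def em_mvn(Z, n_iter=200, tol=1e-8):
    """EM for a multivariate normal mean/covariance with NaN-coded missing data."""
    n, p = Z.shape
    mu = np.nanmean(Z, axis=0)
    sigma = np.diag(np.nanvar(Z, axis=0))
    for _ in range(n_iter):
        Zf, cc = Z.copy(), np.zeros((p, p))
        for i in range(n):
            m = np.isnan(Z[i])
            if not m.any():
                continue
            if not (~m).any():                  # fully missing row
                Zf[i] = mu
                cc += sigma
                continue
            o = ~m
            B = np.linalg.solve(sigma[np.ix_(o, o)], sigma[np.ix_(o, m)]).T
            Zf[i, m] = mu[m] + B @ (Z[i, o] - mu[o])      # E step: conditional mean
            cc[np.ix_(m, m)] += sigma[np.ix_(m, m)] - B @ sigma[np.ix_(o, m)]
        mu_new = Zf.mean(axis=0)
        sigma_new = (Zf - mu_new).T @ (Zf - mu_new) / n + cc / n   # M step
        done = np.max(np.abs(mu_new - mu)) < tol
        mu, sigma = mu_new, sigma_new
        if done:
            break
    return mu, sigma

rng = np.random.default_rng(7)
n = 500
X = rng.standard_normal((n, 2))
y = X @ np.array([1.5, -2.0]) + rng.standard_normal(n)
Z = np.column_stack([X, y])
Z[rng.random(Z.shape) < 0.15] = np.nan                # 15% missing at random

mu, sigma = em_mvn(Z)
beta = np.linalg.solve(sigma[:2, :2], sigma[:2, 2])   # slopes from the fitted moments
print("slopes:", beta, " intercept:", mu[2] - mu[:2] @ beta)
```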
Municipal solid waste generation is strongly linked to rising human population and expanding urban areas, with significant implications for urban metabolism as well as the redefinition of space and place values. Effective performance in municipal solid waste management underscores interdisciplinary strategies. Such knowledge and skills are paramount to uncovering the sources of waste generation as well as the means of waste storage, collection, recycling, transportation, handling/treatment, disposal, and monitoring. This study was conducted in Dar es Salaam city. Driven by a curiosity model of solid waste minimization performance at the source, study data were collected using focus group discussion (FGD) techniques with ward-level local government officers, triangulated with literature and documentary review. The main themes of the FGD were situational factors (SFA) and local government by-laws (LGBY). In the FGD sessions, sub-themes of SFA were used to understand how MSW minimization is related to the presence and effect of services such as land use planning, availability of landfills, solid waste transfer stations, material recovery facilities, incinerators, solid waste collection bins, solid waste trucks, solid waste management budgets and solid waste collection agents. Similarly, the FGD on LGBY was extended with sub-themes such as the contents of the by-law, community awareness of the by-law, and by-law enforcement mechanisms. While data preparation applied an analytical hierarchy process, data analysis applied an ordinary least squares (OLS) regression model for the sub-criteria that explain SFA and LGBY, and fed the OLS standardized residuals as variables into a geographically weighted regression with a resolution of 241 × 241 meters in ArcMap v10.5. Results showed that situational factors and local government by-laws have a strong relationship with the rate of minimizing solid waste dumping in water bodies (local R square = 0.94).
In this paper, we consider the partial linear regression model y_i = x_iβ* + g(t_i) + ε_i, i = 1, 2, ..., n, where (x_i, t_i) are known fixed design points, g(·) is an unknown function, and β* is an unknown parameter to be estimated; the random errors ε_i are (α, β)-mixing random variables. The p-th (p > 1) mean consistency, strong consistency and complete consistency of the least squares estimators of β* and g(·) are investigated under some mild conditions. In addition, a numerical simulation is carried out to study the finite sample performance of the theoretical results. Finally, a real data analysis is provided to further verify the effect of the model.
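One standard way to build least squares estimators in such a model is the two-step recipe: smooth out the t-dependence, estimate β* by least squares on the residuals, then smooth y - xβ̂ to estimate g. A numpy sketch of that Robinson-type construction; the kernel, bandwidth and data are illustrative, and the paper's weight functions may differ:

```python
import numpy as np

def nw_smooth(t, v, h):
    """Nadaraya-Watson smoother of v against t with a Gaussian kernel, bandwidth h."""
    K = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    return (K @ v) / K.sum(axis=1)

rng = np.random.default_rng(12)
n = 400
t = np.sort(rng.uniform(0, 1, n))
x = rng.standard_normal(n) + np.sin(2 * np.pi * t)   # x correlated with t
y = 1.5 * x + np.cos(2 * np.pi * t) + 0.3 * rng.standard_normal(n)

h = 0.05
x_til = x - nw_smooth(t, x, h)                 # partial out the t-dependence
y_til = y - nw_smooth(t, y, h)
beta = (x_til @ y_til) / (x_til @ x_til)       # least squares on the residuals
g_hat = nw_smooth(t, y - beta * x, h)          # then smooth to estimate g
print("beta_hat:", beta)                       # close to the true 1.5
```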
Statistical downscaling (SD) analyzes the relationship between a local-scale response and global-scale predictors. An SD model can be used to forecast rainfall (local-scale) using global-scale precipitation from global circulation model (GCM) output. The objectives of this research were to determine the time lag of the GCM data and to build an SD model using the PCR method with the time-lagged GCM precipitation data. The rainfall observations in Indramayu, taken from 1979 to 2007, showed patterns similar to the GCM data on the 1st through 64th grids after a time shift (time lag). The time lag was determined using the cross-correlation function. However, the GCM data of the 64 grids exhibited a multicollinearity problem. This problem was solved by principal component regression (PCR), but the PCR model resulted in heterogeneous errors. The PCR model was modified to overcome these errors by adding dummy variables, which were determined based on partial least squares regression (PLSR). The PCR model with dummy variables improved the rainfall prediction. The SD model with lag-GCM predictors was also better than the SD model without lag-GCM predictors.
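The PCR step itself is compact: center the predictors, take the leading right singular vectors, regress on the scores, and map the coefficients back to the original variables. A numpy sketch with 64 highly correlated "grid" predictors mimicking the GCM setting (simulated data, illustrative component count):

```python
import numpy as np

def pcr_fit(X, y, k):
    """Principal component regression keeping the first k components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = Xc @ Vt[:k].T                      # scores on the first k PCs
    gamma = np.linalg.lstsq(T, y - y.mean(), rcond=None)[0]
    beta = Vt[:k].T @ gamma                # back to the original predictors
    intercept = y.mean() - X.mean(axis=0) @ beta
    return beta, intercept

rng = np.random.default_rng(8)
n = 300
common = rng.standard_normal((n, 3))       # a few shared signals drive all grids
X = common @ rng.standard_normal((3, 64)) + 0.1 * rng.standard_normal((n, 64))
y = X[:, 0] - 0.5 * X[:, 10] + rng.standard_normal(n)

beta, b0 = pcr_fit(X, y, k=3)
print("fit corr:", np.corrcoef(X @ beta + b0, y)[0, 1])
```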
The objective of this work is to model statistically the ultraviolet radiation index (UV Index) in order to forecast (extrapolate) and analyze trends. The task is relevant due to increased UV flux and the high rate of cases of non-melanoma skin cancer in the northeast of Brazil. The methodology utilized an Autoregressive Distributed Lag (ADL) model, or Dynamic Linear Regression model. The monthly UV Index data were measured on the east coast of the Brazilian Northeast (city of Natal, Rio Grande do Norte). Total Ozone, the single explanatory variable of the model, was obtained from the TOMS and OMI/AURA instruments. The Predictive Mean Matching (PMM) method was used to complete the missing UV Index data. The mean squared error (MSE) between the observed UV Index and the data interpolated by the model was 0.36, and for extrapolation it was 0.30, with correlations of 0.90 and 0.91, respectively. The forecast/extrapolation performed by the model for a climatological period (2012-2042) indicated a trend of increasing UV (seasonal Mann-Kendall test: τ = 0.955, p-value 0.001) if Total Ozone continues its decreasing tendency. In those circumstances, the model indicated an increase of almost one unit of UV Index by the year 2042.
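An ADL model of this shape reduces to ordinary least squares on lagged columns. A hedged sketch of an ADL(1,1), y_t = a0 + a1 y_{t-1} + b0 x_t + b1 x_{t-1} + e_t, on simulated monthly-style series (the coefficients and series are illustrative, not the UV/ozone data):

```python
import numpy as np

def adl_11(y, x):
    """Fit an ADL(1,1) by OLS: y_t = a0 + a1*y_{t-1} + b0*x_t + b1*x_{t-1} + e_t."""
    Y = y[1:]
    Z = np.column_stack([np.ones(len(Y)), y[:-1], x[1:], x[:-1]])
    coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return coef

rng = np.random.default_rng(9)
T = 360                                              # e.g. 30 years of monthly data
x = 300 + np.cumsum(rng.standard_normal(T))          # slowly drifting driver
y = np.empty(T)
y[0] = 8.0
for t in range(1, T):
    y[t] = 2.0 + 0.6 * y[t-1] - 0.004 * x[t] + 0.2 * rng.standard_normal()

coef = adl_11(y, x)
print("a0, a1, b0, b1 =", coef)

x_next = x[-1]                                       # assume the driver persists
print("one-step forecast:", coef @ np.array([1.0, y[-1], x_next, x[-1]]))
```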
Medical research data are often skewed and heteroscedastic. It has therefore become practice to log-transform data in regression analysis in order to stabilize the variance. Regression analysis on log-transformed data estimates the relative effect, whereas it is often the absolute effect of a predictor that is of interest. We propose a maximum likelihood (ML)-based approach to estimate a linear regression model on log-normal, heteroscedastic data. The new method was evaluated with a large simulation study. Log-normal observations were generated according to the simulation models, and parameters were estimated using the new ML method, ordinary least-squares regression (LS) and weighted least-squares regression (WLS). All three methods produced unbiased estimates of parameters and expected response, and ML and WLS yielded smaller standard errors than LS. The approximate normality of the Wald statistic, used for tests of the ML estimates, produced correct type I error risk in most situations. Only ML and WLS produced correct confidence intervals for the estimated expected value. ML had the highest power for tests regarding β1.
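To make the distinction concrete: if y is log-normal with E[y] = Xβ, the absolute effect β can be estimated by maximizing the log-normal likelihood directly rather than regressing log y. The following is a sketch under an assumed single common σ² for brevity, whereas the paper treats heteroscedastic variances; the objective drops the additive constant of the log-likelihood, and the parameterization μ = log(Xβ) - σ²/2 (so that E[y] = Xβ) is our illustrative choice:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(13)
n = 500
x = rng.uniform(1, 5, n)
X = np.column_stack([np.ones(n), x])
beta_true, s2 = np.array([2.0, 1.5]), 0.09
mu = np.log(X @ beta_true) - s2 / 2                 # so that E[y] = X @ beta_true
y = np.exp(mu + np.sqrt(s2) * rng.standard_normal(n))

def nll(theta):
    """Log-normal negative log-likelihood (up to a constant) with E[y] = X @ b."""
    b, logs2 = theta[:2], theta[2]
    m = X @ b
    if np.any(m <= 0):
        return np.inf
    mu = np.log(m) - np.exp(logs2) / 2
    return np.sum(np.log(y) + 0.5 * logs2
                  + (np.log(y) - mu) ** 2 / (2 * np.exp(logs2)))

res = minimize(nll, x0=np.array([1.0, 1.0, np.log(0.1)]), method="Nelder-Mead")
print("beta_hat:", res.x[:2], " sigma2_hat:", np.exp(res.x[2]))
```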
We construct a fuzzy varying-coefficient bilinear regression model to deal with interval financial data and then adopt the least-squares method based on symmetric fuzzy number space. Firstly, we propose a varying-coefficient model on the basis of the fuzzy bilinear regression model. Secondly, we develop the least-squares method according to the complete distance between fuzzy numbers to estimate the coefficients, and test the adaptability of the proposed model by means of a generalized likelihood ratio test with the SSE composite index. Finally, mean square errors and mean absolute errors are employed to evaluate and compare the fitting of the fuzzy autoregression, fuzzy bilinear regression and fuzzy varying-coefficient bilinear regression models, as well as the forecasting performance of the three models. Empirical analysis shows that the proposed model has good fitting and forecasting accuracy relative to the other regression models for the capital market.
We study parameter estimation in a nonlinear regression model with a general error structure; strong consistency and the strong consistency rate of the least squares estimator are obtained.
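The least squares estimator in question is the minimizer of the sum of squared residuals of a nonlinear mean function. A short scipy sketch on a standard exponential-decay example (the model and data are illustrative, not the paper's setting):

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(10)
t = np.linspace(0, 4, 200)
y = 3.0 * np.exp(-1.2 * t) + 0.1 * rng.standard_normal(t.size)

def resid(theta):
    a, b = theta
    return y - a * np.exp(-b * t)        # residuals of the nonlinear model

fit = least_squares(resid, x0=[1.0, 1.0])
print(fit.x)   # close to (3.0, 1.2); the estimator tightens as n grows
```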
Consider a repeated measurement partially linear regression model with an unknown vector parameter β_1, an unknown function g(·), and unknown heteroscedastic error variances. In order to improve the semiparametric generalized least squares estimator (SGLSE) of β_1, we propose an iterative weighted semiparametric least squares estimator (IWSLSE) and show that it improves upon the SGLSE in terms of the asymptotic covariance matrix. An adaptive procedure is given to determine the number of iterations. We also show that when the number of replicates is less than or equal to two, the IWSLSE cannot improve upon the SGLSE. These results are generalizations of those in [2] to the case of semiparametric regressions.
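The weighting loop at the heart of such an estimator can be sketched in a purely linear setting, dropping the nonparametric g(·) to isolate the idea: fit, estimate per-design-point variances from the replicated residuals, reweight, and repeat a few times. Simulated data, fixed iteration count; the paper's adaptive stopping rule is not reproduced:

```python
import numpy as np

rng = np.random.default_rng(11)
n, m = 100, 3                              # n design points, m replicates each
x = rng.standard_normal((n, 2))
sig2 = 0.2 + x[:, 0] ** 2                  # heteroscedastic error variances
X = np.repeat(x, m, axis=0)
y = X @ np.array([1.0, -0.5]) + np.sqrt(np.repeat(sig2, m)) * rng.standard_normal(n * m)

w = np.ones(n * m)
for _ in range(3):                         # a few weighting iterations
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    r2 = (y - X @ beta) ** 2
    v = r2.reshape(n, m).mean(axis=1)      # variance estimate per design point
    w = np.repeat(1.0 / np.maximum(v, 1e-8), m)
print(beta)                                # close to (1.0, -0.5), with smaller variance
```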