In this paper we consider the empirical Bayes (EB) estimation problem for estimable function of regression coefficient in a multiple linear regression model Y=Xβ+e. where e with given β has a multivariate standard n...In this paper we consider the empirical Bayes (EB) estimation problem for estimable function of regression coefficient in a multiple linear regression model Y=Xβ+e. where e with given β has a multivariate standard normal distribution. We get the EB estimators by using kernel estimation of multivariate density function and its first order partial derivatives. It is shown that the convergence rates of the EB estimators are under the condition where an integer k > 1 . is an arbitrary small number and m is the dimension of the vector Y.展开更多
Cost effective sampling design is a major concern in some experiments especially when the measurement of the characteristic of interest is costly or painful or time consuming.Ranked set sampling(RSS)was first proposed...Cost effective sampling design is a major concern in some experiments especially when the measurement of the characteristic of interest is costly or painful or time consuming.Ranked set sampling(RSS)was first proposed by McIntyre[1952.A method for unbiased selective sampling,using ranked sets.Australian Journal of Agricultural Research 3,385-390]as an effective way to estimate the pasture mean.In the current paper,a modification of ranked set sampling called moving extremes ranked set sampling(MERSS)is considered for the best linear unbiased estimators(BLUEs)for the simple linear regression model.The BLUEs for this model under MERSS are derived.The BLUEs under MERSS are shown to be markedly more efficient for normal data when compared with the BLUEs under simple random sampling.展开更多
In this paper, based on the theory of parameter estimation, we give a selection method and, in a sense of a good character of the parameter estimation, we think that it is very reasonable. Moreover, we offer a calcula...In this paper, based on the theory of parameter estimation, we give a selection method and, in a sense of a good character of the parameter estimation, we think that it is very reasonable. Moreover, we offer a calculation method of selection statistic and an applied example.展开更多
The construction method of background value is improved in the original multi-variable grey model (MGM(1,m)) from its source of construction errors. The MGM(1,m) with optimized background value is used to elimin...The construction method of background value is improved in the original multi-variable grey model (MGM(1,m)) from its source of construction errors. The MGM(1,m) with optimized background value is used to eliminate the random fluctuations or errors of the observational data of all variables, and the combined prediction model together with the multiple linear regression is established in order to improve the simulation and prediction accuracy of the combined model. Finally, a combined model of the MGM(1,2) with optimized background value and the binary linear regression is constructed by an example. The results show that the model has good effects for simulation and prediction.展开更多
Observed rainfall is a very essential parameter for the analysis of rainfall,day to day weather forecast and its validation.The observed rainfall data is only available from five observatories of IMD;while no rainfall...Observed rainfall is a very essential parameter for the analysis of rainfall,day to day weather forecast and its validation.The observed rainfall data is only available from five observatories of IMD;while no rainfall data is available at various important locations in and around Delhi-NCR.However,the 24-hour rainfall data observed by Doppler Weather Radar(DWR)for entire Delhi and surrounding region(up to 150 km)is readily available in a pictorial form.In this paper,efforts have been made to derive/estimate the rainfall at desired locations using DWR hydrological products.Firstly,the rainfall at desired locations has been estimated from the precipitation accumulation product(PAC)of the DWR using image processing in Python language.After this,a linear regression model using the least square method has been developed in R language.Estimated and observed rainfall data of year 2018(July,August and September)was used to train the model.After this,the model was tested on rainfall data of year 2019(July,August and September)and validated.With the use of linear regression model,the error in mean rainfall estimation reduced by 46.58% and the error in max rainfall estimation reduced by 84.53% for the year 2019.The error in mean rainfall estimation reduced by 81.36% and the error in max rainfall estimation reduced by 33.81%for the year 2018.Thus,the rainfall can be estimated with a fair degree of accuracy at desired locations within the range of the Doppler Weather Radar using the radar rainfall products and the developed linear regression model.展开更多
Estimation methods have over the years been a problem for Statistician especially in sectors that have to do with Hidden/Hard-to-Reach population. In this paper, a regression model was derived for Elusive/Hard-to-Reac...Estimation methods have over the years been a problem for Statistician especially in sectors that have to do with Hidden/Hard-to-Reach population. In this paper, a regression model was derived for Elusive/Hard-to-Reach/Hidden populations. This was achieved by modelling the Multiplicity Estimator given by Birnbaum and Sirken (1965) into a regression model. The paper also gave the least-squares estimation of the unknown parameters β0 and β1, and σ2.展开更多
This paper selects seven indicators of financial revenue and housing sales price in recent 19 years in China,and uses SPSS and Excel to carry out descriptive statistics,independent sample t-test,correlation analysis a...This paper selects seven indicators of financial revenue and housing sales price in recent 19 years in China,and uses SPSS and Excel to carry out descriptive statistics,independent sample t-test,correlation analysis and regression analysis to comprehensively study the correlation between financial revenue and housing sales price in China,and establishes the relationship between financial revenue and housing sales price When the average selling price of commercial housing increases by one unit,the fiscal revenue will increase by 27.855 points.展开更多
Multiple linear regression (MLR) method was applied to quantify the effects of the net heat flux (NHF), the net freshwater flux (NFF) and the wind stress on the mixed layer depth (MLD) of the South China Sea ...Multiple linear regression (MLR) method was applied to quantify the effects of the net heat flux (NHF), the net freshwater flux (NFF) and the wind stress on the mixed layer depth (MLD) of the South China Sea (SCS) based on the simple ocean data assimilation (SODA) dataset. The spatio-temporal distributions of the MLD, the buoyancy flux (combining the NHF and the NFF) and the wind stress of the SCS were presented. Then using an oceanic vertical mixing model, the MLD after a certain time under the same initial conditions but various pairs of boundary conditions (the three factors) was simulated. Applying the MLR method to the results, regression equations which modeling the relationship between the simulated MLD and the three factors were calculated. The equations indicate that when the NHF was negative, it was the primary driver of the mixed layer deepening; and when the NHF was positive, the wind stress played a more important role than that of the NHF while the NFF had the least effect. When the NHF was positive, the relative quantitative effects of the wind stress, the NHF, and the NFF were about i0, 6 and 2. The above conclusions were applied to explaining the spatio-temporal distributions of the MLD in the SCS and thus proved to be valid.展开更多
The purpose of this study was to determine a suitable model for investigating the effects of climate factors on the area burned by forest fire in the Tahe forest region, Daxing'an Mountains, in northeast China. The r...The purpose of this study was to determine a suitable model for investigating the effects of climate factors on the area burned by forest fire in the Tahe forest region, Daxing'an Mountains, in northeast China. The response variables were the area burned by lightning- caused fire, human-caused fire, and total burned area. The predictor variables were nine climate variables collected from the local weather station. Three regression models were utilized, including multiple linear regression, log- linear model (log-transformation on both response and predictor variables), and gamma-generalized linear model. The goodness-of-fit of the models were compared based on model fitting statistics such as R2, AIC, and RMSE. The results revealed that the gamma-generalized linear model was generally superior to both multiple linear regressionmodel and log-linear model for fitting the fire data. Further, the best models were selected based on the criteria that the climate variables were statistically significant at at = 0.05. The gamma best models indicated that maximum wind speed, precipitation, and days that rainfall greater than 0.1 mm had significant impacts on the area burned by the lightning-caused fire, while the mean temperature and minimum relative humidity were the .main drivers of the burned area caused by human activities. Overall, the total burned area by forest fire was significantly influenced by days that rainfall greater than 0.1 mm and minimum rela- tive humidity, indicating that the moisture condition of forest stands determine the burned area by forest fire.展开更多
Rivers are important systems which provide water to fulfill human needs. However, excessive human uses over the years have led to deterioration in quality of river causing, causing health problems from contaminated wa...Rivers are important systems which provide water to fulfill human needs. However, excessive human uses over the years have led to deterioration in quality of river causing, causing health problems from contaminated water. This study focuses on the application of statistical techniques, Multiple Linear Regression model and MANOVA to assess health impacts due to pollution in Cauvery river stretch in Srirangapatna. In this study, using Multiple Linear Regression, it is found that health impact level is 60.8% dependent on water quality parameters of BOD, COD, TDS, TC and FC. The t-statistics and their associated 2-tailed p-values indicate that COD and TDS produces health impacts compared to BOD, TC and FC, when their effects are put together across all the six sampling stations in Srirangapatna. Further Pearson correlation Matrix shows highly significant positive correlation amongst parameters across all stations indicating possibility of common sources of origin that might be anthropogenic. Also graphs are plotted for individual parameters across all stations and it reveals that COD and TDS values are significant across all sampling stations, though their values are higher in impact stations, causing health impacts.展开更多
This paper transforms fuzzy number into clear number using the centroid method, thus we can research the traditional linear regression model which is transformed from the fuzzy linear regression model. The model’s in...This paper transforms fuzzy number into clear number using the centroid method, thus we can research the traditional linear regression model which is transformed from the fuzzy linear regression model. The model’s input and output are fuzzy numbers, and the regression coefficients are clear numbers. This paper considers the parameter estimation and impact analysis based on data deletion. Through the study of example and comparison with other models, it can be concluded that the model in this paper is applied easily and better.展开更多
This paper uses a grouping-adjusting procedure to the data from a median linear regression model, and estimtes the regression coefficients by the method of weighted least squares. This method simplifies computation an...This paper uses a grouping-adjusting procedure to the data from a median linear regression model, and estimtes the regression coefficients by the method of weighted least squares. This method simplifies computation and in the meantime, preserves the same asymptotic normal distribution for the estimator, as in the ordinary minimum L_1-norm estimates.展开更多
In oil and gas exploration,elucidating the complex interdependencies among geological variables is paramount.Our study introduces the application of sophisticated regression analysis method at the forefront,aiming not...In oil and gas exploration,elucidating the complex interdependencies among geological variables is paramount.Our study introduces the application of sophisticated regression analysis method at the forefront,aiming not just at predicting geophysical logging curve values but also innovatively mitigate hydrocarbon depletion observed in geochemical logging.Through a rigorous assessment,we explore the efficacy of eight regression models,bifurcated into linear and nonlinear groups,to accommodate the multifaceted nature of geological datasets.Our linear model suite encompasses the Standard Equation,Ridge Regression,Least Absolute Shrinkage and Selection Operator,and Elastic Net,each presenting distinct advantages.The Standard Equation serves as a foundational benchmark,whereas Ridge Regression implements penalty terms to counteract overfitting,thus bolstering model robustness in the presence of multicollinearity.The Least Absolute Shrinkage and Selection Operator for variable selection functions to streamline models,enhancing their interpretability,while Elastic Net amalgamates the merits of Ridge Regression and Least Absolute Shrinkage and Selection Operator,offering a harmonized solution to model complexity and comprehensibility.On the nonlinear front,Gradient Descent,Kernel Ridge Regression,Support Vector Regression,and Piecewise Function-Fitting methods introduce innovative approaches.Gradient Descent assures computational efficiency in optimizing solutions,Kernel Ridge Regression leverages the kernel trick to navigate nonlinear patterns,and Support Vector Regression is proficient in forecasting extremities,pivotal for exploration risk assessment.The Piecewise Function-Fitting approach,tailored for geological data,facilitates adaptable modeling of variable interrelations,accommodating abrupt data trend shifts.Our analysis identifies Ridge Regression,particularly when augmented by Piecewise Function-Fitting,as superior in recouping hydrocarbon losses,and underscoring its utility in resource quantification refinement.Meanwhile,Kernel Ridge Regression emerges as a noteworthy strategy in ameliorating porosity-logging curve prediction for well A,evidencing its aptness for intricate geological structures.This research attests to the scientific ascendancy and broad-spectrum relevance of these regression techniques over conventional methods while heralding new horizons for their deployment in the oil and gas sector.The insights garnered from these advanced modeling strategies are set to transform geological and engineering practices in hydrocarbon prediction,evaluation,and recovery.展开更多
Recursive algorithms are very useful for computing M-estimators of regression coefficients and scatter parameters. In this article, it is shown that for a nondecreasing ul (t), under some mild conditions the recursi...Recursive algorithms are very useful for computing M-estimators of regression coefficients and scatter parameters. In this article, it is shown that for a nondecreasing ul (t), under some mild conditions the recursive M-estimators of regression coefficients and scatter parameters are strongly consistent and the recursive M-estimator of the regression coefficients is also asymptotically normal distributed. Furthermore, optimal recursive M-estimators, asymptotic efficiencies of recursive M-estimators and asymptotic relative efficiencies between recursive M-estimators of regression coefficients are studied.展开更多
Recurrent event gap times data frequently arise in biomedical studies and often more than one type of event is of interest. To evaluate the effects of covariates on the marginal recurrent event hazards functions, ther...Recurrent event gap times data frequently arise in biomedical studies and often more than one type of event is of interest. To evaluate the effects of covariates on the marginal recurrent event hazards functions, there exist two types of hazards models: the multiplicative hazards model and the additive hazards model. In the paper, we propose a more flexible additive-multiplicative hazards model for multiple type of recurrent gap times data, wherein some covariates are assumed to be additive while others are multiplicative. An estimating equation approach is presented to estimate the regression parameters. We establish asymptotic properties of the proposed estimators.展开更多
Medical research data are often skewed and heteroscedastic. It has therefore become practice to log-transform data in regression analysis, in order to stabilize the variance. Regression analysis on log-transformed dat...Medical research data are often skewed and heteroscedastic. It has therefore become practice to log-transform data in regression analysis, in order to stabilize the variance. Regression analysis on log-transformed data estimates the relative effect, whereas it is often the absolute effect of a predictor that is of interest. We propose a maximum likelihood (ML)-based approach to estimate a linear regression model on log-normal, heteroscedastic data. The new method was evaluated with a large simulation study. Log-normal observations were generated according to the simulation models and parameters were estimated using the new ML method, ordinary least-squares regression (LS) and weighed least-squares regression (WLS). All three methods produced unbiased estimates of parameters and expected response, and ML and WLS yielded smaller standard errors than LS. The approximate normality of the Wald statistic, used for tests of the ML estimates, in most situations produced correct type I error risk. Only ML and WLS produced correct confidence intervals for the estimated expected value. ML had the highest power for tests regarding β1.展开更多
The development of many estimators of parameters of linear regression model is traceable to non-validity of the assumptions under which the model is formulated, especially when applied to real life situation. This not...The development of many estimators of parameters of linear regression model is traceable to non-validity of the assumptions under which the model is formulated, especially when applied to real life situation. This notwithstanding, regression analysis may aim at prediction. Consequently, this paper examines the performances of the Ordinary Least Square (OLS) estimator, Cochrane-Orcutt (COR) estimator, Maximum Likelihood (ML) estimator and the estimators based on Principal Component (PC) analysis in prediction of linear regression model under the joint violations of the assumption of non-stochastic regressors, independent regressors and error terms. With correlated stochastic normal variables as regressors and autocorrelated error terms, Monte-Carlo experiments were conducted and the study further identifies the best estimator that can be used for prediction purpose by adopting the goodness of fit statistics of the estimators. From the results, it is observed that the performances of COR at each level of correlation (multicollinearity) and that of ML, especially when the sample size is large, over the levels of autocorrelation have a convex-like pattern while that of OLS and PC are concave-like. Also, as the levels of multicollinearity increase, the estimators, except the PC estimators when multicollinearity is negative, rapidly perform better over the levels autocorrelation. The COR and ML estimators are generally best for prediction in the presence of multicollinearity and autocorrelated error terms. However, at low levels of autocorrelation, the OLS estimator is either best or competes consistently with the best estimator, while the PC estimator is either best or competes with the best when multicollinearity level is high(λ>0.8 or λ-0.49).展开更多
In this paper we consider a linear regression model with fixed design. A new rule for the selection of a relevant submodel is introduced on the basis of parameter tests. One particular feature of the rule is that subj...In this paper we consider a linear regression model with fixed design. A new rule for the selection of a relevant submodel is introduced on the basis of parameter tests. One particular feature of the rule is that subjective grading of the model complexity can be incorporated. We provide bounds for the mis-selection error. Simulations show that by using the proposed selection rule, the mis-selection error can be controlled uniformly.展开更多
The aim of this paper is to propose some diagnostic methods in stochastic restricted linear regression models. A review of stochastic restricted linear regression models is given. For the model, this paper studies the...The aim of this paper is to propose some diagnostic methods in stochastic restricted linear regression models. A review of stochastic restricted linear regression models is given. For the model, this paper studies the method and application of the diagnostic mostly. Firstly, review the estimators of this model. Secondly, show that the case deletion model is equivalent to the mean shift outlier model for diagnostic purpose. Then, some diagnostic statistics are given. At last, example is given to illustrate our results.展开更多
This paper considers the approaches and methods for reducing the influence of multi-collinearity. Great attention is paid to the question of using shrinkage estimators for this purpose. Two classes of regression model...This paper considers the approaches and methods for reducing the influence of multi-collinearity. Great attention is paid to the question of using shrinkage estimators for this purpose. Two classes of regression models are investigated, the first of which corresponds to systems with a negative feedback, while the second class presents systems without the feedback. In the first case the use of shrinkage estimators, especially the Principal Component estimator, is inappropriate but is possible in the second case with the right choice of the regularization parameter or of the number of principal components included in the regression model. This fact is substantiated by the study of the distribution of the random variable , where b is the LS estimate and β is the true coefficient, since the form of this distribution is the basic characteristic of the specified classes. For this study, a regression approximation of the distribution of the event based on the Edgeworth series was developed. Also, alternative approaches are examined to resolve the multicollinearity issue, including an application of the known Inequality Constrained Least Squares method and the Dual estimator method proposed by the author. It is shown that with a priori information the Euclidean distance between the estimates and the true coefficients can be significantly reduced.展开更多
文摘In this paper we consider the empirical Bayes (EB) estimation problem for estimable function of regression coefficient in a multiple linear regression model Y=Xβ+e. where e with given β has a multivariate standard normal distribution. We get the EB estimators by using kernel estimation of multivariate density function and its first order partial derivatives. It is shown that the convergence rates of the EB estimators are under the condition where an integer k > 1 . is an arbitrary small number and m is the dimension of the vector Y.
基金Supported by the National Natural Science Foundation of China(11901236)the Scientific Research Fund of Hunan Provincial Science and Technology Department(2019JJ50479)+3 种基金the Scientific Research Fund of Hunan Provincial Education Department(18B322)the Winning Bid Project of Hunan Province for the 4th National Economic Census([2020]1)the Young Core Teacher Foundation of Hunan Province([2020]43)the Funda-mental Research Fund of Xiangxi Autonomous Prefecture(2018SF5026)。
文摘Cost effective sampling design is a major concern in some experiments especially when the measurement of the characteristic of interest is costly or painful or time consuming.Ranked set sampling(RSS)was first proposed by McIntyre[1952.A method for unbiased selective sampling,using ranked sets.Australian Journal of Agricultural Research 3,385-390]as an effective way to estimate the pasture mean.In the current paper,a modification of ranked set sampling called moving extremes ranked set sampling(MERSS)is considered for the best linear unbiased estimators(BLUEs)for the simple linear regression model.The BLUEs for this model under MERSS are derived.The BLUEs under MERSS are shown to be markedly more efficient for normal data when compared with the BLUEs under simple random sampling.
基金Supported by the Natural Science Foundation of Anhui Education Committee
文摘In this paper, based on the theory of parameter estimation, we give a selection method and, in a sense of a good character of the parameter estimation, we think that it is very reasonable. Moreover, we offer a calculation method of selection statistic and an applied example.
基金supported by the National Natural Science Foundation of China(71071077)the Ministry of Education Key Project of National Educational Science Planning(DFA090215)+1 种基金China Postdoctoral Science Foundation(20100481137)Funding of Jiangsu Innovation Program for Graduate Education(CXZZ11-0226)
文摘The construction method of background value is improved in the original multi-variable grey model (MGM(1,m)) from its source of construction errors. The MGM(1,m) with optimized background value is used to eliminate the random fluctuations or errors of the observational data of all variables, and the combined prediction model together with the multiple linear regression is established in order to improve the simulation and prediction accuracy of the combined model. Finally, a combined model of the MGM(1,2) with optimized background value and the binary linear regression is constructed by an example. The results show that the model has good effects for simulation and prediction.
文摘Observed rainfall is a very essential parameter for the analysis of rainfall,day to day weather forecast and its validation.The observed rainfall data is only available from five observatories of IMD;while no rainfall data is available at various important locations in and around Delhi-NCR.However,the 24-hour rainfall data observed by Doppler Weather Radar(DWR)for entire Delhi and surrounding region(up to 150 km)is readily available in a pictorial form.In this paper,efforts have been made to derive/estimate the rainfall at desired locations using DWR hydrological products.Firstly,the rainfall at desired locations has been estimated from the precipitation accumulation product(PAC)of the DWR using image processing in Python language.After this,a linear regression model using the least square method has been developed in R language.Estimated and observed rainfall data of year 2018(July,August and September)was used to train the model.After this,the model was tested on rainfall data of year 2019(July,August and September)and validated.With the use of linear regression model,the error in mean rainfall estimation reduced by 46.58% and the error in max rainfall estimation reduced by 84.53% for the year 2019.The error in mean rainfall estimation reduced by 81.36% and the error in max rainfall estimation reduced by 33.81%for the year 2018.Thus,the rainfall can be estimated with a fair degree of accuracy at desired locations within the range of the Doppler Weather Radar using the radar rainfall products and the developed linear regression model.
文摘Estimation methods have over the years been a problem for Statistician especially in sectors that have to do with Hidden/Hard-to-Reach population. In this paper, a regression model was derived for Elusive/Hard-to-Reach/Hidden populations. This was achieved by modelling the Multiplicity Estimator given by Birnbaum and Sirken (1965) into a regression model. The paper also gave the least-squares estimation of the unknown parameters β0 and β1, and σ2.
基金Thank you for your valuable comments and suggestions.This research was supported by Yunnan applied basic research project(NO.2017FD150)Chuxiong Normal University General Research Project(NO.XJYB2001).
文摘This paper selects seven indicators of financial revenue and housing sales price in recent 19 years in China,and uses SPSS and Excel to carry out descriptive statistics,independent sample t-test,correlation analysis and regression analysis to comprehensively study the correlation between financial revenue and housing sales price in China,and establishes the relationship between financial revenue and housing sales price When the average selling price of commercial housing increases by one unit,the fiscal revenue will increase by 27.855 points.
基金The National Natural Science Foundation of China under contract No.11174235the Science and Technology Development Project of Shaanxi Province of China under contract No.2010KJXX-02+2 种基金the Program for New Century Excellent Talents in University of China under contract No. NCET-08-0455the Science and Technology Innovation Foundation of Northwestern Polytechnical University of Chinathe Doctorate Foundation of Northwestern Polytechnical University of China under contract No.CX201226.
文摘Multiple linear regression (MLR) method was applied to quantify the effects of the net heat flux (NHF), the net freshwater flux (NFF) and the wind stress on the mixed layer depth (MLD) of the South China Sea (SCS) based on the simple ocean data assimilation (SODA) dataset. The spatio-temporal distributions of the MLD, the buoyancy flux (combining the NHF and the NFF) and the wind stress of the SCS were presented. Then using an oceanic vertical mixing model, the MLD after a certain time under the same initial conditions but various pairs of boundary conditions (the three factors) was simulated. Applying the MLR method to the results, regression equations which modeling the relationship between the simulated MLD and the three factors were calculated. The equations indicate that when the NHF was negative, it was the primary driver of the mixed layer deepening; and when the NHF was positive, the wind stress played a more important role than that of the NHF while the NFF had the least effect. When the NHF was positive, the relative quantitative effects of the wind stress, the NHF, and the NFF were about i0, 6 and 2. The above conclusions were applied to explaining the spatio-temporal distributions of the MLD in the SCS and thus proved to be valid.
基金funded by Asia-Pacific Forests Net(APFNET/2010/FPF/001)National Natural Science Foundation of China(Grant No.31400552)Forestry industry research special funds for public welfare projects(201404402)
文摘The purpose of this study was to determine a suitable model for investigating the effects of climate factors on the area burned by forest fire in the Tahe forest region, Daxing'an Mountains, in northeast China. The response variables were the area burned by lightning- caused fire, human-caused fire, and total burned area. The predictor variables were nine climate variables collected from the local weather station. Three regression models were utilized, including multiple linear regression, log- linear model (log-transformation on both response and predictor variables), and gamma-generalized linear model. The goodness-of-fit of the models were compared based on model fitting statistics such as R2, AIC, and RMSE. The results revealed that the gamma-generalized linear model was generally superior to both multiple linear regressionmodel and log-linear model for fitting the fire data. Further, the best models were selected based on the criteria that the climate variables were statistically significant at at = 0.05. The gamma best models indicated that maximum wind speed, precipitation, and days that rainfall greater than 0.1 mm had significant impacts on the area burned by the lightning-caused fire, while the mean temperature and minimum relative humidity were the .main drivers of the burned area caused by human activities. Overall, the total burned area by forest fire was significantly influenced by days that rainfall greater than 0.1 mm and minimum rela- tive humidity, indicating that the moisture condition of forest stands determine the burned area by forest fire.
文摘Rivers are important systems which provide water to fulfill human needs. However, excessive human uses over the years have led to deterioration in quality of river causing, causing health problems from contaminated water. This study focuses on the application of statistical techniques, Multiple Linear Regression model and MANOVA to assess health impacts due to pollution in Cauvery river stretch in Srirangapatna. In this study, using Multiple Linear Regression, it is found that health impact level is 60.8% dependent on water quality parameters of BOD, COD, TDS, TC and FC. The t-statistics and their associated 2-tailed p-values indicate that COD and TDS produces health impacts compared to BOD, TC and FC, when their effects are put together across all the six sampling stations in Srirangapatna. Further Pearson correlation Matrix shows highly significant positive correlation amongst parameters across all stations indicating possibility of common sources of origin that might be anthropogenic. Also graphs are plotted for individual parameters across all stations and it reveals that COD and TDS values are significant across all sampling stations, though their values are higher in impact stations, causing health impacts.
文摘This paper transforms fuzzy number into clear number using the centroid method, thus we can research the traditional linear regression model which is transformed from the fuzzy linear regression model. The model’s input and output are fuzzy numbers, and the regression coefficients are clear numbers. This paper considers the parameter estimation and impact analysis based on data deletion. Through the study of example and comparison with other models, it can be concluded that the model in this paper is applied easily and better.
基金Research supported By AFOSC, USA, under Contract F49620-85-0008oy NNSFC of China.
文摘This paper uses a grouping-adjusting procedure to the data from a median linear regression model, and estimtes the regression coefficients by the method of weighted least squares. This method simplifies computation and in the meantime, preserves the same asymptotic normal distribution for the estimator, as in the ordinary minimum L_1-norm estimates.
文摘In oil and gas exploration,elucidating the complex interdependencies among geological variables is paramount.Our study introduces the application of sophisticated regression analysis method at the forefront,aiming not just at predicting geophysical logging curve values but also innovatively mitigate hydrocarbon depletion observed in geochemical logging.Through a rigorous assessment,we explore the efficacy of eight regression models,bifurcated into linear and nonlinear groups,to accommodate the multifaceted nature of geological datasets.Our linear model suite encompasses the Standard Equation,Ridge Regression,Least Absolute Shrinkage and Selection Operator,and Elastic Net,each presenting distinct advantages.The Standard Equation serves as a foundational benchmark,whereas Ridge Regression implements penalty terms to counteract overfitting,thus bolstering model robustness in the presence of multicollinearity.The Least Absolute Shrinkage and Selection Operator for variable selection functions to streamline models,enhancing their interpretability,while Elastic Net amalgamates the merits of Ridge Regression and Least Absolute Shrinkage and Selection Operator,offering a harmonized solution to model complexity and comprehensibility.On the nonlinear front,Gradient Descent,Kernel Ridge Regression,Support Vector Regression,and Piecewise Function-Fitting methods introduce innovative approaches.Gradient Descent assures computational efficiency in optimizing solutions,Kernel Ridge Regression leverages the kernel trick to navigate nonlinear patterns,and Support Vector Regression is proficient in forecasting extremities,pivotal for exploration risk assessment.The Piecewise Function-Fitting approach,tailored for geological data,facilitates adaptable modeling of variable interrelations,accommodating abrupt data trend shifts.Our analysis identifies Ridge Regression,particularly when augmented by Piecewise Function-Fitting,as superior in recouping hydrocarbon losses,and underscoring its utility in resource quantification refinement.Meanwhile,Kernel Ridge Regression emerges as a noteworthy strategy in ameliorating porosity-logging curve prediction for well A,evidencing its aptness for intricate geological structures.This research attests to the scientific ascendancy and broad-spectrum relevance of these regression techniques over conventional methods while heralding new horizons for their deployment in the oil and gas sector.The insights garnered from these advanced modeling strategies are set to transform geological and engineering practices in hydrocarbon prediction,evaluation,and recovery.
基金supported by the Natural Sciences and Engineering Research Council of Canadathe National Natural Science Foundation of China+2 种基金the Doctorial Fund of Education Ministry of Chinasupported by the Natural Sciences and Engineering Research Council of Canadasupported by the National Natural Science Foundation of China
文摘Recursive algorithms are very useful for computing M-estimators of regression coefficients and scatter parameters. In this article, it is shown that for a nondecreasing ul (t), under some mild conditions the recursive M-estimators of regression coefficients and scatter parameters are strongly consistent and the recursive M-estimator of the regression coefficients is also asymptotically normal distributed. Furthermore, optimal recursive M-estimators, asymptotic efficiencies of recursive M-estimators and asymptotic relative efficiencies between recursive M-estimators of regression coefficients are studied.
基金The Science Foundation(JA12301)of Fujian Educational Committeethe Teaching Quality Project(ZL0902/TZ(SJ))of Higher Education in Fujian Provincial Education Department
文摘Recurrent event gap times data frequently arise in biomedical studies and often more than one type of event is of interest. To evaluate the effects of covariates on the marginal recurrent event hazards functions, there exist two types of hazards models: the multiplicative hazards model and the additive hazards model. In the paper, we propose a more flexible additive-multiplicative hazards model for multiple type of recurrent gap times data, wherein some covariates are assumed to be additive while others are multiplicative. An estimating equation approach is presented to estimate the regression parameters. We establish asymptotic properties of the proposed estimators.
文摘Medical research data are often skewed and heteroscedastic. It has therefore become practice to log-transform data in regression analysis, in order to stabilize the variance. Regression analysis on log-transformed data estimates the relative effect, whereas it is often the absolute effect of a predictor that is of interest. We propose a maximum likelihood (ML)-based approach to estimate a linear regression model on log-normal, heteroscedastic data. The new method was evaluated with a large simulation study. Log-normal observations were generated according to the simulation models and parameters were estimated using the new ML method, ordinary least-squares regression (LS) and weighed least-squares regression (WLS). All three methods produced unbiased estimates of parameters and expected response, and ML and WLS yielded smaller standard errors than LS. The approximate normality of the Wald statistic, used for tests of the ML estimates, in most situations produced correct type I error risk. Only ML and WLS produced correct confidence intervals for the estimated expected value. ML had the highest power for tests regarding β1.
文摘The development of many estimators of parameters of linear regression model is traceable to non-validity of the assumptions under which the model is formulated, especially when applied to real life situation. This notwithstanding, regression analysis may aim at prediction. Consequently, this paper examines the performances of the Ordinary Least Square (OLS) estimator, Cochrane-Orcutt (COR) estimator, Maximum Likelihood (ML) estimator and the estimators based on Principal Component (PC) analysis in prediction of linear regression model under the joint violations of the assumption of non-stochastic regressors, independent regressors and error terms. With correlated stochastic normal variables as regressors and autocorrelated error terms, Monte-Carlo experiments were conducted and the study further identifies the best estimator that can be used for prediction purpose by adopting the goodness of fit statistics of the estimators. From the results, it is observed that the performances of COR at each level of correlation (multicollinearity) and that of ML, especially when the sample size is large, over the levels of autocorrelation have a convex-like pattern while that of OLS and PC are concave-like. Also, as the levels of multicollinearity increase, the estimators, except the PC estimators when multicollinearity is negative, rapidly perform better over the levels autocorrelation. The COR and ML estimators are generally best for prediction in the presence of multicollinearity and autocorrelated error terms. However, at low levels of autocorrelation, the OLS estimator is either best or competes consistently with the best estimator, while the PC estimator is either best or competes with the best when multicollinearity level is high(λ>0.8 or λ-0.49).
文摘In this paper we consider a linear regression model with fixed design. A new rule for the selection of a relevant submodel is introduced on the basis of parameter tests. One particular feature of the rule is that subjective grading of the model complexity can be incorporated. We provide bounds for the mis-selection error. Simulations show that by using the proposed selection rule, the mis-selection error can be controlled uniformly.
文摘The aim of this paper is to propose some diagnostic methods in stochastic restricted linear regression models. A review of stochastic restricted linear regression models is given. For the model, this paper studies the method and application of the diagnostic mostly. Firstly, review the estimators of this model. Secondly, show that the case deletion model is equivalent to the mean shift outlier model for diagnostic purpose. Then, some diagnostic statistics are given. At last, example is given to illustrate our results.
文摘This paper considers the approaches and methods for reducing the influence of multi-collinearity. Great attention is paid to the question of using shrinkage estimators for this purpose. Two classes of regression models are investigated, the first of which corresponds to systems with a negative feedback, while the second class presents systems without the feedback. In the first case the use of shrinkage estimators, especially the Principal Component estimator, is inappropriate but is possible in the second case with the right choice of the regularization parameter or of the number of principal components included in the regression model. This fact is substantiated by the study of the distribution of the random variable , where b is the LS estimate and β is the true coefficient, since the form of this distribution is the basic characteristic of the specified classes. For this study, a regression approximation of the distribution of the event based on the Edgeworth series was developed. Also, alternative approaches are examined to resolve the multicollinearity issue, including an application of the known Inequality Constrained Least Squares method and the Dual estimator method proposed by the author. It is shown that with a priori information the Euclidean distance between the estimates and the true coefficients can be significantly reduced.