In oil and gas exploration, elucidating the complex interdependencies among geological variables is paramount. Our study applies advanced regression analysis methods, aiming not only at predicting geophysical logging curve values but also at mitigating the hydrocarbon depletion observed in geochemical logging. Through a rigorous assessment, we explore the efficacy of eight regression models, divided into linear and nonlinear groups, to accommodate the multifaceted nature of geological datasets. Our linear model suite encompasses the Standard Equation, Ridge Regression, Least Absolute Shrinkage and Selection Operator, and Elastic Net, each presenting distinct advantages. The Standard Equation serves as a foundational benchmark, whereas Ridge Regression adds penalty terms to counteract overfitting, thus bolstering model robustness in the presence of multicollinearity. The Least Absolute Shrinkage and Selection Operator performs variable selection to streamline models and enhance their interpretability, while Elastic Net combines the merits of Ridge Regression and the Least Absolute Shrinkage and Selection Operator, balancing model complexity and comprehensibility. On the nonlinear front, we consider Gradient Descent, Kernel Ridge Regression, Support Vector Regression, and Piecewise Function-Fitting methods. Gradient Descent offers computational efficiency in optimizing solutions, Kernel Ridge Regression leverages the kernel trick to capture nonlinear patterns, and Support Vector Regression is proficient in forecasting extreme values, which is pivotal for exploration risk assessment. The Piecewise Function-Fitting approach, tailored for geological data, allows adaptable modeling of variable interrelations and accommodates abrupt shifts in data trends. Our analysis identifies Ridge Regression, particularly when augmented by Piecewise Function-Fitting, as superior in recouping hydrocarbon losses, underscoring its utility in refining resource quantification. Meanwhile, Kernel Ridge Regression emerges as a noteworthy strategy for improving porosity-logging curve prediction for well A, evidencing its suitability for intricate geological structures. This research attests to the advantages and broad relevance of these regression techniques over conventional methods and points to new opportunities for their deployment in the oil and gas sector. The insights gained from these modeling strategies may inform geological and engineering practices in hydrocarbon prediction, evaluation, and recovery.
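The penalty mechanism attributed to Ridge Regression in the abstract above can be sketched with the textbook closed-form estimator, which solves (XᵀX + λI)β = Xᵀy. This is a minimal illustration rather than the authors' implementation: the function name `ridge_fit`, the synthetic nearly collinear data, and the penalty value λ = 1.0 are all assumptions made for the example, and the intercept is omitted for brevity.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam * I) beta = X'y.

    Hypothetical helper for illustration; no intercept term is fitted.
    """
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

# Synthetic data (an assumption): two nearly collinear predictors.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.01 * rng.normal(size=200)
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=200)

beta_ols = ridge_fit(X, y, 0.0)    # lam = 0 reduces to ordinary least squares
beta_ridge = ridge_fit(X, y, 1.0)  # lam > 0 shrinks the coefficient vector

print("OLS coefficients:  ", beta_ols)
print("Ridge coefficients:", beta_ridge)
```

With λ = 0 the function reduces to ordinary least squares; any λ > 0 shrinks the coefficient vector toward zero, which is what stabilizes the fit when predictors are strongly correlated.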
Rainfall is an important factor in estimating the event mean concentration (EMC), which is used to quantify the washed-off pollutant concentrations from non-point sources (NPSs). Pollutant loads can also be calculated using rainfall, catchment area and runoff coefficient. In this study, runoff quantity and quality data gathered from 28 months of monitoring at road and parking lot sites in Korea were evaluated using multiple linear regression (MLR) to develop equations for estimating pollutant loads and EMCs as a function of rainfall variables. The results revealed that total event rainfall and average rainfall intensity are possible predictors of pollutant loads. Overall, the models reflect the high uncertainties of NPSs; EMCs and loads might be estimated accurately by means of water quality sampling, or long-term monitoring may be needed to gather more data for the development of estimation models.
Consider a partially linear regression model with an unknown vector parameter, an unknown function g(·), and unknown heteroscedastic error variances. Chen and You [23] proposed a semiparametric generalized least squares estimator (SGLSE) for the vector parameter, which takes the heteroscedasticity into account to increase efficiency. For inference based on this SGLSE, it is necessary to construct a consistent estimator for its asymptotic covariance matrix. However, when there is within-group correlation, the traditional delta method and the delete-1 jackknife fail to provide such a consistent estimator. In this paper, a delete-group jackknife method based on deleting grouped partial residuals is examined. It is shown that the delete-group jackknife method can indeed provide a consistent estimator for the asymptotic covariance matrix in the presence of within-group correlations. This result is an extension of that in [21].
We consider semiparametric partially linear regression models with mean function X^Tβ + g(z), where X and z are functional data. New estimators of β and g(z) are presented and some asymptotic results are given. The strong convergence rates of the proposed estimators are obtained. In our estimation, the number of observations for each subject is completely flexible. A simulation study is conducted to investigate the finite sample performance of the proposed estimators.
Mathematical modeling of economic indices is a challenging topic in crop production systems. The present study aimed to model the economic indices of mechanized and semi-mechanized rainfed wheat production systems using various multiple linear regression models. The study area was Behshahr County, located in the east of Mazandaran Province, Northern Iran. The statistical population included all wheat producers in Behshahr County in the 2016/17 crop year. Five input variables were human labor, machinery, diesel fuel, chemical (chemical fertilizers and chemical pesticides) costs, and the income was considered to be the output. The results showed that the cost of wheat production in the semi-mechanized system was higher than that of the mechanized system. In both systems, the highest cost was related to the agricultural machinery input. Moreover, seed cost was lower in the mechanized system than in the semi-mechanized system. The net return indicator was 993.68 $ ha^-1 and 626.71 $ ha^-1 for the mechanized and semi-mechanized systems, respectively. The average benefit to cost ratio was 3.46 and 2.40 for the mechanized and semi-mechanized systems, respectively, demonstrating the greater profitability of the mechanized system. Five types of regression models (Cobb-Douglas, linear, 2FI, quadratic and pure-quadratic) were evaluated for the mechanized and semi-mechanized production systems. The R^2 value of the Cobb-Douglas model was higher than that of the quadratic model, while the RMSE and MAPE of the quadratic model were smaller than those of the Cobb-Douglas model. Therefore, the best model for investigating the relationship between input costs and the income of wheat production in both mechanized and semi-mechanized systems was the quadratic model.
In this paper we consider the empirical Bayes (EB) estimation problem for an estimable function of the regression coefficient in a multiple linear regression model Y = Xβ + e, where e, given β, has a multivariate standard normal distribution. We obtain the EB estimators by using kernel estimation of the multivariate density function and its first-order partial derivatives. It is shown that the convergence rates of the EB estimators are under the condition where an integer k > 1 . is an arbitrary small number and m is the dimension of the vector Y.
Stormwater runoff has been identified as a source of pollution for the environment, especially for receiving waters. In order to quantify and manage the impacts of stormwater runoff on the environment, predictive models and mathematical models have been developed. Predictive tools such as regression models have been widely used to predict stormwater discharge characteristics. Storm event characteristics, such as antecedent dry days (ADD), have been related to response variables, such as pollutant loads and concentrations. However, whether ADD should be considered an important variable in predicting stormwater discharge characteristics remains controversial among studies. In this study, we examined the accuracy of general linear regression models in predicting discharge characteristics of roadway runoff. A total of 17 storm events were monitored in two highway segments located in Gwangju, Korea. Data from the monitoring were used to calibrate the United States Environmental Protection Agency's Storm Water Management Model (SWMM). The calibrated SWMM was run for 55 storm events, and the results for total suspended solid (TSS) discharge loads and event mean concentrations (EMC) were extracted. From these data, linear regression models were developed. R^2 and p-values of the regression on ADD for both TSS loads and EMCs were investigated. Results showed that pollutant loads were better predicted than pollutant EMCs in the multiple regression models. Regression may not capture the true effect of site-specific characteristics, due to uncertainty in the data.
The paper investigates sequential monitoring of variance change in the linear regression model. The procedure is based on a detection function constructed from the CUSUM of squared residuals and a boundary function designed so that the test has a small probability of false alarm and asymptotic power one. Simulation results show that our monitoring procedure performs well when the variance change occurs shortly after the monitoring time. The method remains feasible when the regression coefficients change, or when both the variance and the regression coefficients change.
The varying-coefficient partially linear regression model is obtained by combining nonparametric and varying-coefficient regression procedures. Wong et al. (2008) proposed the model and estimated it by the local linear method. In this paper its inference is addressed. Based on these estimates, the generalized likelihood ratio test is established. Under the null hypothesis the normalized test statistic asymptotically follows a χ²-distribution, with the scale constant and the degrees of freedom being independent of the nuisance parameters. This is the Wilks phenomenon. Furthermore, its asymptotic power is also derived, which achieves the optimal rate of convergence for nonparametric hypothesis testing. A simulation and a real example are used to evaluate the performance of the testing procedures empirically.
Funding: Supported by the Natural Science Foundation of Anhui Education Committee
Abstract: In this paper, based on the theory of parameter estimation, we give a selection method and argue that it is reasonable in the sense that the resulting parameter estimates have good properties. Moreover, we offer a method for calculating the selection statistic and an applied example.
Funding: Research supported by AFOSC, USA, under Contract F49620-85-0008 and by NNSFC of China.
Abstract: This paper applies a grouping-adjusting procedure to data from a median linear regression model and estimates the regression coefficients by the method of weighted least squares. This method simplifies computation while preserving the same asymptotic normal distribution for the estimator as in the ordinary minimum L_1-norm estimates.
Abstract: This paper transforms fuzzy numbers into crisp numbers using the centroid method, so that the fuzzy linear regression model becomes a traditional linear regression model that can be studied with standard tools. The model's input and output are fuzzy numbers, and the regression coefficients are crisp numbers. This paper considers parameter estimation and influence analysis based on data deletion. A worked example and a comparison with other models suggest that the model in this paper is easier to apply and performs better.
Funding: Supported by a project of the National Natural Science Foundation of China (No. 41272360)
Abstract: Data on coal production and consumption in Jilin Province for the last ten years were collected, and the Grey System GM(1,1) model and a unary linear regression model were applied to predict the coal consumption of Jilin Province in 2014 and 2015. The predicted coal consumption of Jilin Province is 114.84 × 10^6 t for 2014 and 117.98 × 10^6 t for 2015. Analysis of the error data indicated that the prediction accuracy of the Grey System GM(1,1) model for coal consumption in Jilin Province was 0.21% better than that of the unary linear regression model.
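For readers unfamiliar with the grey model named in the abstract above, the following is a minimal sketch of the standard textbook GM(1,1) forecasting recipe: accumulate the series, fit the development coefficient a and grey input b by least squares, then difference the fitted accumulated series. It is not the authors' code. The function name `gm11_forecast` is hypothetical, and the input series is made-up data growing by 5% per step, not the Jilin coal figures.

```python
import numpy as np

def gm11_forecast(x0, steps):
    """Forecast `steps` future values with a textbook GM(1,1) model.

    Hypothetical helper for illustration only.
    """
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    x1 = np.cumsum(x0)                    # accumulated generating series
    z1 = 0.5 * (x1[1:] + x1[:-1])         # background values (adjacent means)
    B = np.column_stack([-z1, np.ones(n - 1)])
    # Least-squares fit of x0[k] = -a * z1[k] + b
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    k = np.arange(n + steps)
    # Time response of the accumulated series, then difference back
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    x0_hat = np.empty(n + steps)
    x0_hat[0] = x0[0]
    x0_hat[1:] = np.diff(x1_hat)
    return x0_hat[n:]

# Made-up series (an assumption): 5% growth per period.
series = 100.0 * 1.05 ** np.arange(6)
pred = gm11_forecast(series, steps=2)
print(pred)
```

On a smoothly growing series like this one, the forecasts stay close to the true continuation; on real data the fit depends heavily on how near-exponential the series is.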
Abstract: When the population from which the samples are drawn is not normally distributed, or when the sample size is particularly small, nonparametric statistical tests become preferable. An alternative to the normal model is the permutation or randomization model. The permutation model is nonparametric because no formal assumptions are made about the population parameters of the reference distribution, i.e., the distribution to which an obtained result is compared to determine its probability when the null hypothesis is true. Typically the reference distribution is a sampling distribution for parametric tests and a permutation distribution for many nonparametric tests. Permutation tests can be used within regression models, given their optimality properties, especially in the multivariate context and when the normal distribution of the response variables is not guaranteed. The literature contains numerous permutation tests applicable to the estimation of regression models. The purpose of this study is to examine different kinds of permutation tests applied to linear models, focusing on the specific test statistic on which each is based. In particular, we focus on the permutation test of the independent variables proposed by Oja, and on other methods for nonparametric inference in a regression model. Moreover, we present recent advances in this context and attempt to compare them.
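The idea of a permutation reference distribution described above can be illustrated with the simplest scheme for a single-predictor regression: shuffle the response, recompute the slope, and count how often the shuffled slope is at least as extreme as the observed one. This is a generic sketch and is not the test proposed by Oja or any specific method from the paper. The helper names `slope` and `permutation_pvalue`, the simulated data, and the choice of 999 permutations are assumptions.

```python
import numpy as np

def slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Two-sided permutation p-value for H0: slope = 0.

    Permuting y breaks any association with x, so the permuted slopes
    form the reference distribution under the null hypothesis.
    """
    rng = np.random.default_rng(seed)
    observed = abs(slope(x, y))
    exceed = sum(
        abs(slope(x, rng.permutation(y))) >= observed for _ in range(n_perm)
    )
    # Add one to numerator and denominator so the p-value is never zero.
    return (exceed + 1) / (n_perm + 1)

# Simulated data (an assumption): a clear linear trend plus noise.
rng = np.random.default_rng(1)
x = np.arange(30, dtype=float)
y = 2.0 * x + rng.normal(scale=5.0, size=30)
p = permutation_pvalue(x, y)
print("permutation p-value:", p)
```

No normality assumption enters the calculation; the only requirement is that the responses are exchangeable under the null hypothesis.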
Funding: Supported by the National Natural Science Foundation of China (11901236); the Scientific Research Fund of Hunan Provincial Science and Technology Department (2019JJ50479); the Scientific Research Fund of Hunan Provincial Education Department (18B322); the Winning Bid Project of Hunan Province for the 4th National Economic Census ([2020]1); the Young Core Teacher Foundation of Hunan Province ([2020]43); and the Fundamental Research Fund of Xiangxi Autonomous Prefecture (2018SF5026).
Abstract: Cost-effective sampling design is a major concern in some experiments, especially when measuring the characteristic of interest is costly, painful or time-consuming. Ranked set sampling (RSS) was first proposed by McIntyre [1952. A method for unbiased selective sampling, using ranked sets. Australian Journal of Agricultural Research 3, 385-390] as an effective way to estimate the pasture mean. In the current paper, a modification of ranked set sampling called moving extremes ranked set sampling (MERSS) is considered for the best linear unbiased estimators (BLUEs) of the simple linear regression model. The BLUEs for this model under MERSS are derived. For normal data, the BLUEs under MERSS are shown to be markedly more efficient than the BLUEs under simple random sampling.
Abstract: Piecewise linear regression models are very flexible models for data. When a piecewise linear regression model is fitted to data, its parameters are generally unknown. This paper studies the problem of parameter estimation for piecewise linear regression models. The parameters are estimated by a Bayesian method, but the Bayes estimator cannot be found analytically. To overcome this problem, the reversible jump MCMC (Markov Chain Monte Carlo) algorithm is proposed. The reversible jump MCMC algorithm generates a Markov chain whose limiting distribution is the posterior distribution of the parameters of the piecewise linear regression model. The resulting Markov chain is used to calculate the Bayes estimator for the parameters of the piecewise linear regression model.
Funding: Thank you for your valuable comments and suggestions. This research was supported by the Yunnan Applied Basic Research Project (No. 2017FD150) and the Chuxiong Normal University General Research Project (No. XJYB2001).
Abstract: This paper selects seven indicators of financial revenue and housing sales price in China over the most recent 19 years, and uses SPSS and Excel to carry out descriptive statistics, independent-sample t-tests, correlation analysis and regression analysis to study the correlation between financial revenue and housing sales price in China. It establishes the relationship between financial revenue and housing sales price: when the average selling price of commercial housing increases by one unit, fiscal revenue increases by 27.855 points.
Funding: Financial support from the Brazilian institution Conselho Nacional de Desenvolvimento Cientifico e Tecnologico (CNPq).
Abstract: In this paper, we study some robustness aspects of linear regression models in the presence of outliers or discordant observations, using stable distributions for the response in place of the usual normality assumption. It is well known that, in general, there is no closed form for the probability density function of stable distributions. However, under a Bayesian approach, the use of a latent or auxiliary random variable simplifies the derivation of posterior distributions related to stable distributions. To show the usefulness of the computational aspects, the methodology is applied to two examples: one is a standard linear regression model with one explanatory variable, and the other is a simulated data set assuming a 2^3 factorial experiment. Posterior summaries of interest are obtained using MCMC (Markov Chain Monte Carlo) methods and the OpenBugs software.
Funding: Supported by the National Social Science Foundation of China (Grant No. 22BTJ059).
Abstract: In this paper, we consider the partial linear regression model $y_i = x_i\beta^* + g(t_i) + \varepsilon_i$, $i = 1, 2, \ldots, n$, where $(x_i, t_i)$ are known fixed design points, $g(\cdot)$ is an unknown function, $\beta^*$ is an unknown parameter to be estimated, and the random errors $\varepsilon_i$ are $(\alpha, \beta)$-mixing random variables. The $p$-th ($p > 1$) mean consistency, strong consistency and complete consistency of the least squares estimators of $\beta^*$ and $g(\cdot)$ are investigated under some mild conditions. In addition, a numerical simulation is carried out to study the finite sample performance of the theoretical results. Finally, a real data analysis is provided to further verify the effect of the model.
Funding: This work was supported by the 2021 Project of the "14th Five-Year Plan" of Shaanxi Education Science, "Research on the Application of Educational Data Mining in Applied Undergraduate Teaching: Taking the Course of 'Computer Application Technology' as an Example" (SGH21Y0403); the Teaching Reform and Research Projects for Practical Teaching in 2022, "Research on Practical Teaching of Applied Undergraduate Projects Based on 'Combination of Courses and Certificates': Taking Computer Application Technology Courses as an Example" (SJJG02012); and the 11th batch of Teaching Reform Research Project of Xi'an Jiaotong University City College, "Project-Driven Cultivation and Research on Information Literacy of Applied Undergraduate Students in the Information Times: Taking Computer Application Technology Course Teaching as an Example" (111001).
Abstract: Social networks are the mainstream medium of current information dissemination, and it is particularly important to accurately predict how information propagates through them. In this paper, we introduce a social network propagation model integrating multiple linear regression and an infectious disease model. First, we propose features that affect social network communication from three dimensions. Then, we predict node influence via multiple linear regression. Lastly, we use the node influence as the state transition of the infectious disease model to predict the trend of information dissemination in social networks. Experimental results on a real social network dataset show that the model's predictions are consistent with the actual information dissemination trends.
Abstract: In oil and gas exploration, elucidating the complex interdependencies among geological variables is paramount. Our study applies sophisticated regression analysis methods, aiming not just at predicting geophysical logging curve values but also at mitigating the hydrocarbon depletion observed in geochemical logging. Through a rigorous assessment, we explore the efficacy of eight regression models, divided into linear and nonlinear groups, to accommodate the multifaceted nature of geological datasets. Our linear model suite encompasses the Standard Equation, Ridge Regression, Least Absolute Shrinkage and Selection Operator, and Elastic Net, each presenting distinct advantages. The Standard Equation serves as a foundational benchmark, whereas Ridge Regression implements penalty terms to counteract overfitting, thus bolstering model robustness in the presence of multicollinearity. The Least Absolute Shrinkage and Selection Operator performs variable selection to streamline models, enhancing their interpretability, while Elastic Net combines the merits of Ridge Regression and the Least Absolute Shrinkage and Selection Operator, offering a balance between model complexity and comprehensibility. On the nonlinear front, Gradient Descent, Kernel Ridge Regression, Support Vector Regression, and Piecewise Function-Fitting methods introduce innovative approaches. Gradient Descent assures computational efficiency in optimizing solutions, Kernel Ridge Regression leverages the kernel trick to navigate nonlinear patterns, and Support Vector Regression is proficient in forecasting extremities, which is pivotal for exploration risk assessment. The Piecewise Function-Fitting approach, tailored for geological data, facilitates adaptable modeling of variable interrelations, accommodating abrupt shifts in data trends. Our analysis identifies Ridge Regression, particularly when augmented by Piecewise Function-Fitting, as superior in recouping hydrocarbon losses, underscoring its utility in refining resource quantification. Meanwhile, Kernel Ridge Regression emerges as a noteworthy strategy for improving porosity-logging curve prediction for well A, evidencing its aptness for intricate geological structures. This research attests to the scientific advantages and broad relevance of these regression techniques over conventional methods while opening new horizons for their deployment in the oil and gas sector. The insights garnered from these advanced modeling strategies are set to transform geological and engineering practices in hydrocarbon prediction, evaluation, and recovery.
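As a minimal illustration of the penalty term mentioned in the abstract above, the following sketch fits a one-predictor ridge regression on centered data, where the slope is sum(xy) / (sum(x^2) + lam). The function name `ridge_slope`, the penalty value, and the toy data are assumptions for illustration only and are not taken from the study.

```python
def ridge_slope(x, y, lam):
    """Slope of a one-predictor ridge fit on mean-centered data.

    lam = 0 gives the ordinary least squares slope; lam > 0 shrinks it toward 0.
    """
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    xc = [v - mx for v in x]          # centered predictor
    yc = [v - my for v in y]          # centered response
    sxy = sum(a * b for a, b in zip(xc, yc))
    sxx = sum(a * a for a in xc)
    return sxy / (sxx + lam)


# Hypothetical toy data lying exactly on y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

ols = ridge_slope(x, y, 0.0)      # no penalty
shrunk = ridge_slope(x, y, 10.0)  # penalty equal to sum of squared centered x
print(ols, shrunk)
```

With the penalty set equal to the centered sum of squares, the slope is halved, which shows how the penalty trades bias for stability.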
Funding: Provided by the Korean Ministry of Environment and the Eco Star Project.
Abstract: Rainfall is an important factor in estimating the event mean concentration (EMC), which is used to quantify the washed-off pollutant concentrations from non-point sources (NPSs). Pollutant loads can also be calculated using rainfall, catchment area, and runoff coefficient. In this study, runoff quantity and quality data gathered from 28 months of monitoring at road and parking lot sites in Korea were evaluated using multiple linear regression (MLR) to develop equations for estimating pollutant loads and EMCs as a function of rainfall variables. The results revealed that total event rainfall and average rainfall intensity are possible predictors of pollutant loads. Overall, the models reflect the high uncertainties of NPSs; EMCs and loads may be estimated more accurately by water quality sampling, or long-term monitoring may be needed to gather more data for the development of estimation models.
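For readers unfamiliar with the EMC, a minimal sketch of its usual definition follows: total pollutant mass divided by total runoff volume, i.e. a flow-weighted mean concentration over the event. The function name `event_mean_concentration`, the equal sampling intervals, and the sample values are assumptions for illustration and are not data from the study.

```python
def event_mean_concentration(concentrations, flows):
    """Flow-weighted mean concentration for one storm event.

    Assumes samples are taken at equal time intervals, so the interval length
    cancels out of mass / volume.
    """
    mass = sum(c * q for c, q in zip(concentrations, flows))  # proportional to pollutant mass
    volume = sum(flows)                                       # proportional to runoff volume
    return mass / volume


# Hypothetical samples: concentration in mg/L, flow in L/s.
conc = [10.0, 20.0, 30.0]
flow = [1.0, 2.0, 1.0]
print(event_mean_concentration(conc, flow))
```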
Abstract: Consider a partially linear regression model with an unknown vector parameter, an unknown function g(·), and unknown heteroscedastic error variances. Chen and You [23] proposed a semiparametric generalized least squares estimator (SGLSE) for the vector parameter, which takes the heteroscedasticity into account to increase efficiency. For inference based on this SGLSE, it is necessary to construct a consistent estimator of its asymptotic covariance matrix. However, when within-group correlation exists, the traditional delta method and the delete-1 jackknife fail to provide such a consistent estimator. In this paper, a delete-group jackknife method based on deleting grouped partial residuals is examined. It is shown that the delete-group jackknife method does provide a consistent estimator of the asymptotic covariance matrix in the presence of within-group correlations. This result is an extension of that in [21].
Abstract: We consider semiparametric partially linear regression models with mean function X^Tβ + g(z), where X and z are functional data. New estimators of β and g(z) are presented, and some asymptotic results are given. The strong convergence rates of the proposed estimators are obtained. In our estimation, the number of observations for each subject is completely flexible. A simulation study is conducted to investigate the finite-sample performance of the proposed estimators.
Abstract: Mathematical modeling of economic indices is a challenging topic in crop production systems. The present study aimed to model the economic indices of mechanized and semi-mechanized rainfed wheat production systems using various multiple linear regression models. The study area was Behshahr County, located in the east of Mazandaran Province, Northern Iran. The statistical population included all wheat producers in Behshahr County in the 2016/17 crop year. Five input variables were human labor, machinery, diesel fuel, chemical (chemical fertilizers and chemical pesticides) costs, and the income was considered to be the output. The results showed that the cost of wheat production in the semi-mechanized system was higher than that of the mechanized system. In both systems, the highest cost was related to agricultural machinery input. Moreover, seed cost was lower in the mechanized system than in the semi-mechanized system. The net return indicator was 993.68 $ ha^-1 and 626.71 $ ha^-1 for the mechanized and semi-mechanized systems, respectively. The average benefit-to-cost ratio was 3.46 and 2.40 for the mechanized and semi-mechanized systems, respectively, demonstrating the greater profitability of the mechanized system. The evaluation of five types of regression models (Cobb-Douglas, linear, 2FI, quadratic, and pure-quadratic) for the mechanized and semi-mechanized production systems indicated that the R^2 value of the developed Cobb-Douglas model was higher than that of the quadratic model, while the RMSE and MAPE of the quadratic model were smaller than those of the Cobb-Douglas model. Therefore, the best model for investigating the relationship between input costs and the income of wheat production in both mechanized and semi-mechanized systems was the quadratic model.
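The model comparison in the abstract above relies on RMSE and MAPE. A minimal sketch of their standard definitions is given below; the function names and the income figures are hypothetical and are not taken from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Hypothetical observed and modeled incomes.
observed = [100.0, 200.0, 400.0]
modeled = [110.0, 190.0, 400.0]
print(round(rmse(observed, modeled), 3), round(mape(observed, modeled), 3))
```

Lower values of both metrics indicate a closer fit, which is the criterion the abstract uses to prefer the quadratic model.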
Abstract: In this paper, we consider the empirical Bayes (EB) estimation problem for an estimable function of the regression coefficient in a multiple linear regression model Y = Xβ + e, where e, given β, has a multivariate standard normal distribution. We obtain the EB estimators by using kernel estimation of the multivariate density function and its first-order partial derivatives. It is shown that the convergence rates of the EB estimators are under the condition where an integer k > 1 . is an arbitrary small number and m is the dimension of the vector Y.
Funding: Supported by the Korea Ministry of Environment, as "The Eco-innovation Project" (No. 413111-003).
Abstract: Stormwater runoff has been identified as a source of pollution for the environment, especially for receiving waters. In order to quantify and manage the impacts of stormwater runoff on the environment, predictive models and mathematical models have been developed. Predictive tools such as regression models have been widely used to predict stormwater discharge characteristics. Storm event characteristics, such as antecedent dry days (ADD), have been related to response variables, such as pollutant loads and concentrations. However, whether ADD should be considered an important variable in predicting stormwater discharge characteristics remains controversial among studies. In this study, we examined the accuracy of general linear regression models in predicting discharge characteristics of roadway runoff. A total of 17 storm events were monitored in two highway segments located in Gwangju, Korea. Data from the monitoring were used to calibrate the United States Environmental Protection Agency's Storm Water Management Model (SWMM). The calibrated SWMM was run for 55 storm events, and the results for total suspended solid (TSS) discharge loads and event mean concentrations (EMC) were extracted. From these data, linear regression models were developed. The R^2 and p-values of the regression on ADD for both TSS loads and EMCs were investigated. Results showed that pollutant loads were better predicted than pollutant EMCs in the multiple regression models. Regression may not capture the true effect of site-specific characteristics, due to uncertainty in the data.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60972150, 10926197) and the Science and Technology Innovation Foundation of Northwestern Polytechnical University (Grant No. 2007KJ01033).
Abstract: The paper investigates variance change in sequential observations from a linear regression model. The procedure is based on a detection function constructed from the CUSUM of squared residuals and a boundary function designed so that the test has a small probability of false alarm and asymptotic power one. Simulation results show that the monitoring procedure performs well when the variance change occurs shortly after monitoring begins. The method remains feasible when the regression coefficients change, or when both the variance and the regression coefficients change.
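The general idea of monitoring a cumulative sum of squared residuals can be sketched as follows. This is only an illustration under simplifying assumptions: the constant `threshold` stands in for the paper's boundary function, which the abstract does not specify, and the function name `first_alarm`, the reference variance, and the residual sequences are all hypothetical.

```python
def first_alarm(residuals, sigma2_hat, threshold):
    """Return the 1-based time at which |sum(e_t^2 - sigma2_hat)| first exceeds
    a constant threshold, or None if no alarm is raised.

    sigma2_hat is a variance estimate from a historical (pre-monitoring) sample.
    The constant threshold is an assumption, not the paper's boundary function.
    """
    cusum = 0.0
    for t, e in enumerate(residuals, start=1):
        cusum += e * e - sigma2_hat   # centered squared residual
        if abs(cusum) > threshold:
            return t
    return None


# Hypothetical residual sequences with reference variance 1.0.
stable = [1.0, -1.0, 1.0, -1.0]
shifted = [1.0, -1.0, 3.0, -3.0, 3.0]   # variance increases from the third observation

print(first_alarm(stable, 1.0, 10.0), first_alarm(shifted, 1.0, 10.0))
```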
Funding: Supported in part by the National Natural Science Foundation of China (11171112, 11201190), the Doctoral Fund of Ministry of Education of China (20130076110004), the Program of Shanghai Subject Chief Scientist (14XD1401600), and the 111 Project of China (B14019).
Abstract: The varying-coefficient partially linear regression model is obtained by combining nonparametric and varying-coefficient regression procedures. Wong et al. (2008) proposed the model and estimated it by the local linear method. In this paper, its inference is addressed. Based on these estimates, the generalized likelihood ratio test is established. Under the null hypothesis, the normalized test statistic asymptotically follows a χ²-distribution, with the scale constant and the degrees of freedom being independent of the nuisance parameters. This is the Wilks phenomenon. Furthermore, the asymptotic power of the test is derived, and it achieves the optimal rate of convergence for nonparametric hypothesis testing. A simulation and a real example are used to evaluate the performance of the testing procedures empirically.