Funding: Supported by the National Natural Science Foundation of China (71071077), the Ministry of Education Key Project of National Educational Science Planning (DFA090215), the China Postdoctoral Science Foundation (20100481137), and the Funding of Jiangsu Innovation Program for Graduate Education (CXZZ11-0226).
Abstract: The construction of the background value in the original multi-variable grey model (MGM(1,m)) is improved by addressing the source of its construction errors. The MGM(1,m) with optimized background value is used to eliminate the random fluctuations or errors in the observational data of all variables, and a combined prediction model is then established together with multiple linear regression in order to improve simulation and prediction accuracy. Finally, a combined model of the MGM(1,2) with optimized background value and binary linear regression is constructed for an example. The results show that the model simulates and predicts well.
Abstract: The development of many estimators of the parameters of the linear regression model is traceable to the non-validity of the assumptions under which the model is formulated, especially when it is applied to real-life situations. This notwithstanding, regression analysis may aim at prediction. Consequently, this paper examines the performance of the Ordinary Least Squares (OLS) estimator, the Cochrane-Orcutt (COR) estimator, the Maximum Likelihood (ML) estimator and estimators based on Principal Component (PC) analysis in prediction with the linear regression model under joint violation of the assumptions of non-stochastic regressors, independent regressors and independent error terms. With correlated stochastic normal variables as regressors and autocorrelated error terms, Monte Carlo experiments were conducted, and the study identifies the best estimator for prediction by adopting the goodness-of-fit statistics of the estimators. The results show that the performance of COR at each level of correlation (multicollinearity), and that of ML, especially when the sample size is large, has a convex-like pattern over the levels of autocorrelation, while that of OLS and PC is concave-like. Also, as the level of multicollinearity increases, the estimators, except the PC estimators when multicollinearity is negative, rapidly perform better over the levels of autocorrelation. The COR and ML estimators are generally best for prediction in the presence of multicollinearity and autocorrelated error terms. However, at low levels of autocorrelation, the OLS estimator is either best or competes consistently with the best estimator, while the PC estimator is either best or competes with the best when the multicollinearity level is high (λ>0.8 or λ-0.49).
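To make the COR estimator named above concrete, here is a minimal sketch of an iterated Cochrane-Orcutt fit for a single regressor with AR(1) errors. It is an illustration under simplifying assumptions, not the paper's Monte Carlo design: the study uses several correlated regressors, whereas the helper names (`ols`, `cochrane_orcutt`, `simulate`) and the simulated data below are hypothetical.

```python
import random

def ols(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def cochrane_orcutt(x, y, iterations=20):
    """Iterated Cochrane-Orcutt for y = a + b*x + e, e_t = rho*e_{t-1} + u_t."""
    a, b = ols(x, y)
    rho = 0.0
    for _ in range(iterations):
        resid = [yi - a - b * xi for xi, yi in zip(x, y)]
        # Estimate rho by regressing e_t on e_{t-1} without an intercept.
        rho = (sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
               / sum(r * r for r in resid[:-1]))
        # Quasi-difference the data, then re-fit by OLS.
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        a_star, b = ols(xs, ys)
        a = a_star / (1.0 - rho)  # recover the original intercept
    return a, b, rho

def simulate(n=500, rho=0.7, seed=0):
    """Hypothetical data: y = 2 + 3x + AR(1) noise."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    e, y = 0.0, []
    for xi in x:
        e = rho * e + rng.gauss(0, 0.5)
        y.append(2.0 + 3.0 * xi + e)
    return x, y

a_hat, b_hat, rho_hat = cochrane_orcutt(*simulate())
```

The quasi-differencing step is what distinguishes this estimator from plain OLS: once the serial correlation is removed, the transformed errors are approximately uncorrelated.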
Abstract: In this article, a procedure is defined for estimating the coefficient functions in functional-coefficient regression models in which different coefficient functions have different smoothing variables. In the first step, initial estimates of the coefficient functions are obtained by the local linear technique and the averaged method. In the second step, efficient estimates of the coefficient functions are proposed, based on the initial estimates, by a one-step back-fitting procedure. The efficient estimators share the same asymptotic normality as the local linear estimators for functional-coefficient models with a single smoothing variable in the different functions. Two simulated examples show that the procedure is effective.
Abstract: In regression, although both aim at estimating the Mean Squared Prediction Error (MSPE), Akaike's Final Prediction Error (FPE) and the Generalized Cross Validation (GCV) selection criteria are usually derived from two quite different perspectives. Here, settling on the most commonly accepted definition of the MSPE as the expectation of the squared prediction error loss, we provide theoretical expressions for it that are valid for any linear model (LM) fitter, under either random or non-random designs. Specializing these MSPE expressions, we derive closed formulas of the MSPE for some of the most popular LM fitters: Ordinary Least Squares (OLS), with or without a full column rank design matrix; and Ordinary and Generalized Ridge regression, the latter embedding smoothing spline fitting. For each of these LM fitters, we then deduce a computable estimate of the MSPE which turns out to coincide with Akaike's FPE. Using a slight variation, we similarly obtain a class of MSPE estimates coinciding with the classical GCV formula for those same LM fitters.
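For orientation, the two criteria can be written in their classical textbook forms for a linear fitter with residual sum of squares RSS on n observations. The sketch below assumes those standard forms (with p fitted parameters for FPE and the trace of the hat matrix for GCV); it is not taken from the paper's derivation, and the function names are hypothetical.

```python
def fpe(rss, n, p):
    """Akaike's Final Prediction Error: (RSS/n) * (n + p) / (n - p)."""
    return (rss / n) * (n + p) / (n - p)

def gcv(rss, n, trace_hat):
    """Generalized Cross Validation: (RSS/n) / (1 - tr(H)/n)^2.

    For OLS with a full column rank design, tr(H) equals the number
    of fitted parameters p.
    """
    return (rss / n) / (1.0 - trace_hat / n) ** 2

# Example with RSS = 10 on n = 20 observations and p = 2 parameters.
fpe_value = fpe(10.0, 20, 2)
gcv_value = gcv(10.0, 20, 2)
```

In the OLS case the ratio GCV/FPE equals 1 / (1 - p²/n²), so the two criteria agree closely whenever p is small relative to n.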
Funding: Supported by a project of the National Natural Science Foundation of China (No. 41272360).
Abstract: Data on coal production and consumption in Jilin Province for the last ten years were collected, and the Grey System GM(1,1) model and a unary linear regression model were applied to predict the coal consumption of Jilin Province in 2014 and 2015. The predicted coal consumption of Jilin Province is 114.84 × 10^6 t for 2014 and 117.98 × 10^6 t for 2015. Analysis of the error data indicated that the prediction accuracy of the Grey System GM(1,1) model for coal consumption in Jilin Province was 0.21% higher than that of the unary linear regression model.
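As a rough illustration of how a GM(1,1) forecast is produced, the sketch below follows the standard textbook construction: accumulate the series, form background values as means of adjacent accumulated terms, estimate the development coefficient a and grey input b by least squares, and difference the time-response function. The function names and the input series are hypothetical; the series is not the Jilin coal data, which the abstract does not list.

```python
import math

def fit_gm11(x0):
    """Estimate (a, b) in the grey equation x0[k] + a*z1[k] = b."""
    x1, total = [], 0.0
    for v in x0:                      # accumulated generating series
        total += v
        x1.append(total)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]  # background values
    y = x0[1:]
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)  # least-squares line y = slope*z + b
    b = (sy - slope * sz) / m
    return -slope, b

def predict_gm11(x0, a, b, steps):
    """Fitted values for the sample plus `steps` forecasts (assumes a != 0)."""
    c = x0[0] - b / a
    def x1_hat(k):
        return c * math.exp(-a * k) + b / a
    out = [x0[0]]
    for k in range(1, len(x0) + steps):
        out.append(x1_hat(k) - x1_hat(k - 1))  # restore by differencing
    return out

# Hypothetical series growing 5% per period.
series = [100 * 1.05 ** k for k in range(8)]
a, b = fit_gm11(series)
forecast = predict_gm11(series, a, b, steps=2)
```

For a growing series the estimated a is negative, and the last two entries of `forecast` are the out-of-sample predictions.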
Funding: This research was funded by the National Natural Science Foundation of China (grant no. 32271881).
Abstract: Forest fires are natural disasters that can occur suddenly and can be very damaging, burning thousands of square kilometers. Prevention is better than suppression, and prediction models of forest fire occurrence were developed using the logistic regression model, the geographically weighted logistic regression model, the Lasso regression model, the random forest model, and the support vector machine model, based on historical forest fire data from 2000 to 2019 in Jilin Province. The models, along with a distribution map, are presented in this paper to provide a theoretical basis for forest fire management in this area. Existing studies show that the prediction accuracies of the two machine learning models are higher than those of the three generalized linear regression models. The accuracies of the random forest model, the support vector machine model, the geographically weighted logistic regression model, the Lasso regression model, and the logistic model were 88.7%, 87.7%, 86.0%, 85.0% and 84.6%, respectively. Weather is the main factor affecting forest fires, while the impacts of topographic factors and of human and socio-economic factors on fire occurrence were similar.
Funding: National Natural Science Foundation of China (No. 51175480).
Abstract: Based on the modeling principles of the GM(1,1) model and the linear regression model, a combined prediction model is established to predict equipment faults by fitting the two models. The new prediction model takes full advantage of the prediction information provided by the two models and improves prediction precision. Finally, the model is used to predict the system fault time according to the output voltages of a certain type of radar transmitter.
Funding: Supported by the National Social Science Foundation of China (Grant No. 22BTJ059).
Abstract: In this paper, we consider the partial linear regression model $y_i = x_i\beta^* + g(t_i) + \varepsilon_i$, $i = 1, 2, \ldots, n$, where $(x_i, t_i)$ are known fixed design points, $g(\cdot)$ is an unknown function, $\beta^*$ is an unknown parameter to be estimated, and the random errors $\varepsilon_i$ are $(\alpha, \beta)$-mixing random variables. The $p$-th ($p > 1$) mean consistency, strong consistency and complete consistency of the least squares estimators of $\beta^*$ and $g(\cdot)$ are investigated under some mild conditions. In addition, a numerical simulation is carried out to study the finite sample performance of the theoretical results. Finally, a real data analysis is provided to further verify the effect of the model.
Funding: National Natural Science Foundation of China (Grant No. 52178393), the Science and Technology Innovation Team of Shaanxi Innovation Capability Support Plan (Grant No. 2020TD005), and the Science and Technology Innovation Project of China Railway Construction Bridge Engineering Bureau Group Co., Ltd. (Grant No. DQJ-2020-B07).
Abstract: Evaluating the adaptability of the cantilever boring machine (CBM) through in-depth mining and analysis of tunnel excavation data and rock mass parameters is the premise of mechanical design and efficient excavation in underground space engineering. This paper presents a case study of a tunnelling performance prediction method for the CBM in a sedimentary hard-rock tunnel in karst terrain, using tunnelling data and surrounding rock parameters. The uniaxial compressive strength (UCS), rock integrity factor (Kv), basic quality index ([BQ]), rock quality designation (RQD), Brazilian tensile strength (BTS) and brittleness index (BI) were used to construct a performance prediction database based on the hard-rock tunnels of Guiyang Metro Line 1 and Line 3, from which a performance prediction model of the cantilever boring machine was established. A deep belief network (DBN) was then introduced into the performance prediction model, and the reliability of the model was verified against engineering data. The study showed that the influence of the surrounding rock parameters on the tunnelling performance of the cantilever boring machine ranks as UCS > [BQ] > BTS > RQD > Kv > BI. The performance prediction model shows that the instantaneous cutting rate (ICR) correlates well with the surrounding rock parameters, and that the accuracy of the prediction model depends on the reliability of the construction data. Prediction for the limestone and dolomite sections of Line 3 with the DBN performance prediction model shows that the measured and predicted ICR are consistent and that the model is reliable. The research results provide a theoretical reference for the applicability analysis and machine selection of cantilever boring machines for hard-rock tunnels.
Abstract: Low visibility conditions hinder both air traffic and road traffic operations. Accurate forecasting of visibility helps aircraft operators and travelers make better decisions and improve their safety. It is, therefore, essential to investigate and identify the predictor variables that could influence and help predict visibility. The objective of this study is to identify the predictor variables that influence visibility. Four years of surface weather observations, from January 2011 to December 2014, were collected from weather stations located in and around the state of North Carolina, USA, for model development. Ordinary least squares (OLS) and weighted least squares (WLS) regression models were developed for different visibility and elevation ranges. The results indicate that elevation, cloud cover, and precipitation are negatively associated with visibility in the model for visibility less than 15,000 m. Elevation, cloud cover and the presence of water bodies in the vicinity play an important role in the model for visibility less than 2000 m. The chances of low visibility are higher between six and twelve hours after rainfall than in the first six hours after rainfall. The results of this study help to understand the influence of the predictor variables that should be addressed to improve traffic operations and safety with respect to visibility near airports and the road transportation network.
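The difference between the OLS and WLS fits mentioned above can be sketched for a single predictor: WLS minimizes the weighted sum of squared residuals, and OLS is the special case of equal weights. The function name, the weights and the toy points below are hypothetical and are unrelated to the North Carolina weather observations.

```python
def weighted_least_squares(x, y, w=None):
    """Return (intercept, slope) minimizing sum(w_i * (y_i - a - b*x_i)^2).

    With w=None all weights are 1, which reduces to ordinary least squares.
    """
    if w is None:
        w = [1.0] * len(x)
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return intercept, slope

x = [0.0, 1.0, 2.0]
y = [0.0, 1.0, 0.0]
ols_fit = weighted_least_squares(x, y)                   # equal weights
wls_fit = weighted_least_squares(x, y, [1.0, 1.0, 0.0])  # third point ignored
```

Down-weighting an observation pulls the fitted line toward the remaining points, which is why WLS is commonly used when some observations are noisier than others.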
Abstract: Traditional methods for water table prediction have defects such as extensive calculation and reliance on the presupposition of a homogeneous and regular aquifer. Based on the fundamentals of the general regression neural network (GRNN), this article sets up a GRNN model for water level prediction. A case study indicates that this model, even with limited information, has satisfactory prediction accuracy, which, coupled with a simple model structure and relatively high calculation efficiency, gives the model broad application prospects.
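The core of a GRNN is a Gaussian-kernel weighted average of the training targets, with a single smoothing parameter. The sketch below shows that prediction rule only; the function name, the smoothing value `sigma` and the sample values are hypothetical and are not the article's water level data.

```python
import math

def grnn_predict(train_x, train_y, query, sigma):
    """Predict at `query` as a kernel-weighted mean of the training targets."""
    num = den = 0.0
    for xi, yi in zip(train_x, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(xi, query))  # squared distance
        w = math.exp(-d2 / (2.0 * sigma ** 2))             # pattern-layer weight
        num += w * yi
        den += w
    return num / den

# Hypothetical one-feature training set.
train_x = [[0.0], [1.0], [2.0]]
train_y = [1.0, 3.0, 5.0]
sharp = grnn_predict(train_x, train_y, [0.0], sigma=0.1)      # close to the nearest target
smooth = grnn_predict(train_x, train_y, [0.0], sigma=1000.0)  # close to the overall mean
```

A small `sigma` reproduces the nearest training target, while a large `sigma` approaches the mean of all targets; in practice the value is tuned on held-out data.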
Funding: Natural Science Foundation of Jilin Provincial Science and Technology Department (20180101016JC) and the Science and Technology Development Plan of Jilin Province (20180101054JC).
Abstract: With the rapid development of the movie industry, it is vital to evaluate and predict a movie's quality. In this paper, a movie score prediction model is proposed based on movie plots. Movie data were processed with the word2vec method, and a linear regression model and the back propagation neural network algorithm were employed to establish the movie score prediction model. The plots of high-scoring classic movies, summarized from big data, provide a condensed synthesis of the films' most appealing content. Experimental results show that the model is effective for movie evaluation and prediction, and helpful in understanding people's preferences for movie plots.
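Abstract: Objective: To scientifically evaluate the nutritional quality of Furong plum fruit during ripening and to establish the relationship between the apparent characteristics of color values and nutritional quality. Methods: Taking Furong plum, a main cultivar of Fujian Province, as the research object, 13 quality indicators during ripening (fructose, glucose, sucrose, malic acid, quinic acid, succinic acid, citric acid, fumaric acid, cyanidin-3-rutinoside, cyanidin-3-glucoside, polyphenols, flavonoids and carotenoids) were analyzed and comprehensively evaluated. Results: During ripening, the contents of the quality indicators changed significantly (P<0.05). Correlation analysis, factor analysis and absolute principal component scores-multiple linear regression (APCS-MLR) analysis were combined to screen the main indicators reflecting the comprehensive quality of Furong plum. Factor analysis extracted 3 main factors, with contribution rates of 52.677%, 23.468% and 11.649%, and a cumulative contribution rate of 87.794%. According to APCS-MLR and related statistical analyses, main factor 1 contributed mainly to fructose, cyanidin-3-rutinoside and cyanidin-3-glucoside, with contribution rates of 53.00%, 73.85% and 55.54%, respectively; main factor 2 contributed mainly to sucrose, fumaric acid, fructose and citric acid, with contribution rates of 28.26%, 18.70%, 16.14% and 15.59%, respectively; main factor 3 contributed mainly to polyphenols (29.13%) and flavonoids (28.28%). Fructose, glucose, cyanidin-3-rutinoside and cyanidin-3-glucoside, for which the total contribution rate of the 3 main factors exceeded 60%, were selected as the main indicators for comprehensive quality evaluation. Stepwise multiple linear regression was performed between each of the 4 selected indicators and the color values to build apparent prediction models for the 4 indicators; all models fitted well, with small root mean square errors between predicted and measured values. Further validation showed that prediction of the 4 indicators from color values was reliable and accurate. Conclusion: The main indicators and prediction models screened in this study make it simpler and more convenient to evaluate the comprehensive quality of Furong plum fruit during ripening.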