Traffic flow information for lanes at intersections without detectors is difficult to obtain in most of the world's metropolises. Based on the relationships between lanes at signal-controlled intersections, cluster analysis and stepwise regression are integrated to predict the traffic volume of lanes at isolated signal-controlled intersections without detectors. First, cluster analysis groups the lanes of non-detector isolated signal-controlled intersections together with the lanes of all signal-controlled intersections equipped with detectors. Then, based on the clustering results, traffic volume samples are selected randomly and stepwise regression is used to predict the traffic volume of lanes at the non-detector isolated signal-controlled intersections. The method is tested on lane traffic volume data from the road network of Nanjing. The approach addresses the problem of predicting lane traffic volume at non-detector isolated signal-controlled intersections and can be widely used for urban traffic flow guidance and urban traffic control in cities where too few intersections are equipped with detectors.
[Objective] The research aimed to identify the factors that significantly influence population variations of the oriental fruit fly. [Method] Using stepwise regression analysis, the pattern of population variation of the oriental fruit fly in Jianshui County, Yunnan Province, and the meteorological factors associated with its occurrence were analyzed, and a regression model was built. The model was then tested on data from Jianshui County for 2004-2006. [Result] The main meteorological factors influencing the occurrence of the oriental fruit fly were relative humidity, the lowest monthly temperature, and rainfall. [Conclusion] This study provides a reference for research on predicting the timing, quantity, and occurrence peak of the oriental fruit fly.
Using stepwise multivariate linear regression (SMLR), the quantitative structure-activity relationships (QSAR) of two isomeric series of taxol and its derivatives were studied. The molar refractivity of the C3′ substituent of the C13 side chain was found to correlate significantly with activity. We deduce that structural changes in the C3′ substituents may be critical to the anticancer function. This finding should be useful for the design and synthesis of taxol-like compounds with improved activity.
Detecting plant health conditions plays a key role in farm pest management and crop protection. In this study, hyperspectral leaf reflectance of rice (Oryza sativa L.) was measured over the wavelength range from 350 to 2500 nm on groups of healthy leaves and leaves infected by the fungus Bipolaris oryzae (Helminthosporium oryzae Breda. de Hann). The percentage of leaf surface covered by lesions was estimated and defined as the disease severity. Multiple stepwise regression, principal component analysis, and partial least-squares regression were used to estimate the disease severity of rice brown spot at the leaf level. Multiple stepwise linear regression estimated disease severity efficiently with three wavebands in seven steps; the root mean square errors (RMSEs) for the training (n=210) and testing (n=53) datasets were 6.5% and 5.8%, respectively. Principal component analysis showed that the first principal component explained approximately 80% of the variance of the original hyperspectral reflectance. The regression model with the first two principal components predicted disease severity with RMSEs of 16.3% and 13.9% for the training and testing datasets, respectively. Partial least-squares regression with seven extracted factors predicted disease severity most effectively among the methods compared, with RMSEs of 4.1% and 2.0% for the training and testing datasets, respectively. The research demonstrates that it is feasible to estimate the disease severity of rice brown spot from hyperspectral reflectance data at the leaf level.
Aim: A new statistical method was applied to the data analysis of orthogonal experiments to optimize the preparation of liposomes. Method: Particle size, zeta potential, encapsulation efficiency, and physical stability of liposomes were selected as evaluation indicators in an orthogonal design. The optimized preparation conditions were obtained through three statistical methods (direct observation, variance analysis, and stepwise multiple regression) and validated by experiment. Results: The three analyses gave different results for all four indicators. The validation experiments indicated that the conditions optimized by stepwise multiple regression were better than those obtained by the traditional analyses. Conclusion: The experimental results suggest that multiple regression can avoid the weaknesses of direct observation and variance analysis, but more work is needed on preparing liposomes.
A statistical analysis was conducted on the feeding behavior of 106 York breeding pigs. Pearson correlation analysis, principal component correlation analysis, and multiple stepwise regression were applied to establish regression equations relating the pigs' total feed intake per time and average feed intake per time to corrected fat thickness, feed conversion rate, and corrected daily gain. The results showed that: (1) there were three peak feed intake periods, and the correlation coefficient between feed intake and corrected fat thickness within the 24 h period was positive in some periods and negative in others; that is, increasing the number of feeding times and the feed intake was not necessarily conducive to fat thickness accumulation, but the breeding goal for fat thickness could be achieved by controlling the feeding times and feed intake; (2) the average feed intake of pigs in the 60-90 kg body weight stage was 30%-50% higher than in the 30-60 kg stage, while the number of feeding times decreased, the peak feeding time was more concentrated, and the feeding duration per time was 3.0 min longer, indicating that feed intake increased significantly as body weight increased; and (3) the stepwise regression equations and the principal component equations showed that the feeding behavior of York pigs in the 30-90 kg growth stage was affected not only by the feeding time within 24 h but also by environmental factors such as temperature and humidity. The feeding behavior of York pigs is a complex process of interaction between environmental factors and animal factors.
This paper compares variable selection methods for multiple linear regression models whose full model contains both relevant and irrelevant variables and whose predictor variables are highly correlated (correlation of 0.999). Two objective functions are used in the Tabu Search: the mean square error (MSE) and the mean absolute error (MAE). The results of the Tabu Search are compared with those of the stepwise regression method using the hit percentage criterion. The simulations cover both cases, without and with multicollinearity. For each situation, 1,000 iterations are run at sample sizes n = 25 and n = 100, at the 0.05 level of significance. Without multicollinearity, the hit percentages of stepwise regression and of the Tabu Search with the MSE objective are almost the same, and slightly higher than that of the Tabu Search with the MAE objective. With multicollinearity, however, the hit percentages of the Tabu Search with either objective function are higher than the hit percentage of stepwise regression.
This paper gives an overview of an important statistical technique, stepwise multiple linear regression, and describes its link to earthquake location. Specifically, the aim of this research is to show how stepwise multiple linear regression contributes to solving the earthquake location problem, describing the conditions of its use in HYPO71PC, a software package devoted to computing the location of seismic sources. The aim is pursued through a concrete case: locating earthquakes occurring on Mount Vesuvius, Italy.
The total output value of mutton in Northwest China has accounted for more than 60% of the total output value of animal husbandry over the years. The mutton industry in Northwest China therefore plays a pivotal role in animal husbandry and an important role in Chinese agriculture. This study draws on cost accounting theory, income-related theories, and total factor productivity theory, on basic statistics and economics, and on existing research at home and abroad. Combining qualitative analysis with quantitative analysis by multiple stepwise regression in SAS, it investigates the trends in the costs and benefits of mutton sheep breeding in the agricultural and pastoral areas of Northwest China and the factors influencing production costs and production efficiency. The aim is to provide a reference for saving feed resources, reducing breeding costs, and improving the benefits of mutton sheep breeding.
With the expansion of gene expression profile databases, extracting genes while losing as little information as possible, or while retaining the most critical information, has become a main research direction. This paper first excludes 1561 irrelevant genes using a defined weighted distance, and then removes 252 redundant genes using Pearson's correlation coefficient. Finally, by comparing two methods, stepwise regression after clustering and stepwise regression alone, the best combination of 8 genes is obtained.
Suppression effects in multiple regression analysis may be more common in research than is currently recognized. We review several studies that treat the concept and types of suppressor variables. We also highlight systematic ways to identify suppression effects in multiple regression using statistics such as R2, sums of squares, and regression weights, and by comparing zero-order correlations with the Variance Inflation Factor (VIF). We further establish that the suppression effect is a function of multicollinearity; however, a suppressor variable should only be allowed in a regression analysis if its VIF is less than five (5).
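The VIF screening rule mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `variance_inflation_factors` and the synthetic data are assumptions made for illustration. The sketch uses the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors, and compares each value against the threshold of five.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (n_samples x n_features).

    VIF_j = 1 / (1 - R_j^2), where R_j^2 is the coefficient of determination
    from regressing column j on all other columns plus an intercept.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        ss_res = resid @ resid
        ss_tot = ((target - target.mean()) ** 2).sum()
        r_squared = 1.0 - ss_res / ss_tot
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs

# Synthetic data (an assumption for illustration): x2 is nearly collinear with x1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))

# Apply the "VIF less than five" rule stated in the abstract.
keep = [j for j, v in enumerate(vifs) if v < 5.0]
print(vifs, keep)
```

In this toy example the two nearly collinear columns exceed the threshold while the independent column stays close to 1, which is the pattern the rule is meant to flag.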
Various analytical, empirical, and numerical methods exist for calculating groundwater inflow into tunnels excavated in rocky media. Analytical methods have been widely applied to predict groundwater inflow to tunnels because of their simplicity and practical theoretical basis. Investigations show, however, that the real amount of water infiltrating jointed tunnels is much less than the amount calculated by analytical methods, and that the results depend strongly on the tunnel's geometry and environmental conditions. In this study, a new empirical model for estimating groundwater seepage into circular tunnels is introduced using multiple regression analysis. The data were acquired from field surveys and laboratory analysis of core samples. New regression variables were defined after examining single-variable and two-variable relationships between groundwater seepage and the other variables. Finally, an appropriate model for estimating leakage was obtained using the stepwise algorithm. Statistics such as R, R2, and R2e, together with the histogram of the model residuals, indicate that the model fits well for estimating groundwater seepage into tunnels. The new empirical model was applied to the test data with satisfactory results. Multiple regression analysis is therefore an effective and efficient way to estimate groundwater seepage into tunnels.
Understanding the spatial-temporal dynamics of crop nitrogen (N) use efficiency (NUE) and its relationship with explanatory environmental variables can support land-use management and policymaking. Nevertheless, the application of statistical models for evaluating the explanatory variables of space-time variation in crop NUE is still under-researched. In this study, stepwise multiple linear regression (SMLR) and Random Forest (RF) were used to evaluate the spatial and temporal variation of NUE indicators (i.e., partial factor productivity of N (PFPN) and partial nutrient balance of N (PNBN)) at county scale in Northeast China (Heilongjiang, Liaoning, and Jilin provinces) from 1990 to 2015. Explanatory variables included agricultural management practices, topography, climate, economy, soil, and crop types. Results revealed that PFPN was higher in the northern parts and lower in the center of Northeast China, and that PNBN increased from the southern to the northern parts during the 1990–2015 period. The NUE indicators decreased with time in most counties during the study period. The model efficiency coefficients of the SMLR and RF models were 0.44 and 0.84 for PFPN, and 0.67 and 0.89 for PNBN, respectively. The RF model assigned higher relative importance to soil and climatic covariates and lower relative importance to crop covariates than the SMLR model. The planting area index of vegetables and beans, soil clay content, saturated water content, enhanced vegetation index in November and December, soil bulk density, and annual minimum temperature were the main explanatory variables for both NUE indicators. This is the first study to show the quantitative relative importance of explanatory variables for NUE at county level in Northeast China using RF and SMLR. The study gives reference measurements for improving crop NUE, which is one of the most effective means of managing N for sustainable development, ensuring food security, alleviating environmental degradation, and increasing farmers' profitability.
Objectives: The objectives of this study are to use CART (classification and regression tree) and stepwise regression to 1) define the predictors of quality of life in ACS (acute coronary syndrome) patients, using demographics, ACS symptoms, and anxiety as independent variables; and 2) discuss and compare the results of these two statistical approaches. Background: In outcome studies of ACS, CART is a good alternative to linear regression; however, CART is rarely used. Methods: A descriptive survey design was used, with 100 participants recruited. Results and Conclusions: Anxiety is the most significant predictor of quality of life, and a stronger predictor than the symptoms of ACS. The anxiety level patients experienced at the time the heart attack occurred can be used to predict quality of life a month later. Furthermore, the majority of ACS patients experienced a moderate to high level of anxiety during a heart attack.
In this paper, downscaling models are developed using several linear regression approaches, namely direct, forward, backward, and stepwise regression, to downscale GCM output and predict mean monthly precipitation under IPCC SRES scenarios at watershed-basin scale in an arid region of India. The effectiveness of these regression approaches is evaluated by downscaling the predictand for the Pichola lake region in Rajasthan state, India, which is considered a climatically sensitive region. The predictor variables are extracted from (1) the National Centers for Environmental Prediction (NCEP) reanalysis dataset for the period 1948–2000, and (2) the simulations from the third-generation Canadian Coupled Global Climate Model (CGCM3) for emission scenarios A1B, A2, B1, and COMMIT for the period 2001–2100. Selecting the important predictor variables is a crucial issue in developing downscaling models, since reanalysis data are based on a wide range of meteorological measurements and observations. Direct regression was found to perform best among the regression techniques explored in the present study. The results of the downscaling models show that precipitation is likely to increase in the future for the A1B, A2, and B1 scenarios, whereas no trend is discerned for COMMIT.
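As a rough illustration of the forward variant named in this abstract, the sketch below adds one predictor at a time using a partial F statistic. It is an assumption-laden example, not the authors' procedure: the function names `residual_ss` and `forward_select`, the `f_enter` threshold, and the synthetic data are all hypothetical. Full stepwise regression would also re-test already selected predictors for removal, which this sketch omits.

```python
import numpy as np

def residual_ss(X, y, cols):
    """Residual sum of squares of an OLS fit (with intercept) on the given columns."""
    design = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

def forward_select(X, y, f_enter=4.0):
    """Greedy forward selection.

    At each step, add the candidate with the largest partial F statistic;
    stop when no candidate reaches f_enter (a hypothetical default threshold).
    """
    n, p = X.shape
    selected = []
    rss_current = residual_ss(X, y, selected)
    while len(selected) < p:
        best = None  # (F statistic, column index, RSS after adding the column)
        for c in range(p):
            if c in selected:
                continue
            rss_new = residual_ss(X, y, selected + [c])
            # Residual degrees of freedom: n minus predictors minus intercept.
            df_resid = n - (len(selected) + 1) - 1
            f_stat = (rss_current - rss_new) / (rss_new / df_resid)
            if best is None or f_stat > best[0]:
                best = (f_stat, c, rss_new)
        if best[0] < f_enter:
            break
        selected.append(best[1])
        rss_current = best[2]
    return selected

# Synthetic data (an assumption for illustration): only columns 0 and 2 drive y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.5 * rng.normal(size=200)
selected = forward_select(X, y, f_enter=20.0)
print(selected)
```

A backward variant would start from the full model and drop the predictor with the smallest partial F statistic; the stepwise variant alternates the two checks.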
Objective: This study was undertaken to investigate the factors influencing serum ALT level and hepatitis C virus (HCV) RNA titer in chronic hepatitis C (CHC) patients. Methods: All patients enrolled in this study were anti-HCV positive. A retrospective tracing method was applied to detect serum ALT level and HCV RNA titer and to collect general information on the patients, such as gender, age group, interferon medication history, infection pathway, height, and weight. Multi-factor analysis was then performed using a binomial logistic regression model. Results: The abnormal rate of ALT level was positively correlated with HCV RNA and gender and negatively correlated with interferon medication history and age group, with Wald values for the 4 factors of 39.604, 11.823, 18.991, and 7.389, respectively. The positive rate of HCV RNA was negatively correlated with interferon medication history and gender and positively correlated with ALT level, with corresponding Wald values for the 3 factors of 81.394, 7.618, and 27.562, respectively. Conclusions: The normal ALT level in HCV-infected patients was associated with viral load, age, gender, and interferon medication history, while the normal rate of HCV RNA titer was closely associated with gender, interferon medication history, and ALT level.
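Objective: To explore the value of machine learning models and the stepwise linear regression (SLR) model in predicting functional outcome after rehabilitation in patients with subacute stroke. Methods: A total of 1046 patients with subacute stroke admitted to the 945th Hospital of the Joint Logistics Support Force of the Chinese People's Liberation Army from January 2013 to December 2023 were selected as study subjects. General patient data and Functional Independence Measure (FIM) scores at admission were used to build SLR, regression tree (RT), ensemble learning (EL), artificial neural network (ANN), support vector regression (SVR), and Gaussian process regression (GPR) prediction models. With 10-fold cross-validation, the coefficient of determination (R^(2)) and root mean squared error (RMSE) between actual and predicted discharge FIM scores and FIM gain were compared across the models. Results: The machine learning models (R^(2): RT=0.75, EL=0.78, ANN=0.81, SVR=0.80, GPR=0.81) outperformed SLR (0.70) in predicting the FIM motor score. The machine learning models' prediction accuracy for total FIM gain (R^(2): RT=0.48, EL=0.51, ANN=0.50, SVR=0.51, GPR=0.54) was also better than that of SLR (0.22). Conclusion: Machine learning models outperform SLR in predicting FIM outcome. Machine learning models that include only general patient information and admission FIM scores achieve better prediction accuracy than previous studies, and GPR has the highest prediction accuracy for FIM outcome.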
Partial least squares (PLS) regression was applied to the Lunar Soil Characterization Consortium (LSCC) dataset for spectral estimation of TiO2. The LSCC dataset was split into a number of subsets, including the low-Ti, high-Ti, total mare, total highland, Apollo 16, and Apollo 14 soils, to investigate the effects of interfering minerals and nonlinearity on PLS performance. The PLS weight loading vectors were analyzed through stepwise multiple regression analysis (SMRA) to identify the mineral species driving and interfering with PLS performance. PLS performs well in estimating TiO2 for the LSCC low-Ti and high-Ti mare samples and for both groups analyzed together. The results suggest that while the dominant TiO2-bearing minerals are few, additional PLS factors are required to compensate for the effects on the important PLS factors of minerals that are not highly correlated with TiO2, to accommodate nonlinear relationships between reflectance and TiO2, and to correct inconsistent mineral-TiO2 correlations between the high-Ti and low-Ti mare samples. Analysis of the LSCC highland soil samples indicates that the Apollo 16 soils are responsible for the large errors in TiO2 estimates when the soils are modeled with other subgroups. For the LSCC Apollo 16 samples, the dominant spectral effects of plagioclase over other dark minerals are primarily responsible for the large errors in estimated TiO2. For the Apollo 14 soils, the more accurate estimation of TiO2 is attributed to the positive correlation between a major TiO2-bearing component and TiO2, which explains why the Apollo 14 soils follow the regression trend when analyzed with other soil groups.
The prediction accuracy of the traditional stepwise regression prediction equation (SRPE) is affected by multicollinearity among its predictors. This paper introduces condition number analysis into prediction modeling to minimize the multicollinearity in the SRPE. In condition number prediction modeling, the condition number is used to select, from the possible combinations of candidate predictors (variables), the combination with the lowest multicollinearity; the selected combination is then used to construct the condition number regression prediction equation (CNRPE). This novel prediction modeling is applied to typhoon track prediction, a difficult task among meteorological disaster predictions. Six pairs of typhoon track latitude/longitude SRPEs and CNRPEs for July, August, and September are built using the traditional and the novel modeling approaches, respectively, with a large number of identical modeling samples. The comparative analysis indicates that, with the same candidate predictors (variables) and predictands (dependent variables), the fitting accuracy of the novel models on the historical samples of South China Sea (SCS) typhoon tracks is slightly lower than that of the traditional models, but the prediction accuracy on independent samples is clearly improved: the averaged prediction error of the novel models for July, August, and September is 153.9 km, which is 75.3 km smaller than that of the traditional models (a reduction of 33%). This is because the novel prediction modeling effectively minimizes multicollinearity through computation and analysis of the condition number. It is further shown that when F=1.0, 2.0, and 3.0, the average prediction errors of the traditional SRPEs are clearly larger than those of the CNRPEs. Moreover, extremely large and unreasonable prediction errors occur at some individual points of the typhoon tracks predicted by the SRPEs, due to the multicollinearity in the combination of predictors.
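The selection idea described in this abstract, choosing the predictor combination with the lowest condition number, can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's procedure: the function names, the standardization step, the exhaustive search over fixed-size subsets, and the synthetic data are all hypothetical choices.

```python
from itertools import combinations

import numpy as np

def condition_number(X):
    """2-norm condition number of the column-standardized predictor matrix."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    return float(np.linalg.cond(Xs))

def lowest_condition_subset(X, k):
    """Return the k-column subset of X with the smallest condition number."""
    candidates = combinations(range(X.shape[1]), k)
    return min(candidates, key=lambda idx: condition_number(X[:, list(idx)]))

# Synthetic candidates (an assumption for illustration): columns 0 and 1 are nearly collinear.
rng = np.random.default_rng(0)
base = rng.normal(size=300)
X = np.column_stack([
    base,
    base + 0.05 * rng.normal(size=300),
    rng.normal(size=300),
    rng.normal(size=300),
])
best = lowest_condition_subset(X, k=2)
print(best, condition_number(X[:, list(best)]), condition_number(X[:, [0, 1]]))
```

The selected subset would then be passed to an ordinary least-squares fit. An exhaustive search grows combinatorially with the number of candidates, so it is only practical for small candidate pools.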
Funding (lane traffic volume prediction study): The National Natural Science Foundation of China (No. 50378016).
Funding (oriental fruit fly study): Supported by the National Key Technology R&D Program in the 11th Five-Year Plan of China (2006BAD10A14).
Funding (rice brown spot study): The Hi-Tech Research and Development Program (863) of China (No. 2006AA10Z203) and the National Science and Technology Task Force Project (No. 2006BAD10A01), China.
文摘Detecting plant health conditions plays a key role in farm pest management and crop protection. In this study, measurement of hyperspectral leaf reflectance in rice crop (Oryzasativa L.) was conducted on groups of healthy and infected leaves by the fungus Bipolaris oryzae (Helminthosporium oryzae Breda. de Hann) through the wavelength range from 350 to 2 500 nm. The percentage of leaf surface lesions was estimated and defined as the disease severity. Statistical methods like multiple stepwise regression, principal component analysis and partial least-square regression were utilized to calculate and estimate the disease severity of rice brown spot at the leaf level. Our results revealed that multiple stepwise linear regressions could efficiently estimate disease severity with three wavebands in seven steps. The root mean square errors (RMSEs) for training (n=210) and testing (n=53) dataset were 6.5% and 5.8%, respectively. Principal component analysis showed that the first principal component could explain approximately 80% of the variance of the original hyperspectral reflectance. The regression model with the first two principal components predicted a disease severity with RMSEs of 16.3% and 13.9% for the training and testing dataset, respec-tively. Partial least-square regression with seven extracted factors could most effectively predict disease severity compared with other statistical methods with RMSEs of 4.1% and 2.0% for the training and testing dataset, respectively. Our research demon-strates that it is feasible to estimate the disease severity of rice brown spot using hyperspectral reflectance data at the leaf level.
文摘Aim New statistical method was applied in data analysis of orthogonal experiments to optimize the preparation of liposome. Method Particle size, zeta potential, encapsulation efficiency and physical stability of liposomes were selected by orthogonal design as evaluating indicators. Through three statistical methods (direct observation, variance analysis and stepwise multiple regression), the optimized preparing conditions were acquired and validated by experiment. Results All of the four indicators were different by these analyses. The validation experiments indicated that the optimized conditions by stepwise multiple regressions were better than that by traditional analysis. Conclusion Experiment results suggested that multiple regressions could avoid the weakness of direct observation and variance analysis, but more work should be done in preparing liposomes.
文摘A statistical analysis was conducted on the feeding behavior of 106 York breeding pigs.Pearson correlation analysis,principal component correlation analysis and multiple stepwise regression equation methods were applied to establish regression equations of the York breeding pigs total feed intake per time and average feed intake per time with corrected fat thickness,feed conversion rate,and corrected daily gain.The results showed that:①there were three peak feed intake periods for the pigs,and the correlation coefficient between the feed intake and the corrected fat thickness of the pigs in the 24 h period was positive or negative,that is,increasing the number of feeding times and the feed intake was not necessarily conducive to the fat thickness accumulation,but the breeding goal of fat thickness could be achieved by controlling the feeding times and feed intake;②the average feed intake of pigs in the 60-90 kg body weight stage was 30%-50%higher than that of the 30-60 kg body weight stage,but the number of feeding times decreased,the peak feeding time was more concentrated,and the feeding duration per time was 3.0 min longer,indicating that as the weight of pigs increased,the feed intake increased significantly;and③the stepwise regression equations and the principal component equations showed that the feeding behavior of York pigs in the 30-90 kg growth stage was not only affected by the feeding time within 24 h,but also by environmental factors such as temperature and humidity.The feeding behavior of York pigs is a complex process of interaction between environmental factors and animal factors.
Abstract: This paper compares variable selection methods for multiple linear regression models whose full model contains both relevant and irrelevant variables, when the predictor variables are highly correlated (0.999). The two objective functions used in the Tabu Search are the mean square error (MSE) and the mean absolute error (MAE). The results of the Tabu Search are compared with those obtained by the stepwise regression method on the basis of the hit percentage criterion. The simulations cover both cases, without and with multicollinearity problems. For each situation, 1,000 iterations are run with sample sizes n = 25 and n = 100 at the 0.05 level of significance. Without multicollinearity, the hit percentages of the stepwise regression method and of the Tabu Search using the MSE objective function are almost the same, and slightly higher than that of the Tabu Search using the MAE objective function. With multicollinearity, however, the hit percentages of the Tabu Search using either objective function are higher than the hit percentage of the stepwise regression method.
Abstract: This paper gives an overview of an important statistical technique, stepwise multiple linear regression, and describes its link with earthquake location. Specifically, the aim of this research is to show how stepwise multiple linear regression contributes to solving the earthquake location problem, describing its conditions of use in HYPO71PC, a software package for computing the location of seismic sources. This aim is achieved through a concrete case, the location of earthquakes occurring on Mount Vesuvius, Italy.
Funding: Supported by the Guizhou Agricultural Research Project (QKH[2019]2279), the Construction of Guizhou Breeding Livestock and Poultry Genetic Resources Testing Platform (QKZYD[2018]4015), and the Scientific and Technological Innovation Talent Team of Major Livestock and Poultry Genome Big Data Analysis and Application Research in Guizhou Province (QKHPTRC[2019]5615).
Abstract: The total output value of mutton in Northwest China has accounted for more than 60% of the total output value of animal husbandry over the years. The mutton industry in Northwest China therefore plays a pivotal role not only in animal husbandry but also in Chinese agriculture. This study draws on cost accounting theory, income-related theories and total factor productivity theory, on basic statistics and economics, and on existing research results at home and abroad. Combining qualitative analysis with quantitative analysis by multiple stepwise regression in SAS, it investigates the changing cost-benefit trends of mutton sheep breeding in the agricultural and pastoral areas of Northwest China and the factors influencing production costs and production efficiency, aiming to provide a reference for saving feed resources, reducing breeding costs, and improving the benefits of mutton sheep breeding.
Abstract: With the expansion of gene expression profile databases, extracting genes while losing as little information as possible, or while retaining the most critical information, has become a main research direction. This paper first excludes 1561 irrelevant genes using a weighted distance definition, and then removes 252 redundant genes using Pearson's correlation coefficient. Finally, by comparing two methods, stepwise regression after clustering and stepwise regression alone, we obtain the best combination of 8 genes.
Abstract: Suppression effects in multiple regression analysis may be more common in research than is currently recognized. We review several relevant studies that treat the concept and types of suppressor variables. We also highlight systematic ways to identify suppression effects in multiple regression using statistics such as R², sums of squares, regression weights, and the comparison of zero-order correlations with the Variance Inflation Factor (VIF). We further establish that the suppression effect is a function of multicollinearity; however, a suppressor variable should only be allowed in a regression analysis if its VIF is less than five (5).
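The VIF screening rule mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual definition VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor j on the remaining predictors. The function name `vif` and the synthetic data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + remaining predictors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out

# Synthetic example (assumed data): x3 is nearly a copy of x1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.1 * rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

vifs = vif(X)
keep = vifs < 5.0  # the "VIF less than five" threshold quoted in the abstract
```

Here the nearly collinear pair `x1` and `x3` receives large VIF values and would be flagged, while the independent `x2` stays close to 1.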
Abstract: There are various analytical, empirical and numerical methods for calculating groundwater inflow into tunnels excavated in rocky media. Analytical methods have been widely applied to predict groundwater inflow into tunnels because of their simplicity and practical theoretical basis. Investigations show that the real amount of water infiltrating into jointed tunnels is much less than the amount calculated by analytical methods, and that the results depend strongly on the tunnel's geometry and environmental conditions. In this study, a new empirical model for estimating groundwater seepage into circular tunnels was introduced using multiple regression analysis. Our data were acquired from field surveys and laboratory analysis of core samples. New regression variables were defined after examining single-variable and two-variable relationships between groundwater seepage and the other variables. Finally, an appropriate model for estimating leakage was obtained using the stepwise algorithm. Statistics such as R, R2 and R2e, together with the histogram of residual values, indicate that the model fits well and is suitable for estimating groundwater seepage into tunnels. The new model was applied to the test data and the results were satisfactory. Therefore, multiple regression analysis is an effective and efficient way to estimate groundwater seepage into tunnels.
Funding: Supported by the China Scholarship Council (CSC) (201903250115), the National Natural Science Foundation of China (31972515), and the China Agriculture Research System of MOF and MARA (CARS-09-P31).
Abstract: Understanding the spatial-temporal dynamics of crop nitrogen (N) use efficiency (NUE) and its relationship with explanatory environmental variables can support land-use management and policymaking. Nevertheless, the application of statistical models for evaluating the explanatory variables of space-time variation in crop NUE is still under-researched. In this study, stepwise multiple linear regression (SMLR) and Random Forest (RF) were used to evaluate the spatial and temporal variation of NUE indicators (i.e., partial factor productivity of N (PFPN) and partial nutrient balance of N (PNBN)) at county scale in Northeast China (Heilongjiang, Liaoning and Jilin provinces) from 1990 to 2015. Explanatory variables included agricultural management practices, topography, climate, economy, soil and crop types. Results revealed that PFPN was higher in the northern parts and lower in the center of Northeast China, and that PNBN increased from the southern to the northern parts during the 1990–2015 period. The NUE indicators decreased with time in most counties during the study period. The model efficiency coefficients of the SMLR and RF models were 0.44 and 0.84 for PFPN, and 0.67 and 0.89 for PNBN, respectively. The RF model assigned higher relative importance to soil and climatic covariates and lower relative importance to crop covariates than the SMLR model. The planting area index of vegetables and beans, soil clay content, saturated water content, enhanced vegetation index in November and December, soil bulk density, and annual minimum temperature were the main explanatory variables for both NUE indicators. This is the first study to show the quantitative relative importance of explanatory variables for NUE at county level in Northeast China using RF and SMLR. It provides reference measurements for improving crop NUE, which is one of the most effective means of managing N for sustainable development, ensuring food security, alleviating environmental degradation and increasing farmers' profitability.
Abstract: Objectives: The objectives of this study are to use CART (classification and regression tree) and stepwise regression to 1) define the predictors of quality of life in ACS (acute coronary syndrome) patients, using demographics, ACS symptoms, and anxiety as independent variables; and 2) discuss and compare the results of these two statistical approaches. Background: In outcome studies of ACS, CART is a good alternative to linear regression; however, CART is rarely used. Methods: A descriptive survey design was used, with 100 participants recruited. Results and Conclusions: Anxiety is the most significant predictor of quality of life, and a stronger predictor than the symptoms of ACS. The anxiety level that patients experienced at the time the heart attack occurred can be used to predict quality of life a month later. Furthermore, the majority of ACS patients experienced a moderate to high level of anxiety during a heart attack.
Abstract: In this paper, downscaling models are developed using various linear regression approaches, namely direct, forward, backward and stepwise regression, to downscale GCM output and predict mean monthly precipitation under IPCC SRES scenarios at watershed-basin scale in an arid region of India. The effectiveness of these regression approaches is evaluated by downscaling the predictand for the Pichola lake region in Rajasthan state in India, which is considered to be a climatically sensitive region. The predictor variables are extracted from (1) the National Centers for Environmental Prediction (NCEP) reanalysis dataset for the period 1948–2000, and (2) the simulations from the third-generation Canadian Coupled Global Climate Model (CGCM3) for emission scenarios A1B, A2, B1 and COMMIT for the period 2001–2100. The selection of important predictor variables becomes a crucial issue in developing downscaling models, since reanalysis data are based on a wide range of meteorological measurements and observations. Direct regression was found to perform better than the other regression techniques explored in the present study. The results of the downscaling models using both approaches show that precipitation is likely to increase in the future for the A1B, A2 and B1 scenarios, whereas no trend is discerned for COMMIT.
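The forward-selection variant named in this abstract can be sketched as follows. This is a simplified illustration and not the authors' downscaling code: it covers only the "enter" step of stepwise regression, uses a partial F statistic with an assumed threshold `f_enter`, and runs on synthetic data; all names are hypothetical.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of an OLS fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ coef
    return float(r @ r)

def forward_select(X, y, f_enter=4.0):
    """Greedy forward selection: repeatedly add the predictor with the largest
    partial F statistic, stopping when no candidate exceeds f_enter."""
    n, p = X.shape
    selected = []
    current = float(((y - y.mean()) ** 2).sum())  # intercept-only model
    while len(selected) < p:
        best_j, best_f, best_rss = None, f_enter, None
        for j in range(p):
            if j in selected:
                continue
            new = rss(X[:, selected + [j]], y)
            df = n - len(selected) - 2  # residual df after adding predictor j
            if df <= 0 or new <= 0.0:
                continue
            f = (current - new) / (new / df)
            if f > best_f:
                best_j, best_f, best_rss = j, f, new
        if best_j is None:
            break
        selected.append(best_j)
        current = best_rss
    return selected

# Synthetic example (assumed data): only columns 0 and 2 drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.5 * rng.normal(size=100)
selected = forward_select(X, y)
```

A full stepwise procedure would additionally re-test the already selected predictors after each entry and remove any whose partial F falls below a removal threshold.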
Funding: Supported by a grant from the National Health Department of China (2008ZX10005-009) and the Roche company.
Abstract: Objective: This study was undertaken to investigate the factors influencing serum ALT level and hepatitis C virus (HCV) RNA titer in chronic hepatitis C (CHC) patients. Methods: All patients enrolled in this study were anti-HCV positive. A retrospective tracing method was applied to determine serum ALT level and HCV RNA titer and to collect general information on the patients, such as gender, age group, interferon medication history, infection pathway, height and weight. Multi-factor analysis was then performed with a binomial logistic regression model. Results: The abnormal rate of ALT level was positively correlated with HCV RNA and gender and negatively correlated with interferon medication history and age group, with Wald values for the 4 factors of 39.604, 11.823, 18.991 and 7.389, respectively. The positive rate of HCV RNA was negatively correlated with interferon medication history and gender and positively correlated with ALT level, with corresponding Wald values for the 3 factors of 81.394, 7.618 and 27.562, respectively. Conclusions: The normal ALT level in HCV-infected patients was associated with viral load, age, gender and interferon medication history, while the normal rate of HCV RNA titer was closely associated with gender, interferon medication history and ALT level.
Funding: Supported by the Research Support Funds Grant (RSFG) program of Indiana University-Purdue University at Indianapolis.
Abstract: Partial least squares (PLS) regression was applied to the Lunar Soil Characterization Consortium (LSCC) dataset for spectral estimation of TiO2. The LSCC dataset was split into a number of subsets, including the low-Ti, high-Ti, total mare, total highland, Apollo 16, and Apollo 14 soils, to investigate the effects of interfering minerals and nonlinearity on PLS performance. The PLS weight loading vectors were analyzed through stepwise multiple regression analysis (SMRA) to identify the mineral species driving and interfering with PLS performance. PLS performs well in estimating TiO2 for the LSCC low-Ti and high-Ti mare samples and for both groups analyzed together. The results suggest that, while the dominant TiO2-bearing minerals are few, additional PLS factors are required to compensate for the effects on the important PLS factors of minerals that are not highly correlated with TiO2, to accommodate nonlinear relationships between reflectance and TiO2, and to correct inconsistent mineral-TiO2 correlations between the high-Ti and low-Ti mare samples. Analysis of the LSCC highland soil samples indicates that the Apollo 16 soils are responsible for the large errors in the TiO2 estimates when the soils are modeled with other subgroups. For the LSCC Apollo 16 samples, the dominant spectral effects of plagioclase over other dark minerals are primarily responsible for the large errors in estimated TiO2. For the Apollo 14 soils, the more accurate estimation of TiO2 is attributed to the positive correlation between a major TiO2-bearing component and TiO2, which explains why the Apollo 14 soils follow the regression trend when analyzed with other soil groups.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 40675023 and 41065002, and the Key Natural Science Foundation of Guangxi Province under Grant No. 0832019Z.
Abstract: The prediction accuracy of the traditional stepwise regression prediction equation (SRPE) is affected by the multicollinearity among its predictors. This paper introduces condition number analysis into prediction modeling to minimize the multicollinearity in the SRPE. In condition number prediction modeling, the condition number is used to select the combination of predictors with the lowest multicollinearity from the possible combinations of a number of candidate predictors (variables), and the selected combination is then used to construct the condition number regression prediction equation (CNRPE). This novel prediction modeling is applied to typhoon track prediction, which is a difficult task among meteorological disaster predictions. Six pairs of typhoon track latitude/longitude SRPEs and CNRPEs for July, August, and September are built with the traditional and the novel prediction modeling approaches, respectively, using a large number of identical modeling samples. The comparative analysis indicates that, with the same candidate predictors (variables) and predictands (dependent variables), although the fitting accuracy of the novel prediction models for the historical samples of South China Sea (SCS) typhoon tracks is slightly lower than that of the traditional prediction models, the prediction accuracy for the independent samples is clearly improved: the averaged prediction error of the novel models for July, August, and September is 153.9 km, which is 75.3 km smaller than that of the traditional models (a reduction of 33%). This is because the novel prediction modeling effectively minimizes the multicollinearity through computation and analysis of the condition number. It is further shown that when F = 1.0, 2.0, and 3.0, the average prediction errors of the traditional SRPEs are clearly larger than those of the CNRPEs. Moreover, extremely large and unreasonable prediction errors occur at some individual points of the typhoon track predicted by the SRPEs, owing to the multicollinearity in the combination of predictors.
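The selection step described in this abstract, choosing the predictor combination with the lowest condition number, can be illustrated with a rough sketch. The abstract does not specify how the condition number is computed or how candidate combinations are enumerated, so the choices here (2-norm condition number of the standardized predictor matrix, exhaustive search over subsets of a fixed size `k`, synthetic data, and all function names) are assumptions for illustration only.

```python
import itertools
import numpy as np

def condition_number(X):
    """2-norm condition number of the column-standardized predictor matrix."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    return float(np.linalg.cond(Z))

def best_subset_by_condition_number(X, k):
    """Return the k-column subset with the smallest condition number."""
    best_cols, best_cond = None, np.inf
    for cols in itertools.combinations(range(X.shape[1]), k):
        c = condition_number(X[:, list(cols)])
        if c < best_cond:
            best_cols, best_cond = cols, c
    return best_cols, best_cond

# Synthetic candidates (assumed data): column 2 is nearly a copy of column 0.
rng = np.random.default_rng(0)
x0 = rng.normal(size=300)
x1 = rng.normal(size=300)
x2 = x0 + 0.1 * rng.normal(size=300)
x3 = rng.normal(size=300)
X = np.column_stack([x0, x1, x2, x3])

best_cols, best_cond = best_subset_by_condition_number(X, k=2)
collinear_cond = condition_number(X[:, [0, 2]])
```

The selected pair avoids combining the two nearly collinear columns, whose joint condition number (`collinear_cond`) is much larger than that of the chosen subset. A regression equation would then be fitted on the selected columns only.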