To accurately model flows with shock waves using staggered-grid Lagrangian hydrodynamics, an artificial viscosity has to be introduced to convert kinetic energy into internal energy, thereby increasing the entropy across shocks. Determining the appropriate strength of the artificial viscosity is an art and strongly depends on the particular problem and the experience of the researcher. The objective of this study is to pose the problem of finding the appropriate strength of the artificial viscosity as an optimization problem and to solve this problem using machine learning (ML) tools, specifically surrogate models based on Gaussian Process regression (GPR) and Bayesian analysis. We describe the optimization method and discuss various practical details of its implementation. The shock-containing problems to which we apply this method have all been implemented in the LANL code FLAG (Burton in Connectivity structures and differencing techniques for staggered-grid free-Lagrange hydrodynamics, Tech. Rep. UCRL-JC-110555, Lawrence Livermore National Laboratory, Livermore, CA, 1992, 1992, in Consistent finite-volume discretization of hydrodynamic conservation laws for unstructured grids, Tech. Rep. CRL-JC-118788, Lawrence Livermore National Laboratory, Livermore, CA, 1992, 1994, Multidimensional discretization of conservation laws for unstructured polyhedral grids, Tech. Rep. UCRL-JC-118306, Lawrence Livermore National Laboratory, Livermore, CA, 1992, 1994, in FLAG, a multi-dimensional, multiple mesh, adaptive free-Lagrange, hydrodynamics code. In: NECDC, 1992). First, we apply ML to find optimal values for isolated shock problems of different strengths. Second, we apply ML to optimize the viscosity for a one-dimensional (1D) propagating detonation problem based on Zel’dovich-von Neumann-Döring (ZND) detonation theory (Fickett and Davis in Detonation: theory and experiment. Dover books on physics. Dover Publications, Mineola, 2000) using a reactive burn model. We compare results for default values (those currently used in FLAG) and optimized values of the artificial viscosity for these problems, demonstrating the potential for significant improvement in the accuracy of computations.
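The surrogate-based optimization described above can be illustrated with a minimal sketch. It is not the authors' implementation: the error function `toy_error`, the kernel settings, and the lower-confidence-bound rule for choosing the next coefficient are all assumptions made here for illustration, and a real study would replace `toy_error` with the error of a FLAG simulation.

```python
import numpy as np

LENGTH_SCALE = 0.3  # assumed kernel length scale (hypothetical choice)
PRIOR_VAR = 1.0     # assumed prior variance of the Gaussian process

def rbf_kernel(a, b):
    """Squared-exponential kernel between two 1D arrays of points."""
    d = a[:, None] - b[None, :]
    return PRIOR_VAR * np.exp(-0.5 * (d / LENGTH_SCALE) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-8):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.clip(PRIOR_VAR - np.sum(v ** 2, axis=0), 0.0, None)
    return mean, np.sqrt(var)

def toy_error(q):
    """Hypothetical stand-in for simulation error vs. viscosity coefficient q."""
    return (q - 0.6) ** 2 + 0.05 * np.sin(8.0 * q)

# A few "expensive" evaluations of the error at trial coefficients.
x_train = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
y_train = toy_error(x_train)

# Fit the surrogate on centered data and predict on a fine grid.
x_query = np.linspace(0.0, 1.0, 201)
offset = y_train.mean()
mean, std = gp_posterior(x_train, y_train - offset, x_query)
mean = mean + offset

# Lower confidence bound: favors low predicted error and high uncertainty.
lcb = mean - 2.0 * std
next_q = float(x_query[np.argmin(lcb)])
print("next coefficient to evaluate:", next_q)
```

In a full loop, `next_q` would be evaluated with the simulation, appended to the training set, and the surrogate refitted until the suggested coefficient stops changing.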
Carbon emissions have become a critical concern in the global effort to combat climate change, with each country or region contributing differently based on its economic structures, energy sources, and industrial activities. The factors influencing carbon emissions vary across countries and sectors. This study examined the factors influencing CO₂ emissions in seven South American countries: Argentina, Brazil, Chile, Colombia, Ecuador, Peru, and Venezuela. We used the Seemingly Unrelated Regression (SUR) model to analyse the relationship of CO₂ emissions with gross domestic product (GDP), renewable energy use, urbanization, industrialization, international tourism, agricultural productivity, and forest area based on data from 2000 to 2022. According to the SUR model, GDP and industrialization had a moderate positive effect on CO₂ emissions, whereas renewable energy use had a moderate negative effect. International tourism generally had a positive impact on CO₂ emissions, while forest area tended to decrease them. Different variables had different effects on CO₂ emissions in the seven countries. In Argentina and Venezuela, GDP, international tourism, and agricultural productivity significantly affected CO₂ emissions. In Colombia, GDP and international tourism had a negative impact on CO₂ emissions. In Brazil, CO₂ emissions were primarily driven by GDP, while in Chile, Ecuador, and Peru, international tourism had a negative effect on CO₂ emissions. Overall, this study highlights the importance of country-specific strategies for reducing CO₂ emissions and emphasizes the varying roles of these driving factors in shaping environmental quality in the seven South American countries.
Purpose: The purpose of this study is to develop and compare model choice strategies in the context of logistic regression. Model choice means the choice of the covariates to be included in the model. Design/methodology/approach: The study is based on Monte Carlo simulations. The methods are compared in terms of three measures of accuracy: specificity and two kinds of sensitivity. A loss function combining sensitivity and specificity is introduced and used for a final comparison. Findings: The choice of method depends on how much the users emphasize sensitivity against specificity. It also depends on the sample size. For a typical logistic regression setting with a moderate sample size and a small to moderate effect size, either BIC, BICc or Lasso seems to be optimal. Research limitations: Numerical simulations cannot cover the whole range of data-generating processes occurring with real-world data. Thus, more simulations are needed. Practical implications: Researchers can refer to these results if they believe that their data-generating process is somewhat similar to some of the scenarios presented in this paper. Alternatively, they could run their own simulations and calculate the loss function. Originality/value: This is a systematic comparison of model choice algorithms and heuristics in the context of logistic regression. The distinction between two types of sensitivity and a comparison based on a loss function are methodological novelties.
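The accuracy measures named in this abstract can be sketched for a single simulated run as follows. The abstract does not define its two kinds of sensitivity or its loss function, so the single sensitivity measure and the weighted loss below are assumptions for illustration only; the BIC formula is the standard one.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Standard Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def selection_accuracy(true_covariates, selected, n_covariates):
    """Sensitivity and specificity of a covariate selection.

    Sensitivity: share of truly relevant covariates that were selected.
    Specificity: share of irrelevant covariates that were left out.
    """
    true_set, chosen = set(true_covariates), set(selected)
    null_set = set(range(n_covariates)) - true_set
    sensitivity = len(chosen & true_set) / len(true_set) if true_set else 1.0
    specificity = len(null_set - chosen) / len(null_set) if null_set else 1.0
    return sensitivity, specificity

def weighted_loss(sensitivity, specificity, weight=0.5):
    """Hypothetical loss: weighted sum of the two error rates (an assumption)."""
    return weight * (1.0 - sensitivity) + (1.0 - weight) * (1.0 - specificity)

# Ten candidate covariates; 0, 1, 2 are truly relevant; a method picked 0, 1, 5.
sens, spec = selection_accuracy({0, 1, 2}, {0, 1, 5}, 10)
print(sens, spec, weighted_loss(sens, spec))
```

Raising `weight` toward 1 penalizes missed covariates more heavily, which mirrors the finding that the preferred method depends on how much the user emphasizes sensitivity against specificity.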
Combining a linear regression and a temperature budget formula, a multivariate regression model is proposed to parameterize and estimate sea surface temperature (SST) cooling induced by tropical cyclones (TCs). Three major dynamic and thermodynamic processes governing the TC-induced SST cooling (SSTC), namely vertical mixing, upwelling and heat flux, are parameterized empirically using a combination of multiple atmospheric and oceanic variables: sea surface height (SSH), wind speed, wind curl, TC translation speed and surface net heat flux. The regression model fits reasonably well with 10-year statistical observations/reanalysis data obtained from 100 selected TCs in the northwestern Pacific during 2001–2010, with an averaged fitting error of 0.07 and a mean absolute error of 0.72°C between diagnostic and observed SST cooling. The results reveal that vertical mixing is overall the predominant process producing ocean SST cooling, accounting for 55% of the total cooling. Upwelling accounts for 18% of the total cooling, and its maximum occurs near the TC center, associated with TC-induced Ekman pumping. The surface heat flux accounts for 26% of the total cooling, and its contribution increases towards the tropics and the continental shelf. The ocean thermal structures, represented by the SSH in the regression model, play an important role in modulating the SST cooling pattern. The concept of the regression model is applicable in TC weather prediction models to improve SST parameterization schemes.
BACKGROUND The worldwide spread of the severe acute respiratory syndrome coronavirus 2 outbreak has caused concern regarding the mortality rate caused by the infection. The determinants of mortality on a global scale cannot be fully understood due to lack of information. AIM To identify key factors that may explain the variability in case lethality across countries. METHODS We identified 21 potential risk factors for the coronavirus disease 2019 (COVID-19) case fatality rate for all the countries with available data. We examined the univariate relationships of each variable with the case fatality rate (CFR) and with all other independent variables to identify candidate variables for our final multiple model. Multiple regression analysis was used to assess the strength of the relationships. RESULTS The mean COVID-19 mortality was 1.52±1.72%. There was a statistically significant inverse correlation of health expenditure and of the number of computed tomography scanners per 1 million with CFR, and a significant direct correlation of literacy and of air pollution with CFR. The final model can predict approximately 97% of the changes in CFR. CONCLUSION The current study proposes some new predictors that help explain the mortality rate. Thus, it could help decision-makers develop health policies to fight COVID-19.
The desired economics of hard rock surface mining is mainly determined by the parameters of process design which minimize the overall cost per tonne of the rock mined in drilling, blasting, handling and primary crushing in given rockmass conditions. The most effective parameters of process design could be established based on regression models of the cumulative influence of rockmass and mine design parameters on the overall cost per tonne of the rock drilled, blasted, handled and crushed. These models could be developed from the large body of data accumulated worldwide on the costs per tonne of hard rock surface mining in drilling, blasting, handling and primary crushing vs the parameters of rockmass and mine design. This paper deals only with the development of regression models for oversize generation, blasthole productivity and blasting cost for iron ore surface mines, for which data are available. The SPSS standard statistical correlation-regression analysis software was used in the analysis. Interpretation of the models generated shows that the individual effects of the determinant rockmass and blast design parameters on oversize generation, blasthole productivity and blasting cost all agree with the findings of other researchers and with the theory of explosive rock fragmentation, and that the models could be used to estimate oversize generation, blasthole productivity and blasting cost in rockmass and blast design conditions similar to those of the iron ore surface mines examined in this study. However, the regression models obtained here could not be used alone for the optimization of blast design, because most of the determinant parameters also have conflicting effects on the other processes of drilling, handling and primary crushing of the blasted rock. Also, the quality and content of the regression models could be enhanced further by increasing the number of rockmass and blast design parameters and the volume of data considered in the regression analysis.
The importance of efficient water quality monitoring and of ensuring the safety of drinking water by government agencies cannot be overemphasized in areas where the resource is constantly depleted due to anthropogenic or natural factors. This holds for West Texas, and for Midland and Odessa in particular. Two machine learning regression algorithms (Random Forest and XGBoost) were employed to develop models for the prediction of total dissolved solids (TDS) and sodium adsorption ratio (SAR) for efficient water quality monitoring of two vital aquifers: the Edwards-Trinity (Plateau) and Ogallala aquifers. These two aquifers have contributed immensely to providing water for domestic, agricultural, industrial and other uses. The data were obtained from the Texas Water Development Board (TWDB). The XGBoost and Random Forest models used in this study gave an accurate prediction of the observed data (TDS and SAR) for both the Edwards-Trinity (Plateau) and Ogallala aquifers, with R² values consistently greater than 0.83. The Random Forest model gave a better prediction of TDS and SAR concentration, with an average R, MAE, RMSE and MSE of 0.977, 0.015, 0.029 and 0.00, respectively. For XGBoost, an average R, MAE, RMSE, and MSE of 0.953, 0.016, 0.037 and 0.00, respectively, were achieved. The overall performance of the models was impressive. This study indicates that Random Forest and XGBoost are appropriate for water quality prediction and monitoring in an area of high hydrocarbon activity like Midland and Odessa, and West Texas at large.
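The error metrics quoted in this abstract (R, MAE, RMSE and MSE) can be computed as in the sketch below. The sample values are made up for illustration and are not the study's data; the function name `regression_metrics` is likewise an assumption.

```python
import math

def regression_metrics(observed, predicted):
    """Return Pearson R, MAE, RMSE and MSE for paired observations."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)
    return {"R": r, "MAE": mae, "RMSE": rmse, "MSE": mse}

# Illustrative (made-up) normalized values, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Because MSE is the square of RMSE, the reported RMSE values of 0.029 and 0.037 correspond to MSE values of about 0.0008 and 0.0014, which is consistent with the reported MSE of 0.00 after rounding to two decimals.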
Possible changes in the structure and seasonal variability of the subtropical ridge may lead to changes in the modes of rainfall variability over the Caribbean region. This creates additional difficulties for water resource planning; obtaining seasonal prediction models that allow these variations to be characterized in detail is therefore a concern, especially for island states. This research proposes the construction of statistical-dynamic models based on PCA regression methods. The predictand is the accumulated monthly precipitation, while the six predictors are extracted from the ECMWF-SEAS5 ensemble mean forecasts with a lag of one month with respect to the target month. In the construction of the models, two sequential training schemes are evaluated, and only the shorter one is found to preserve the seasonal characteristics of the predictand. The evaluation metrics used, which combine cell-point and dichotomous methodologies, suggest that the predictors related to sea surface temperatures do not adequately represent the seasonal variability of the predictand; however, others such as the temperature at 850 hPa and the Outgoing Longwave Radiation are represented with a good approximation regardless of the model chosen. In this sense, the models built with the nearest neighbor methodology were the most efficient. Using the individual models with the best results, an ensemble is built that improves on the individual skill of its member models by correcting the underestimation of precipitation in the dynamic model during the wet season, although problems of overestimation persist for thresholds lower than 50 mm.
In this paper, a logistic regression (LR) statistical analysis is presented for a set of variables used in experimental measurements of the phenomenon commonly known as the “slinky mode” (SM) in reversed field pinch (RFP) machines, which is observed to travel around the torus in the Madison Symmetric Torus (MST). The LR analysis is used together with the modified Sine-Gordon dynamic equation model to predict with high confidence whether the slinky mode will lock or not lock, compared against the experimentally measured motion of the slinky mode. It is observed that under certain conditions, the slinky mode “locks” at or near the intersection of poloidal and/or toroidal gaps in MST. A locked mode ceases to travel around the torus, while an unlocked mode keeps traveling without a change in energy, making it hard to determine an exact set of conditions to predict locking/unlocking behaviour. The significant key model parameters determined by LR analysis are shown to improve the Sine-Gordon model’s ability to determine the locking/unlocking of magnetohydrodynamic (MHD) modes. The LR analysis of measured variables provides high confidence in anticipating locking versus unlocking of the slinky mode, as shown by comparisons between simulations and the experimentally measured motion of the slinky mode in MST.
This study aims to analyze and predict the relationship between the average price per box in the cigarette market of City A and government procurement, providing a scientific basis and support for decision-making. By reviewing relevant theories and literature, qualitative prediction methods, regression prediction models, and other related theories were explored. Through the analysis of annual cigarette sales data and government procurement data in City A, a comprehensive understanding of the development of the tobacco industry and the economic trends of tobacco companies in the county was obtained. By predicting and analyzing the average price per box of cigarette sales across different years, corresponding prediction results were derived and compared with actual sales data. The prediction results indicate that the correlation coefficient between the average price per box of cigarette sales and government procurement is 0.982, implying that government procurement explains about 96.4% of the variation in the average price per box of cigarettes. These findings offer an in-depth exploration of the relationship between the average price per box of cigarettes in City A and government procurement, providing a scientific foundation for corporate decision-making and market operations.
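The step from the correlation coefficient to the 96.4% figure follows from the fact that, for a regression with a single predictor, the coefficient of determination equals the square of the correlation coefficient. Assuming the reported model is such a single-predictor regression (the abstract does not say so explicitly), the arithmetic is:

```latex
R^2 = r^2 = 0.982^2 = 0.964324 \approx 96.4\%
```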
The mortality rate of gastric cancer is about 20.93 per 100,000, the highest of any malignancy in China. Researchers in China are currently interested in studying postoperative survival models using multivariate analysis methods such as stepwise regression. The proportional hazards model introduced by Cox (1972) is more advanced than other regression methods: it does not require assuming a distribution for survival time, and it handles censored data easily, which is otherwise difficult. This paper presents the first application of the Cox model to the survival analysis of gastric cancer in China. The survival analysis system (SAS-Ⅰ) software compiled by the author includes multivariate analysis by the Cox model, PV analysis and estimation of the survival function, which could provide useful information to surgeons treating cancer patients.
Selecting the most cost-effective parameters of hard rock surface mining requires considering all alternative variants of mine design and the conflicting effects of their parameters on cost. Such consideration could be realized based on a mathematical model of the cumulative influence of rockmass and mine design variables on the overall cost per tonne of the hard rock drilled, blasted, hauled and primary crushed. Available works on the topic mostly treat the four processes of hard rock surface mining separately. This paper presents the theoretical part of research proposed to make the selection of hard rock surface mining design parameters more effective, based on a regression model of the overall cost per tonne of rock mined fitted to the determinant variations of rockmass and mine design. The regression model could be developed from the statistical data generated by the many hard rock surface mines operating in variable conditions of rockmass and mine design worldwide. Also, a general algorithm based on the regression model has been formulated for the development of software for computer-aided selection of the most cost-effective parameters of hard rock surface mining.
Under-fitting problems usually occur in regression models for dam safety monitoring. To overcome the local convergence of the regression, a genetic algorithm (GA) was proposed that uses real parameter coding, a ranking selection operator, an arithmetical crossover operator and a uniform mutation operator, with the least-squares error between the observed and computed values as its fitness function. The elitist strategy was used to improve the speed of convergence. The modified genetic algorithm was then applied to re-estimate the coefficients of the regression model, and a genetic regression model was set up. As an example, a slotted gravity dam in Northeast China was considered. The computational results show that the genetic regression model can solve the under-fitting problems well.
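The operators listed in this abstract can be sketched for a simple two-coefficient regression as follows. This is a minimal illustration under assumed settings (population size, mutation rate, bounds, linear rank weights, and a straight-line model), not the modified algorithm of the paper, whose details the abstract does not give.

```python
import random

def sse(coeffs, xs, ys):
    """Least-squares error of the model y = c0 + c1 * x (the fitness)."""
    return sum((coeffs[0] + coeffs[1] * x - y) ** 2 for x, y in zip(xs, ys))

def genetic_regression(xs, ys, pop_size=40, generations=150,
                       bounds=(-10.0, 10.0), mutation_rate=0.1, seed=0):
    """Real-coded GA with ranking selection, arithmetic crossover,
    uniform mutation and an elitist strategy (assumed settings)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(2)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda c: sse(c, xs, ys))   # best individual first
        history.append(sse(pop[0], xs, ys))
        children = [pop[0][:]]                   # elitism: keep the best
        # Ranking selection: selection weight depends on rank, not on fitness.
        weights = [pop_size - i for i in range(pop_size)]
        while len(children) < pop_size:
            p1, p2 = rng.choices(pop, weights=weights, k=2)
            a = rng.random()
            # Arithmetic crossover: convex combination of the parents.
            child = [a * g1 + (1.0 - a) * g2 for g1, g2 in zip(p1, p2)]
            # Uniform mutation: redraw a gene uniformly within the bounds.
            child = [rng.uniform(lo, hi) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
    best = min(pop, key=lambda c: sse(c, xs, ys))
    history.append(sse(best, xs, ys))
    return best, history

# Synthetic data generated from y = 1 + 2x (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]
best, history = genetic_regression(xs, ys)
print("best coefficients:", best, "final error:", history[-1])
```

Because the best individual is always copied into the next generation, the best error recorded in `history` can never increase, which is the property the elitist strategy is meant to guarantee.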
A fuzzy observations-based radial basis function neural network (FORBFNN) is presented for modeling nonlinear systems in which the observations of response are imprecise but can be represented as fuzzy membership functions. In the FORBFNN model, the weight coefficients of nodes in the hidden layer are identified by using the fuzzy expectation-maximization (EM) algorithm, whereas the optimal number of these nodes as well as the centers and widths of radial basis functions are automatically constructed by using a data-driven method. Namely, the method starts with an initial node, and then a new node is added in a hidden layer according to some rules. This procedure is not terminated until the model meets the preset requirements. The method considers both the accuracy and complexity of the model. Numerical simulation results show that the modeling method is effective, and the established model has high prediction accuracy.
The conventional single model strategy may be ill-suited due to the multiplicity of operation phases and system uncertainty. A novel global-local discriminant analysis (GLDA) based Gaussian process regression (GPR) approach is developed for the quality prediction of nonlinear and multiphase batch processes. After the collected data are preprocessed through batchwise unfolding, the hidden Markov model (HMM) is applied to identify different operation phases. A GLDA algorithm is also presented to extract the appropriate process variables highly correlated with the quality variables, decreasing the complexity of modeling. In addition, multiple local GPR models are built in the reduced-dimensional space for all the identified operation phases. Furthermore, the HMM-based state estimation is used to classify each measurement sample of a test batch into a corresponding phase with maximum likelihood estimation. The local GPR model for the specific phase is then selected for online prediction. The effectiveness of the proposed prediction approach is demonstrated through the multiphase penicillin fermentation process. The comparison results show that the proposed GLDA-GPR approach is superior to the regular GPR model and the GPR based on HMM (HMM-GPR) model.
In this paper, based on the theory of parameter estimation, we give a selection method that we consider very reasonable in the sense that it yields parameter estimates with good properties. Moreover, we offer a method for calculating the selection statistic and an applied example.
The construction method of background value is improved in the original multi-variable grey model (MGM(1,m)) from its source of construction errors. The MGM(1,m) with optimized background value is used to eliminate the random fluctuations or errors of the observational data of all variables, and the combined prediction model together with the multiple linear regression is established in order to improve the simulation and prediction accuracy of the combined model. Finally, a combined model of the MGM(1,2) with optimized background value and the binary linear regression is constructed by an example. The results show that the model has good effects for simulation and prediction.
In this paper we apply the nonlinear time series analysis method to small-time scale traffic measurement data. The prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, the local support vector machine prediction method is used to predict the traffic measurement data, and the BIC-based neighbouring point selection method is used to choose the number of the nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method whose neighbouring points are optimized can effectively predict the small-time scale traffic measurement data and can reproduce the statistical features of real traffic measurements.
This paper presents a semiparametric adjustment method suitable for general cases. Assuming that the regularizer matrix is positive definite, the calculation method is discussed and the corresponding formulae are presented. Finally, a simulated adjustment problem is constructed to explain the method given in this paper. The results from the semiparametric model and the G-M model are compared. The results demonstrate that the model errors or the systematic errors of the observations can be detected correctly with the semiparametric estimation method.
The Bailongjiang watershed in southern Gansu province is one of the most landslide-prone regions in China, characterized by a very high frequency of landslide occurrence. In order to predict landslide occurrence, a comprehensive map of landslide susceptibility is required, which may be significantly helpful in reducing loss of property and human life. In this study, an integrated model of the information value method and logistic regression is proposed that makes the most of their merits and overcomes their weaknesses, which may enhance the precision and accuracy of landslide susceptibility assessment. A detailed and reliable landslide inventory with 1587 landslides was prepared and randomly divided into two groups: (i) a training dataset and (ii) a testing dataset. Eight distinct landslide conditioning factors, including lithology, slope gradient, aspect, elevation, distance to drainages, distance to faults, distance to roads and vegetation coverage, were selected for landslide susceptibility mapping. The produced landslide susceptibility maps were validated by the success rate and prediction rate curves. The validation results show that the success rate and the prediction rate of the integrated model are 81.7% and 84.6%, respectively, which indicates that the proposed integrated method is reliable for producing an accurate landslide susceptibility map and that the results may be used for landslide management and mitigation.
Funding: This work was performed under the auspices of the National Nuclear Security Administration of the US Department of Energy at Los Alamos National Laboratory under Contract No. 89233218CNA000001. The authors gratefully acknowledge the support of the US Department of Energy National Nuclear Security Administration Advanced Simulation and Computing Program. LA-UR-22-33159.
文摘To accurately model flows with shock waves using staggered-grid Lagrangian hydrodynamics, the artificial viscosity has to be introduced to convert kinetic energy into internal energy, thereby increasing the entropy across shocks. Determining the appropriate strength of the artificial viscosity is an art and strongly depends on the particular problem and experience of the researcher. The objective of this study is to pose the problem of finding the appropriate strength of the artificial viscosity as an optimization problem and solve this problem using machine learning (ML) tools, specifically using surrogate models based on Gaussian Process regression (GPR) and Bayesian analysis. We describe the optimization method and discuss various practical details of its implementation. The shock-containing problems for which we apply this method all have been implemented in the LANL code FLAG (Burton in Connectivity structures and differencing techniques for staggered-grid free-Lagrange hydrodynamics, Tech. Rep. UCRL-JC-110555, Lawrence Livermore National Laboratory, Livermore, CA, 1992, 1992, in Consistent finite-volume discretization of hydrodynamic conservation laws for unstructured grids, Tech. Rep. CRL-JC-118788, Lawrence Livermore National Laboratory, Livermore, CA, 1992, 1994, Multidimensional discretization of conservation laws for unstructured polyhedral grids, Tech. Rep. UCRL-JC-118306, Lawrence Livermore National Laboratory, Livermore, CA, 1992, 1994, in FLAG, a multi-dimensional, multiple mesh, adaptive free-Lagrange, hydrodynamics code. In: NECDC, 1992). First, we apply ML to find optimal values to isolated shock problems of different strengths. Second, we apply ML to optimize the viscosity for a one-dimensional (1D) propagating detonation problem based on Zel’dovich-von Neumann-Doring (ZND) (Fickett and Davis in Detonation: theory and experiment. Dover books on physics. Dover Publications, Mineola, 2000) detonation theory using a reactive burn model. 
We compare results for default (currently used values in FLAG) and optimized values of the artificial viscosity for these problems demonstrating the potential for significant improvement in the accuracy of computations.
文摘Carbon emissions have become a critical concern in the global effort to combat climate change,with each country or region contributing differently based on its economic structures,energy sources,and industrial activities.The factors influencing carbon emissions vary across countries and sectors.This study examined the factors influencing CO_(2)emissions in the 7 South American countries including Argentina,Brazil,Chile,Colombia,Ecuador,Peru,and Venezuela.We used the Seemingly Unrelated Regression(SUR)model to analyse the relationship of CO_(2)emissions with gross domestic product(GDP),renewable energy use,urbanization,industrialization,international tourism,agricultural productivity,and forest area based on data from 2000 to 2022.According to the SUR model,we found that GDP and industrialization had a moderate positive effect on CO_(2)emissions,whereas renewable energy use had a moderate negative effect on CO_(2)emissions.International tourism generally had a positive impact on CO_(2)emissions,while forest area tended to decrease CO_(2)emissions.Different variables had different effects on CO_(2)emissions in the 7 South American countries.In Argentina and Venezuela,GDP,international tourism,and agricultural productivity significantly affected CO_(2)emissions.In Colombia,GDP and international tourism had a negative impact on CO_(2)emissions.In Brazil,CO_(2)emissions were primarily driven by GDP,while in Chile,Ecuador,and Peru,international tourism had a negative effect on CO_(2)emissions.Overall,this study highlights the importance of country-specific strategies for reducing CO_(2)emissions and emphasizes the varying roles of these driving factors in shaping environmental quality in the 7 South American countries.
Abstract: Purpose: The purpose of this study is to develop and compare model choice strategies in the context of logistic regression. Model choice means the choice of the covariates to be included in the model. Design/methodology/approach: The study is based on Monte Carlo simulations. The methods are compared in terms of three measures of accuracy: specificity and two kinds of sensitivity. A loss function combining sensitivity and specificity is introduced and used for a final comparison. Findings: The choice of method depends on how much the users emphasize sensitivity against specificity. It also depends on the sample size. For a typical logistic regression setting with a moderate sample size and a small to moderate effect size, either BIC, BICc or Lasso seems to be optimal. Research limitations: Numerical simulations cannot cover the whole range of data-generating processes occurring with real-world data. Thus, more simulations are needed. Practical implications: Researchers can refer to these results if they believe that their data-generating process is somewhat similar to some of the scenarios presented in this paper. Alternatively, they could run their own simulations and calculate the loss function. Originality/value: This is a systematic comparison of model choice algorithms and heuristics in the context of logistic regression. The distinction between two types of sensitivity and a comparison based on a loss function are methodological novelties.
Funding: The Major National Basic Research Development Program of China under contract No. 2016YFA0202704; the National Natural Science Foundation of China under contract Nos 41476008 and 41576018; the Basic Fund of Chinese Academy of Meteorological Sciences under contract No. 2017Z017; the Strategic Priority Research Program of the Chinese Academy of Sciences under contract No. XDA11010303.
Abstract: Combining a linear regression and a temperature budget formula, a multivariate regression model is proposed to parameterize and estimate sea surface temperature (SST) cooling induced by tropical cyclones (TCs). Three major dynamic and thermodynamic processes governing the TC-induced SST cooling (SSTC), namely vertical mixing, upwelling and heat flux, are parameterized empirically using a combination of multiple atmospheric and oceanic variables: sea surface height (SSH), wind speed, wind curl, TC translation speed and surface net heat flux. The regression model fits reasonably well with 10-year statistical observations/reanalysis data obtained from 100 selected TCs in the northwestern Pacific during 2001–2010, with an averaged fitting error of 0.07 and a mean absolute error of 0.72°C between diagnostic and observed SST cooling. The results reveal that vertical mixing is overall the predominant process producing ocean SST cooling, accounting for 55% of the total cooling. Upwelling accounts for 18% of the total cooling, and its maximum occurs near the TC center, associated with TC-induced Ekman pumping. The surface heat flux accounts for 26% of the total cooling, and its contribution increases towards the tropics and the continental shelf. The ocean thermal structures, represented by the SSH in the regression model, play an important role in modulating the SST cooling pattern. The concept of the regression model can be applied in TC weather prediction models to improve SST parameterization schemes.
Abstract: BACKGROUND: The worldwide spread of the severe acute respiratory syndrome coronavirus 2 outbreak has caused concern regarding the mortality rate caused by the infection. The determinants of mortality on a global scale cannot be fully understood due to lack of information. AIM: To identify key factors that may explain the variability in case lethality across countries. METHODS: We identified 21 potential risk factors for coronavirus disease 2019 (COVID-19) case fatality rate for all the countries with available data. We examined univariate relationships of each variable with case fatality rate (CFR), and among all independent variables, to identify candidate variables for our final multiple model. Multiple regression analysis was used to assess the strength of the relationships. RESULTS: The mean COVID-19 mortality was 1.52 ± 1.72%. There was a statistically significant inverse correlation of health expenditure and number of computed tomography scanners per 1 million with CFR, and a significant direct correlation of literacy and air pollution with CFR. The final model can explain approximately 97% of the variation in CFR. CONCLUSION: The current study proposes some new predictors that help explain the mortality rate. Thus, it could help decision-makers develop health policies to fight COVID-19.
Abstract: The economics of hard rock surface mining is mainly determined by the process design parameters that minimize the overall cost per tonne of rock mined in drilling, blasting, handling and primary crushing under given rockmass conditions. The most effective process design parameters could be established from regression models of the cumulative influence of rockmass and mine design parameters on the overall cost per tonne of the rock drilled, blasted, handled and crushed. These models could be developed from the large body of data accumulated worldwide on the costs per tonne of hard rock surface mining in drilling, blasting, handling and primary crushing versus the parameters of rockmass and mine design. This paper focuses only on the development of regression models for oversize generation, blasthole productivity and blasting cost for iron ore surface mines, for which data are available. The SPSS statistical software was used for the correlation and regression analysis. Interpretation of the models shows that the individual effects of the determinant rockmass and blast design parameters on oversize generation, blasthole productivity and blasting cost all agree with the findings of other researchers and with the theory of explosive rock fragmentation. The models could therefore be used to estimate oversize generation, blasthole productivity and blasting cost in rockmass and blast design conditions similar to those of the iron ore surface mines examined in this study. However, the regression models obtained here could not be used alone for the optimization of blast design, because most of the determinant parameters also have conflicting effects on the other processes of drilling, handling and primary crushing of the blasted rock. The quality and content of the regression models could also be enhanced by increasing the number of rockmass and blast design parameters and the volume of data considered in the regression analysis.
Abstract: The importance of efficient water quality monitoring and of government agencies ensuring the safety of drinking water cannot be overemphasized in areas where the resource is constantly depleted by anthropogenic or natural factors. This holds for West Texas, and for Midland and Odessa in particular. Two machine learning regression algorithms (Random Forest and XGBoost) were employed to develop models for the prediction of total dissolved solids (TDS) and sodium adsorption ratio (SAR) for efficient water quality monitoring of two vital aquifers: the Edwards-Trinity (Plateau) and Ogallala aquifers. These two aquifers have contributed immensely to providing water for domestic, agricultural, industrial and other uses. The data were obtained from the Texas Water Development Board (TWDB). The XGBoost and Random Forest models used in this study gave an accurate prediction of observed data (TDS and SAR) for both the Edwards-Trinity (Plateau) and Ogallala aquifers, with R² values consistently greater than 0.83. The Random Forest model gave a better prediction of TDS and SAR concentration, with an average R, MAE, RMSE and MSE of 0.977, 0.015, 0.029 and 0.00, respectively. For XGBoost, an average R, MAE, RMSE, and MSE of 0.953, 0.016, 0.037 and 0.00, respectively, were achieved. The overall performance of the models was impressive. This study shows that Random Forest and XGBoost are appropriate for water quality prediction and monitoring in an area of high hydrocarbon activity such as Midland and Odessa, and West Texas at large.
Abstract: Possible changes in the structure and seasonal variability of the subtropical ridge may lead to changes in the modes of rainfall variability over the Caribbean region. This creates additional difficulties for water resource planning, so obtaining seasonal prediction models that characterize these variations in detail is a concern, especially for island states. This research proposes the construction of statistical-dynamic models based on PCA regression methods. The predictand is accumulated monthly precipitation, while the six predictors are extracted from the ECMWF-SEAS5 ensemble mean forecasts with a lag of one month with respect to the target month. In constructing the models, two sequential training schemes are evaluated, and only the shorter one preserves the seasonal characteristics of the predictand. The evaluation metrics, which combine cell-point and dichotomous methodologies, suggest that the predictors related to sea surface temperatures do not adequately represent the seasonal variability of the predictand; however, others, such as the temperature at 850 hPa and the outgoing longwave radiation, are represented with a good approximation regardless of the model chosen. In this sense, the models built with the nearest neighbor methodology were the most efficient. Using the individual models with the best results, an ensemble is built that improves on the individual skill of its member models by correcting the underestimation of precipitation in the dynamic model during the wet season, although problems of overestimation persist for thresholds lower than 50 mm.
Abstract: In this paper, a logistic regression (LR) statistical analysis is presented for a set of variables from experimental measurements of the phenomenon commonly known as the “slinky mode” (SM) in reversed field pinch (RFP) machines, which is observed to travel around the torus in the Madison Symmetric Torus (MST). The LR analysis is used together with the modified Sine-Gordon dynamic equation model to predict with high confidence whether the slinky mode will lock or not, in comparison with its experimentally measured motion. It is observed that under certain conditions the slinky mode “locks” at or near the intersection of poloidal and/or toroidal gaps in MST. A locked mode ceases to travel around the torus, while an unlocked mode keeps traveling without a change in energy, which makes it hard to determine an exact set of conditions that predict locking or unlocking behaviour. The significant key model parameters determined by the LR analysis are shown to improve the Sine-Gordon model’s ability to determine the locking or unlocking of magnetohydrodynamic (MHD) modes. The LR analysis of measured variables provides high confidence in anticipating locking versus unlocking of the slinky mode, as shown by comparisons between simulations and the experimentally measured motion of the slinky mode in MST.
Funding: National Social Science Fund Project “Research on the Operational Risks and Prevention of Government Procurement of Community Services Project System” (Project No. 21CSH018); Research and Application of SDM Cigarette Supply Strategy Based on Consumer Data Analysis (Project No. 2023ASXM07).
Abstract: This study aims to analyze and predict the relationship between the average price per box in the cigarette market of City A and government procurement, providing a scientific basis and support for decision-making. By reviewing relevant theories and literature, qualitative prediction methods, regression prediction models, and other related theories were explored. Through the analysis of annual cigarette sales data and government procurement data in City A, a comprehensive understanding of the development of the tobacco industry and the economic trends of tobacco companies in the county was obtained. By predicting and analyzing the average price per box of cigarette sales across different years, corresponding prediction results were derived and compared with actual sales data. The prediction results indicate that the correlation coefficient between the average price per box of cigarette sales and government procurement is 0.982, implying that government procurement accounts for 96.4% of the changes in the average price per box of cigarettes. These findings offer an in-depth exploration of the relationship between the average price per box of cigarettes in City A and government procurement, providing a scientific foundation for corporate decision-making and market operations.
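The two figures quoted in this abstract are linked by a short calculation: assuming a simple linear regression with a single predictor (the abstract does not spell out the model), the share of variance explained equals the square of the correlation coefficient. A minimal check:

```python
r = 0.982                  # correlation coefficient reported in the abstract
r_squared = r ** 2         # coefficient of determination for one predictor
print(round(r_squared, 3))
```

The result rounds to 0.964, which matches the 96.4% stated above.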
Abstract: The mortality rate of gastric cancer in China is about 20.93 per 100,000, the highest of any malignancy in the country. Researchers in China are at present interested in studying postoperative survival models by multivariate analysis methods such as stepwise regression. The proportional hazards model introduced by Cox (1972) is more advanced than other regression methods: it does not need to assume a distribution for the survival time, and it readily handles censored data, which are difficult to analyse with other methods. This paper presents the first application of the Cox model to the survival analysis of gastric cancer in China. The survival analysis system (SAS-Ⅰ) software compiled by the author includes multivariate analysis by the Cox model, PV analysis and estimation of the survival function, which can provide useful information to surgeons for the treatment of cancer patients.
Abstract: Selecting the most cost-effective parameters of hard rock surface mining requires consideration of all alternative variants of mine design and of the conflicting effects of their parameters on cost. This could be achieved with a mathematical model of the cumulative influence of rockmass and mine design variables on the overall cost per tonne of the hard rock drilled, blasted, hauled and primary crushed. Available works on the topic have mostly treated the four processes of hard rock surface mining separately. This paper covers the theoretical part of a research project proposed to make the selection of hard rock surface mining design parameters more effective, based on a regression model of the overall cost per tonne of rock mined fitted to the determinant variations of rockmass and mine design. The regression model could be developed from the statistical data generated by the many hard rock surface mines operating worldwide under variable conditions of rockmass and mine design. In addition, a general algorithm based on the regression model has been formulated for the development of software for computer-aided selection of the most cost-effective parameters of hard rock surface mining.
Abstract: Under-fitting problems usually occur in regression models for dam safety monitoring. To overcome the local convergence of the regression, a genetic algorithm (GA) is proposed that uses real parameter coding, a ranking selection operator, an arithmetical crossover operator and a uniform mutation operator, with the least-squares error between the observed and computed values as its fitness function. An elitist strategy is used to improve the speed of convergence. The modified genetic algorithm is then applied to re-estimate the coefficients of the regression model, and a genetic regression model is set up. As an example, a slotted gravity dam in the Northeast of China is considered. The computational results show that the genetic regression model can solve the under-fitting problems well.
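The operators named in this abstract can be sketched as follows. This is an illustrative toy, not the authors' code: the straight-line model, the synthetic data, the linear ranking weights, and the parameter values (`pop_size`, `generations`, `bounds`, `p_mut`) are all assumptions.

```python
import random

def sse(coef, xs, ys):
    """Least-squares error of the line y = a + b*x (lower is fitter)."""
    a, b = coef
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def rank_select(ranked, rng):
    """Ranking selection: the best of n gets weight n, the worst gets 1."""
    n = len(ranked)
    weights = [n - i for i in range(n)]
    return rng.choices(ranked, weights=weights, k=1)[0]

def evolve(xs, ys, pop_size=40, generations=60, bounds=(-10.0, 10.0),
           p_mut=0.1, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    # Real parameter coding: each individual is a list of floats [a, b].
    pop = [[rng.uniform(lo, hi) for _ in range(2)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(pop, key=lambda c: sse(c, xs, ys))
        history.append(sse(ranked[0], xs, ys))
        children = [ranked[0][:]]            # elitism: keep the best unchanged
        while len(children) < pop_size:
            p1 = rank_select(ranked, rng)
            p2 = rank_select(ranked, rng)
            w = rng.random()
            # Arithmetical crossover: convex combination of the parents.
            child = [w * g1 + (1 - w) * g2 for g1, g2 in zip(p1, p2)]
            # Uniform mutation: redraw a gene anywhere within the bounds.
            child = [rng.uniform(lo, hi) if rng.random() < p_mut else g
                     for g in child]
            children.append(child)
        pop = children
    best = min(pop, key=lambda c: sse(c, xs, ys))
    history.append(sse(best, xs, ys))
    return best, history

# Synthetic observations from y = 2 + 3x (hypothetical data).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 + 3.0 * x for x in xs]
best, history = evolve(xs, ys)
print(best, history[-1])
```

Because the elite individual is copied into every new generation, the best error recorded in `history` can never increase from one generation to the next.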
Funding: The National Natural Science Foundation of China (No. 51106025, 51106027, 51036002); Specialized Research Fund for the Doctoral Program of Higher Education (No. 20130092110061); the Youth Foundation of Nanjing Institute of Technology (No. QKJA201303).
Abstract: A fuzzy observations-based radial basis function neural network (FORBFNN) is presented for modeling nonlinear systems in which the observations of the response are imprecise but can be represented as fuzzy membership functions. In the FORBFNN model, the weight coefficients of nodes in the hidden layer are identified by using the fuzzy expectation-maximization (EM) algorithm, whereas the optimal number of these nodes, as well as the centers and widths of the radial basis functions, are automatically constructed by using a data-driven method. Namely, the method starts with an initial node, and then new nodes are added to the hidden layer according to some rules. This procedure continues until the model meets the preset requirements. The method considers both the accuracy and the complexity of the model. Numerical simulation results show that the modeling method is effective and that the established model has high prediction accuracy.
Funding: The Fundamental Research Funds for the Central Universities (No. JUDCF12027, JUSRP51323B); the Scientific Innovation Research of College Graduates in Jiangsu Province (No. CXLX12_0734).
Abstract: The conventional single-model strategy may be ill-suited to batch processes because of the multiplicity of operation phases and system uncertainty. A novel global-local discriminant analysis (GLDA) based Gaussian process regression (GPR) approach is developed for the quality prediction of nonlinear and multiphase batch processes. After the collected data are preprocessed through batchwise unfolding, the hidden Markov model (HMM) is applied to identify different operation phases. A GLDA algorithm is also presented to extract the process variables highly correlated with the quality variables, decreasing the complexity of modeling. Multiple local GPR models are then built in the reduced-dimensional space for all the identified operation phases. Furthermore, HMM-based state estimation is used to classify each measurement sample of a test batch into the corresponding phase by maximum likelihood estimation, so that the local GPR model for that specific phase is selected for online prediction. The effectiveness of the proposed prediction approach is demonstrated on the multiphase penicillin fermentation process. The comparison results show that the proposed GLDA-GPR approach is superior to the regular GPR model and to the GPR based on HMM (HMM-GPR) model.
Funding: Supported by the Natural Science Foundation of Anhui Education Committee.
Abstract: In this paper, based on the theory of parameter estimation, we give a selection method and argue that it is very reasonable in the sense that it yields parameter estimates with good properties. Moreover, we offer a method for calculating the selection statistic, together with an applied example.
Funding: Supported by the National Natural Science Foundation of China (71071077); the Ministry of Education Key Project of National Educational Science Planning (DFA090215); China Postdoctoral Science Foundation (20100481137); Funding of Jiangsu Innovation Program for Graduate Education (CXZZ11-0226).
Abstract: The method of constructing the background value in the original multi-variable grey model (MGM(1,m)) is improved by addressing the source of its construction errors. The MGM(1,m) with optimized background value is used to eliminate the random fluctuations or errors in the observational data of all variables, and a combined prediction model with multiple linear regression is established in order to improve simulation and prediction accuracy. Finally, a combined model of the MGM(1,2) with optimized background value and binary linear regression is constructed for an example. The results show that the model performs well in simulation and prediction.
Funding: Project supported by the National Natural Science Foundation of China (Grant No 60573065); the Natural Science Foundation of Shandong Province, China (Grant No Y2007G33); the Key Subject Research Foundation of Shandong Province, China (Grant No XTD0708).
Abstract: In this paper we apply the nonlinear time series analysis method to small-time scale traffic measurement data. The prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, the local support vector machine prediction method is used to predict the traffic measurement data, and the BIC-based neighbouring point selection method is used to choose the number of nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method with optimized neighbouring points can effectively predict the small-time scale traffic measurement data and can reproduce the statistical features of real traffic measurements.
Abstract: This paper presents a semiparametric adjustment method suitable for general cases. Assuming that the regularizer matrix is positive definite, the calculation method is discussed and the corresponding formulae are presented. Finally, a simulated adjustment problem is constructed to illustrate the method given in this paper. The results from the semiparametric model and the G-M model are compared. The results demonstrate that the model errors or the systematic errors of the observations can be detected correctly with the semiparametric estimation method.
Funding: Supported by the Project of the 12th Five-year National Sci-Tech Support Plan of China (2011BAK12B09); China Special Project of Basic Work of Science and Technology (2011FY110100-2).
Abstract: The Bailongjiang watershed in southern Gansu province is one of the most landslide-prone regions in China, characterized by a very high frequency of landslide occurrence. To predict landslide occurrence, a comprehensive landslide susceptibility map is required, which may significantly help in reducing loss of property and human life. In this study, an integrated model of the information value method and logistic regression is proposed that exploits their respective strengths and offsets their weaknesses, which may enhance the precision and accuracy of landslide susceptibility assessment. A detailed and reliable landslide inventory with 1587 landslides was prepared and randomly divided into two groups: (i) a training dataset and (ii) a testing dataset. Eight distinct landslide conditioning factors, including lithology, slope gradient, aspect, elevation, distance to drainages, distance to faults, distance to roads and vegetation coverage, were selected for landslide susceptibility mapping. The resulting landslide susceptibility maps were validated by the success rate and prediction rate curves. The validation results show that the success rate and the prediction rate of the integrated model are 81.7% and 84.6%, respectively, which indicates that the proposed integrated method reliably produces an accurate landslide susceptibility map and that the results may be used for landslide management and mitigation.
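One common way to combine the two methods named in this abstract is to compute an information value for each class of a conditioning factor and feed those values into a logistic function. The sketch below assumes that scheme; the abstract does not give the formulas, and the study area, class counts, coefficients and intercept are hypothetical. Only the inventory size of 1587 comes from the text.

```python
import math

TOTAL_LANDSLIDES = 1587        # inventory size reported in the abstract
TOTAL_AREA_KM2 = 10000.0       # hypothetical study area

def information_value(n_class, area_class,
                      n_total=TOTAL_LANDSLIDES, area_total=TOTAL_AREA_KM2):
    """Log ratio of landslide density in a class to the overall density."""
    return math.log((n_class / area_class) / (n_total / area_total))

def susceptibility(info_values, coefs, intercept):
    """Logistic regression probability using information values as inputs."""
    z = intercept + sum(c * v for c, v in zip(coefs, info_values))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical slope-gradient classes: (landslide count, area in km^2).
iv_steep = information_value(600, 2000.0)   # denser than average, so positive
iv_flat = information_value(100, 3000.0)    # sparser than average, so negative

# Hypothetical fitted coefficients for two factors at one map cell.
p = susceptibility([iv_steep, 0.2], coefs=[1.1, 0.7], intercept=-0.5)
print(iv_steep, iv_flat, p)
```

A positive information value marks a class where landslides are over-represented relative to its area, so it raises the modelled probability when its coefficient is positive.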