The extended kernel ridge regression (EKRR) method with odd-even effects was adopted to improve the description of the nuclear charge radius using five commonly used nuclear models. These are: (i) the isospin-dependent A^(1/3) formula, (ii) relativistic continuum Hartree-Bogoliubov (RCHB) theory, (iii) the Hartree-Fock-Bogoliubov (HFB) model HFB25, (iv) the Weizsäcker-Skyrme (WS) model WS*, and (v) the HFB25* model. In the last two models, the charge radii were calculated using a five-parameter formula with the nuclear shell corrections and deformations obtained from the WS and HFB25 models, respectively. For each model, the resultant root-mean-square deviation for the 1014 nuclei with proton number Z ≥ 8 can be significantly reduced to 0.009-0.013 fm after considering the modification with the EKRR method. The best among them was the RCHB model, with a root-mean-square deviation of 0.0092 fm. The extrapolation abilities of the KRR and EKRR methods for the neutron-rich region were examined, and it was found that after considering the odd-even effects the extrapolation power was improved compared with that of the original KRR method. The strong odd-even staggering of the nuclear charge radii of Ca and Cu isotopes and the abrupt kinks across the neutron N = 126 and 82 shell closures were also calculated and could be reproduced quite well by calculations using the EKRR method.
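As a rough illustration of the kernel ridge regression machinery underlying the EKRR approach (not the authors' trained model: the nuclide coordinates, residuals, and kernel hyperparameters below are synthetic stand-ins), a plain-NumPy sketch of learning a model residual from (Z, N)-like inputs:

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=2.0):
    # Pairwise squared distances between rows of X1 and X2
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def krr_fit(X, y, lam=1e-3, sigma=2.0):
    # Ridge-regularised solve: alpha = (K + lam*I)^-1 y
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, sigma=2.0):
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 2))     # stand-in for (Z, N) coordinates
residual = 0.01 * np.sin(X[:, 0])        # stand-in model residual, in fm
alpha = krr_fit(X, residual)
pred = krr_predict(X, alpha, X)
rmse = np.sqrt(np.mean((pred - residual) ** 2))
```

The trained correction `pred` would then be added back to the underlying model's radii; the paper's EKRR additionally encodes odd-even effects in the kernel, which this sketch omits.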
The development of prediction supports is a critical step in information systems engineering in this era defined by the knowledge economy, the hub of which is big data. Currently, the lack of a predictive model, whether qualitative or quantitative, depending on a company's areas of intervention can handicap or weaken its competitive capacities, endangering its survival. In terms of quantitative prediction, depending on the efficacy criteria, a variety of methods and/or tools are available. The multiple linear regression method is one of the methods used for this purpose. A linear regression model is a regression model of an explained variable on one or more explanatory variables in which the function that links the explanatory variables to the explained variable has linear parameters. The purpose of this work is to demonstrate how to use multiple linear regressions, which is one aspect of decisional mathematics. The use of multiple linear regressions on random data, which can be replaced by real data collected by or from organizations, provides decision makers with reliable data knowledge. As a result, machine learning methods can provide decision makers with relevant and trustworthy data. The main goal of this article is therefore to define the objective function on which the influencing factors for its optimization will be defined using the linear regression method.
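The multiple-regression-on-random-data workflow the abstract describes can be sketched as follows (synthetic predictors stand in for real company data; the coefficients are illustrative, not from the article):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
X = rng.normal(size=(n, 3))                      # three explanatory variables
true_beta = np.array([2.0, -1.0, 0.5])
y = 4.0 + X @ true_beta + rng.normal(scale=0.1, size=n)  # explained variable

# Ordinary least squares via a design matrix with an intercept column
A = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

# Coefficient of determination for the fitted model
r2 = 1 - np.sum((y - A @ beta_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```

Swapping the simulated `X` and `y` for observed organizational data is the only change needed to apply the same fit in practice.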
It remains challenging to effectively estimate the remaining capacity of the secondary lithium-ion batteries that have been widely adopted for consumer electronics, energy storage, and electric vehicles. Herein, by integrating regular real-time current short pulse tests with a data-driven Gaussian process regression algorithm, an efficient battery estimation has been successfully developed and validated for batteries with capacity ranging from 100% of the state of health (SOH) to below 50%, reaching an average accuracy as high as 95%. Interestingly, the proposed pulse test strategy for battery capacity measurement could reduce test time by more than 80% compared with regular long charge/discharge tests. The short-term features of the current pulse test were selected for an optimal training process. Data at different voltage stages and states of charge (SOC) are collected and explored to find the most suitable estimation model. In particular, we explore the validity of five different machine-learning methods for estimating capacity driven by pulse features, of which Gaussian process regression with a Matern kernel performs the best, providing guidance for future exploration. The new strategy of combining short pulse tests with machine-learning algorithms could further open a window for efficiently forecasting the remaining capacity of lithium-ion batteries.
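A plain-NumPy sketch of Gaussian process regression with a Matern-5/2 kernel, the general technique named in the abstract, not the paper's trained SOH estimator; the pulse feature, SOH values, and hyperparameters below are invented:

```python
import numpy as np

def matern52(x1, x2, length=1.5):
    # Matern-5/2 kernel: (1 + a + a^2/3) * exp(-a), a = sqrt(5)*|d|/length
    a = np.sqrt(5.0) * np.abs(x1[:, None] - x2[None, :]) / length
    return (1.0 + a + a ** 2 / 3.0) * np.exp(-a)

rng = np.random.default_rng(1)
x_train = np.sort(rng.uniform(0, 5, 40))          # synthetic pulse feature
y_train = 100.0 - 8.0 * x_train + rng.normal(scale=0.5, size=40)  # SOH, %

noise = 0.25                                      # observation noise variance
K = matern52(x_train, x_train) + noise * np.eye(40)
x_test = np.linspace(0.5, 4.5, 20)

# GP posterior mean on centred targets: m + K* (K + noise*I)^-1 (y - m)
y_mean = y_train.mean()
alpha = np.linalg.solve(K, y_train - y_mean)
pred = y_mean + matern52(x_test, x_train) @ alpha
err = np.max(np.abs(pred - (100.0 - 8.0 * x_test)))
```

In a library setting the same model is one line with scikit-learn's `GaussianProcessRegressor(kernel=Matern(nu=2.5))`, which also fits the hyperparameters fixed here by hand.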
Bailongjiang watershed in southern Gansu province, China, is one of the most landslide-prone regions in China, characterized by a very high frequency of landslide occurrence. In order to predict landslide occurrence, a comprehensive map of landslide susceptibility is required, which may be significantly helpful in reducing loss of property and human life. In this study, an integrated model of the information value method and logistic regression is proposed by using their merits at maximum and overcoming their weaknesses, which may enhance the precision and accuracy of landslide susceptibility assessment. A detailed and reliable landslide inventory with 1587 landslides was prepared and randomly divided into two groups: (i) a training dataset and (ii) a testing dataset. Eight distinct landslide conditioning factors including lithology, slope gradient, aspect, elevation, distance to drainages, distance to faults, distance to roads and vegetation coverage were selected for landslide susceptibility mapping. The produced landslide susceptibility maps were validated by the success rate and prediction rate curves. The validation results show that the success rate and the prediction rate of the integrated model are 81.7% and 84.6%, respectively, which indicates that the proposed integrated method is reliable for producing an accurate landslide susceptibility map, and the results may be used for landslide management and mitigation.
In this study, a multivariate local quadratic polynomial regression (MLQPR) method is proposed to design a model for the sludge volume index (SVI). In MLQPR, a quadratic polynomial regression function is established to describe the relationship between SVI and the relevant variables, and the important terms of the quadratic polynomial regression function are determined by a significance test of the corresponding coefficients. Moreover, a local estimation method is introduced to adjust the weights of the quadratic polynomial regression function to improve the model accuracy. Finally, the proposed method is applied to predict the SVI values in a real wastewater treatment process (WWTP). The experimental results demonstrate that the proposed MLQPR method has faster testing speed and more accurate results than some existing methods.
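The idea of a locally weighted quadratic fit can be sketched with a generic one-dimensional local-polynomial smoother (this is the general technique, not the authors' exact MLQPR procedure; the data are synthetic, not SVI measurements):

```python
import numpy as np

def local_quadratic_predict(x0, x, y, bandwidth=0.6):
    # Gaussian weights centred on the query point x0
    w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
    # Quadratic basis in (x - x0): intercept is the fitted value at x0
    A = np.column_stack([np.ones_like(x), x - x0, (x - x0) ** 2])
    W = np.diag(w)
    # Weighted least squares: beta = (A^T W A)^-1 A^T W y
    beta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
    return beta[0]

rng = np.random.default_rng(3)
x = np.linspace(0, 4, 80)
y = np.sin(x) + rng.normal(scale=0.05, size=80)   # noisy target
fit = np.array([local_quadratic_predict(x0, x, y) for x0 in x])
rmse = np.sqrt(np.mean((fit - np.sin(x)) ** 2))
```

Re-solving a small weighted least-squares problem at every query point is what lets the quadratic's coefficients adapt locally, which is the effect the abstract attributes to the local estimation step.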
A Mobile Ad-hoc NETwork (MANET) contains numerous mobile nodes, and it forms a structure-less network associated with wireless links. Node movement is a key feature of MANETs; hence, rapid node movement can lead to link failure. Such link failures create more data packet drops, which can cause long delays. As a result, measuring accurate link failure time is a key factor in MANETs. This paper presents a Fuzzy Linear Regression Method to measure Link Failure (FLRLF) and provide an optimal route in the MANET-Internet of Things (IoT). This work aims to predict link failure and improve routing efficiency in MANETs. The Fuzzy Linear Regression Method (FLRM) measures the long-lifespan link based on the link failure. The mobile node group is built using the Received Signal Strength (RSS). The Hill Climbing (HC) method selects the Group Leader (GL) based on node mobility, node degree and node energy. Additionally, a Data Gathering node forwards the information from a GL to the sink node through multiple GLs. The GL is identified by linking lifespan and energy using the Particle Swarm Optimization (PSO) algorithm. The simulation results demonstrate that the FLRLF approach increases the GL lifespan and minimizes the link failure time in the MANET.
Deformation modulus of rock mass is one of the input parameters in most rock engineering designs and constructions. The field tests for determination of deformation modulus are cumbersome, expensive and time-consuming. This has prompted the development of various regression equations to estimate deformation modulus from the results of rock mass classifications, with the rock mass rating (RMR) being one of the frequently used classifications. The regression equations are of different types, ranging from linear to nonlinear functions such as power and exponential. The Bayesian method has recently been developed to incorporate regression equations into a Bayesian framework to provide better estimates of geotechnical properties. The question of whether the Bayesian method improves the estimation of geotechnical properties in all circumstances remains open. Therefore, a comparative study was conducted to assess the performances of regression and Bayesian methods when they are used to characterize deformation modulus from the same set of RMR data obtained from two project sites. The study also investigated the performance of different types of regression equations in estimation of the deformation modulus. Statistics, probability distributions and prediction indicators were used to assess the performances of regression and Bayesian methods and different types of regression equations. It was found that power and exponential types of regression equations provide a better estimate than linear regression equations. In addition, it was discovered that the ability of the Bayesian method to provide better estimates of deformation modulus than the regression method depends on the quality and quantity of input data as well as the type of the regression equation.
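The comparison of regression-equation types can be illustrated with a toy fit. The synthetic RMR-modulus data below are generated from an assumed exponential law with multiplicative noise (they are not the paper's field data, and the coefficients are invented):

```python
import numpy as np

rng = np.random.default_rng(7)
rmr = rng.uniform(20, 80, 60)
# Assumed "true" relation: E = 0.5 * exp(0.05 * RMR), with lognormal noise
E = 0.5 * np.exp(0.05 * rmr) * np.exp(rng.normal(scale=0.05, size=60))

def rmse(y, yhat):
    return np.sqrt(np.mean((y - yhat) ** 2))

# Linear form: E = a + b*RMR
err_lin = rmse(E, np.polyval(np.polyfit(rmr, E, 1), rmr))
# Power form via log-log least squares: ln E = ln a + b*ln RMR
err_pow = rmse(E, np.exp(np.polyval(
    np.polyfit(np.log(rmr), np.log(E), 1), np.log(rmr))))
# Exponential form via log-linear least squares: ln E = ln a + b*RMR
err_exp = rmse(E, np.exp(np.polyval(
    np.polyfit(rmr, np.log(E), 1), rmr)))
```

On data generated from an exponential law the exponential form naturally wins; the paper's finding is the empirical analogue of this on real RMR data.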
This paper transforms fuzzy numbers into crisp numbers using the centroid method, so that the traditional linear regression model transformed from the fuzzy linear regression model can be studied. The model's inputs and outputs are fuzzy numbers, and the regression coefficients are crisp numbers. This paper considers parameter estimation and influence analysis based on data deletion. Through the study of an example and comparison with other models, it can be concluded that the model in this paper is easy to apply and performs better.
According to the appearance of an isosbestic point in the absorption spectra of Ho/Y-Tribromoarsenazo (TBA) systems, the complexation reaction is considered to be M + nL = ML_n. A method based on this has been proposed for calculating the mole fraction of free complexing agent in the solutions from spectral data, and two linear regression formulas have been introduced to determine the composition, the molar absorptivity, the conditional stability constant of the complex and the concentration of the complexing agent. This method has been used in the Ho-TBA and Y-TBA systems. Ho^(3+) and Y^(3+) react with TBA and form 1:2 complexes in HCl-NaAc buffer solution at pH 3.80. Their molar absorptivities were determined to be 1.03×10^8 and 1.10×10^8 cm^2·mol^(-1), and the conditional stability constants (log β_2) are 11.37 and 11.15, respectively. After considering the pH effect in TBA complexing, their stability constants (log β_2^(abs)) are 43.23 and 43.01, respectively. The new method is adaptable to systems where the accurate concentration of the complexing agent cannot be known conveniently.
A new method, the dual-series linear regression method, has been used to study the complexation equilibrium of praseodymium (Pr^(3+)) with tribromoarsenazo (TBA) without knowing the accurate concentration of the complexing agent TBA. In 1.2 mol/L HCl solution, Pr^(3+) reacts with TBA and forms a 1:3 complex; the conditional stability constant (lg β_3) of the complex was determined to be 15.47, and its molar absorptivity (ε_3^(630)) is 1.48×10^5 L·mol^(-1)·cm^(-1).
Based on the model structure of the influence coefficient method, analyzed in depth using matrix theory, the reason is explained why the least squares (LS) influence coefficient method yields unreasonable and unstable correction masses with larger MSE when correlated planes exist in dynamic balancing. A new ridge regression method for solving the correction masses is presented according to Tikhonov regularization theory, and the reason why ridge regression eliminates the disadvantage of the LS method is described. Applying this new method to the dynamic balancing of a gas turbine, it is found that the method is superior to the LS method when the influence coefficient matrix is ill-conditioned: minimal correction masses and residual vibration are obtained in the dynamic balancing of rotors.
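Why Tikhonov (ridge) regularization stabilizes the correction masses when balancing planes are correlated can be seen in a toy example: two nearly identical columns make the influence coefficient matrix ill-conditioned, plain least squares blows up along the near-null direction, and the ridge term suppresses it. The matrix and masses below are synthetic, not measured rotor data:

```python
import numpy as np

rng = np.random.default_rng(11)
# Two nearly identical columns mimic correlated balancing planes
c1 = rng.normal(size=6)
A = np.column_stack([c1, c1 + 1e-4 * rng.normal(size=6)])
m_true = np.array([1.0, 1.0])                 # "true" correction masses
b = A @ m_true + 1e-3 * rng.normal(size=6)    # measured vibrations + noise

# Least squares amplifies noise along the near-null direction
m_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

# Ridge: minimise ||A m - b||^2 + lam ||m||^2 -> (A^T A + lam I) m = A^T b
lam = 1e-3
m_ridge = np.linalg.solve(A.T @ A + lam * np.eye(2), A.T @ b)
```

With `lam` chosen above the square of the small singular value but well below the large one, the dominant (physical) component of the solution is essentially untouched while the unstable component is damped, which is the mechanism the abstract credits for the smaller correction masses.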
The development of defect prediction plays a significant role in improving software quality. Such predictions are used to identify defective modules before testing and to minimize time and cost. Software with defects negatively impacts operational costs and ultimately affects customer satisfaction. Numerous approaches exist to predict software defects, but timely and accurate prediction remains a major challenge. To improve timely and accurate software defect prediction, a novel technique called Nonparametric Statistical feature scaled QuAdratic regressive convolution Deep nEural Network (SQADEN) is introduced. The proposed SQADEN technique mainly includes two major processes, namely metric or feature selection and classification. First, SQADEN uses the nonparametric statistical Torgerson-Gower scaling technique to identify the relevant software metrics by measuring similarity using the dice coefficient. The feature selection process is used to minimize the time complexity of software fault prediction. With the selected metrics, software faults are predicted with the help of Quadratic Censored regressive convolution deep neural network-based classification. The deep learning classifier analyzes the training and testing samples using the contingency correlation coefficient. The softstep activation function is used to provide the final fault prediction results. To minimize the error, the Nelder-Mead method is applied to solve non-linear least-squares problems. Finally, accurate classification results with a minimum error are obtained at the output layer. Experimental evaluation is carried out with different quantitative metrics such as accuracy, precision, recall, F-measure, and time complexity.
The analyzed results demonstrate the superior performance of our proposed SQADEN technique with maximum accuracy, sensitivity and specificity by 3%, 3%, 2% and 3% and minimum time and space by 13% and 15% when compared with the two state-of-the-art methods.
The purpose of this study was to examine the burnout levels of research assistants at Ondokuz Mayis University and to examine, in terms of validity and generalizability, the results of a multiple linear regression model based on the results obtained from the Maslach Burnout Scale with the Jackknife Method. To do this, a questionnaire was given to 11 research assistants working at Ondokuz Mayis University, and the burnout scores from this questionnaire were taken as the dependent variable of the multiple linear regression model. The variable of burnout was explained with the variables of age, weekly hours of classes taught, monthly average credit card debt, numbers of published articles and reports, gender, marital status, number of children and the departments of the research assistants. Dummy variables were assigned to the variables of gender, marital status, number of children and the departments of the research assistants, and thus they were made quantitative. The significance of the model resulting from multiple linear regression was examined through the backward elimination method. After this, for the five explanatory variables which influenced the variable of burnout, standardized model coefficients and coefficients of determination, together with 95% confidence intervals of these values, were estimated through the Jackknife Method, and the generalizability of the parameter estimation results of these variables to the population was investigated.
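The jackknife procedure for a regression coefficient can be sketched generically: leave one observation out at a time, refit, and build a confidence interval from the spread of the leave-one-out estimates. The data below are synthetic stand-ins, not the survey responses used in the study:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 40
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=n)

def slope(x, y):
    # OLS slope of y on x (with intercept)
    A = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(A, y, rcond=None)[0][1]

full = slope(x, y)
# Leave-one-out (jackknife) replicates of the slope
loo = np.array([slope(np.delete(x, i), np.delete(y, i)) for i in range(n)])
# Jackknife standard error
se = np.sqrt((n - 1) / n * np.sum((loo - loo.mean()) ** 2))
ci = (full - 1.96 * se, full + 1.96 * se)
```

The same leave-one-out loop applies unchanged to standardized coefficients or R², which is how the study obtains its 95% intervals.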
In this research, the result of cloud seeding over Yazd province during the three months of February, March and April 1999 has been evaluated using the historical regression method. The rain gauges in Yazd province were selected as the target stations and the rain gauges of the neighboring provinces as the control stations. The rainfall averages for the three aforementioned months over 25 years (1973-1997) in all control and target stations were calculated. In the next step, the correlation between the rainfalls of the control and target stations was estimated at about 75%, which indicates consistency good enough to use historical regression. Then, through the obtained linear correlation equation between the control and target stations, the precipitation amount for February, March and April 1999 over the target region (Yazd province) was estimated at about 27.57 mm, while the observed amount was 34.23 mm. In fact, the precipitation increase of around 19.5% over Yazd province confirmed the success of this cloud seeding project.
In this paper, we propose a Fast Iteration Method for solving the mixture regression problem, which can be treated as model-based clustering. Compared to the EM algorithm, the proposed method is faster, more flexible and can solve mixture regression problems with different error distributions (i.e. Laplace and t distributions). Extensive numerical experiments show that our proposed method has better performance on random simulations and real data.
Taking the determination of micro-amounts of Fe by 1,10-phenanthroline spectrophotometry as an example, the application of the combinatorial measurement and regression analysis method in instrumental analysis is systematically introduced, covering methodological principles, operating steps and data processing, including: establishing the best linear equation of the calibration curve, establishing the best linear equation of the measurand, and calculating the best value of a concentration. The results showed that for the mean of three determinations, s = 0 μg/mL and RSD = 0. Results of preliminary applications are briefly introduced for basic instrumental analysis by atomic absorption spectrophotometry, ion-selective electrodes, coulometry and polarographic analysis, and are contrasted with the results of normal measurements.
This paper presents a technique for Medium Term Load Forecasting (MTLF) using a Particle Swarm Optimization (PSO) algorithm based on least squares regression methods to forecast the electric loads of the Jordanian grid for the year 2015. Linear, quadratic and exponential forecast models have been examined to perform this study and compared with the Auto Regressive (AR) model. MTLF models are influenced by the weather, which should be considered when predicting the future peak load demand in terms of months and weeks. The main contribution of this paper is the conduct of an MTLF study for Jordan on a weekly and monthly basis using real data obtained from the National Electric Power Company (NEPCO). This study aims to develop practical models and algorithmic techniques for MTLF to be used by the operators of the Jordan power grid. The results are compared with the actual peak load data to attain minimum percentage error. The values of the forecasted weekly and monthly peak loads obtained from these models are examined using the Least Square Error (LSE). Actual reported data from NEPCO are used to analyze the performance of the proposed approach, and the results are reported and compared with the results obtained from the PSO algorithm and the AR model.
Heteroscedasticity and multicollinearity are serious problems when they exist in econometrics data. These problems exist as a result of violating the assumptions of equal variance between the error terms and of independence between the explanatory variables of the model. With these assumption violations, the Ordinary Least Square Estimator (OLS) will not give the best linear unbiased, efficient and consistent estimator. In practice, there are several structures of heteroscedasticity and several methods of heteroscedasticity detection. For a better estimation result, the best heteroscedasticity detection method must be determined for any structure of heteroscedasticity in the presence of multicollinearity between the explanatory variables of the model. In this paper we examine the effects of multicollinearity on the type I error rates of some methods of heteroscedasticity detection in the linear regression model, in order to determine the best method of heteroscedasticity detection to use when both problems exist in the model. Nine heteroscedasticity detection methods were considered with seven heteroscedasticity structures. A simulation study was done via a Monte Carlo experiment on a multiple linear regression model with 3 explanatory variables. This experiment was conducted 1000 times with linear model parameters of β₀ = 4, β₁ = 0.4, β₂ = 1.5 and β₃ = 3.6. Five (5) levels of multicollinearity were combined with seven (7) different sample sizes. The methods' performances were compared with the aid of a set confidence interval (C.I.) criterion. Results showed that whenever multicollinearity exists in the model with any form of heteroscedasticity structure, the Breusch-Godfrey (BG) test is the best method to determine the existence of heteroscedasticity at all chosen levels of significance.
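One common auxiliary-regression test for heteroscedasticity, a Breusch-Pagan-style LM test, can be sketched as follows. This is shown for illustration of the detection idea, not necessarily the exact variant the paper evaluated; the simulation reuses the study's coefficient values β₀ = 4, β₁ = 0.4, β₂ = 1.5, β₃ = 3.6 as the true parameters, with error variance growing in the first regressor:

```python
import numpy as np

rng = np.random.default_rng(9)
n = 500
X = rng.uniform(1, 5, size=(n, 3))
beta = np.array([4.0, 0.4, 1.5, 3.6])          # intercept + 3 slopes
A = np.column_stack([np.ones(n), X])
sigma = 0.5 * X[:, 0]                          # heteroscedastic error scale
y = A @ beta + rng.normal(size=n) * sigma

# Step 1: OLS residuals of the main regression
b = np.linalg.lstsq(A, y, rcond=None)[0]
e = y - A @ b

# Step 2: regress squared residuals on the regressors; LM = n * R^2
g = np.linalg.lstsq(A, e ** 2, rcond=None)[0]
fitted = A @ g
r2 = 1 - np.sum((e ** 2 - fitted) ** 2) / np.sum((e ** 2 - np.mean(e ** 2)) ** 2)
lm = n * r2

# Compare with the chi-square(3) critical value at the 5% level (tabulated)
reject = lm > 7.815
```

Repeating this under homoscedastic errors and counting false rejections is exactly how the paper's type I error rates are estimated for each detection method.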
The fast spread of coronavirus disease (COVID-19) caused by SARS-CoV-2 has become a pandemic and a serious threat to the world. As of May 30, 2020, this disease had infected more than 6 million people globally, with hundreds of thousands of deaths. Therefore, there is an urgent need to predict confirmed cases so as to analyze the impact of COVID-19 and practice readiness in healthcare systems. This study uses gradient boosting regression (GBR) to build a trained model to predict the daily total confirmed cases of COVID-19. The GBR method can minimize the loss function of the training process and create a single strong learner from weak learners. Experiments are conducted on a dataset of daily confirmed COVID-19 cases from January 22, 2020, to May 30, 2020. The results are evaluated on a set of evaluation performance measures using 10-fold cross-validation to demonstrate the effectiveness of the GBR method. The results reveal that the GBR model achieves a root mean square error of 0.00686, the lowest among several comparative models.
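The fitting loop behind GBR, sequentially adding weak learners fit to the current residuals, can be sketched from scratch with depth-1 regression trees and squared loss. The growth-like curve below is synthetic, not the COVID-19 case series used in the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 10, 200))
y = np.exp(0.4 * x) + rng.normal(scale=1.0, size=200)  # growth-like target

def fit_stump(x, r):
    # Best single split minimising squared error on residuals r
    best = (np.inf, None, 0.0, 0.0)
    for s in np.quantile(x, np.linspace(0.05, 0.95, 19)):
        left, right = r[x <= s], r[x > s]
        if len(left) == 0 or len(right) == 0:
            continue
        err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if err < best[0]:
            best = (err, s, left.mean(), right.mean())
    return best[1:]

pred = np.full_like(y, y.mean())   # start from the constant model
lr = 0.1                           # learning rate (shrinkage)
ensemble = []
for _ in range(200):
    s, lv, rv = fit_stump(x, y - pred)      # weak learner on residuals
    pred += lr * np.where(x <= s, lv, rv)   # gradient step for squared loss
    ensemble.append((s, lv, rv))

rmse = np.sqrt(np.mean((pred - y) ** 2))
base = np.std(y)                   # error of the constant baseline
```

Library implementations such as scikit-learn's `GradientBoostingRegressor` follow the same residual-fitting loop with deeper trees and additional regularization.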
Compositional data, such as relative information, is a crucial aspect of machine learning and other related fields. It is typically recorded as closed data, i.e. data that sum to a constant, like 100%. The statistical linear model is the most used technique for identifying hidden relationships between underlying random variables of interest. However, data quality is a significant challenge in machine learning, especially when missing data is present. The linear regression model is a commonly used statistical modeling technique applied in various settings to find relationships between variables of interest. When estimating linear regression parameters, which are useful for things like future prediction and partial effects analysis of independent variables, maximum likelihood estimation (MLE) is the method of choice. However, many datasets contain missing observations, which can lead to costly and time-consuming data recovery. To address this issue, the expectation-maximization (EM) algorithm has been suggested as a solution for situations involving missing data. The EM algorithm repeatedly finds the best estimates of parameters in statistical models that depend on variables or data that have not been observed; this is called maximum likelihood or maximum a posteriori (MAP) estimation. Using the present estimate as input, the expectation (E) step constructs a log-likelihood function. Finding the parameters that maximize the anticipated log-likelihood, as determined in the E step, is the job of the maximization (M) step. This study examined how well the EM algorithm performed on a simulated compositional dataset with missing observations, using both the robust least squares and ordinary least squares regression techniques.
The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-Nearest Neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
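The E/M alternation described above can be sketched for a simple regression with missing responses. This is a deliberately minimal illustration with synthetic, non-compositional data, not the study's setup; note that with responses missing at random the imputed points lie exactly on the current fit, so the iteration converges to the complete-case estimate, which makes the mechanics easy to verify:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
x = rng.normal(size=n)
y_full = 1.5 + 2.0 * x + rng.normal(scale=0.5, size=n)
miss = rng.random(n) < 0.3                 # 30% of responses missing
y = np.where(miss, np.nan, y_full)

# Initialise the parameters from complete cases only
obs = ~miss
A_obs = np.column_stack([np.ones(obs.sum()), x[obs]])
a, b = np.linalg.lstsq(A_obs, y[obs], rcond=None)[0]

A = np.column_stack([np.ones(n), x])
for _ in range(20):
    # E-step: fill each missing y with its conditional mean a + b*x
    y_imp = np.where(miss, a + b * x, y)
    # M-step: refit the regression on the completed data
    a, b = np.linalg.lstsq(A, y_imp, rcond=None)[0]
```

With missing values in the predictors or in multivariate compositional data, as in the study, the E-step instead imputes from the conditional distribution given all observed components, but the alternation is the same.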
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 11875027, 11975096).
Funding: Supported by the Shenzhen Municipal Development and Reform Commission (Grant Number: SDRC [2016]172), the Shenzhen Science and Technology Program (Grant No. KQTD20170810150821146), the Interdisciplinary Research and Innovation Fund of Tsinghua Shenzhen International Graduate School, and Shanghai Shun Feng Machinery Co., Ltd.
Abstract: It remains challenging to effectively estimate the remaining capacity of the secondary lithium-ion batteries that have been widely adopted for consumer electronics, energy storage, and electric vehicles. Herein, by integrating regular real-time current short-pulse tests with a data-driven Gaussian process regression algorithm, an efficient battery estimation has been successfully developed and validated for batteries with capacities ranging from 100% of the state of health (SOH) to below 50%, reaching an average accuracy as high as 95%. Interestingly, the proposed pulse-test strategy for battery capacity measurement could reduce test time by more than 80% compared with regular long charge/discharge tests. The short-term features of the current pulse test were selected for an optimal training process. Data at different voltage stages and states of charge (SOC) were collected and explored to find the most suitable estimation model. In particular, we explore the validity of five different machine-learning methods for estimating capacity from pulse features, of which Gaussian process regression with a Matern kernel performs best, providing guidance for future exploration. The new strategy of combining short pulse tests with machine-learning algorithms opens a window for efficiently forecasting the remaining capacity of lithium-ion batteries.
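A minimal sketch of Gaussian process regression with a Matern(3/2) kernel, the model family the study found best; the feature and SOH values below are synthetic stand-ins, not pulse-test data, and the kernel hyperparameters are illustrative assumptions.

```python
import numpy as np

def matern32(a, b, length=0.3):
    # Matern(3/2) kernel: k(r) = (1 + sqrt(3) r/l) exp(-sqrt(3) r/l)
    r = np.abs(a[:, None] - b[None, :]) / length
    return (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)

def gp_mean(x_tr, y_tr, x_te, noise=1e-4):
    # Posterior mean: K(x_te, x_tr) (K(x_tr, x_tr) + noise*I)^(-1) y_tr
    K = matern32(x_tr, x_tr) + noise * np.eye(x_tr.size)
    return matern32(x_te, x_tr) @ np.linalg.solve(K, y_tr)

x = np.linspace(0.0, 1.0, 20)        # normalized pulse feature (synthetic)
y = 100.0 - 40.0 * x                 # synthetic SOH (%) decreasing with the feature
y0 = y - y.mean()                    # center so the zero-mean GP prior is sensible
pred = gp_mean(x, y0, np.array([0.5])) + y.mean()
print(pred[0])  # close to 80.0, the synthetic SOH at x = 0.5
```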
Funding: Supported by the Project of the 12th Five-Year National Sci-Tech Support Plan of China (2011BAK12B09) and the China Special Project of Basic Work of Science and Technology (2011FY110100-2).
Abstract: The Bailongjiang watershed in southern Gansu Province, China, is one of the most landslide-prone regions in China, characterized by a very high frequency of landslide occurrence. To predict landslide occurrence, a comprehensive landslide susceptibility map is required, which may be significantly helpful in reducing loss of property and human life. In this study, an integrated model of the information value method and logistic regression is proposed, using their merits to the fullest and overcoming their weaknesses, which may enhance the precision and accuracy of landslide susceptibility assessment. A detailed and reliable landslide inventory with 1587 landslides was prepared and randomly divided into two groups: (i) a training dataset and (ii) a testing dataset. Eight distinct landslide conditioning factors, including lithology, slope gradient, aspect, elevation, distance to drainages, distance to faults, distance to roads and vegetation coverage, were selected for landslide susceptibility mapping. The produced landslide susceptibility maps were validated by success-rate and prediction-rate curves. The validation results show that the success rate and the prediction rate of the integrated model are 81.7% and 84.6%, respectively, indicating that the proposed integrated method reliably produces an accurate landslide susceptibility map and that the results may be used for landslide management and mitigation.
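The logistic-regression half of such an integrated model can be sketched as follows; the two synthetic "conditioning factors" and the coefficients are our assumptions, not the study's lithology/slope/aspect inputs or its inventory.

```python
import numpy as np

# Logistic regression by plain gradient descent on synthetic data:
# P(landslide) = sigmoid(w0 + w1*x1 + w2*x2).
rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 2))                      # standardized conditioning factors
logit = -0.5 + 2.0 * X[:, 0] - 1.0 * X[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

A = np.column_stack([np.ones(n), X])
w = np.zeros(3)
for _ in range(2000):                            # gradient descent on the log-loss
    p = 1.0 / (1.0 + np.exp(-A @ w))
    w -= 0.1 * A.T @ (p - y) / n

susceptibility = 1.0 / (1.0 + np.exp(-A @ w))    # per-cell susceptibility score
print(w)  # roughly [-0.5, 2.0, -1.0]
```

Ranking cells by `susceptibility` is what produces the susceptibility map.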
Abstract: In this study, a multivariate local quadratic polynomial regression (MLQPR) method is proposed to design a model for the sludge volume index (SVI). In MLQPR, a quadratic polynomial regression function is established to describe the relationship between SVI and the relevant variables, and the important terms of the quadratic polynomial regression function are determined by significance tests of the corresponding coefficients. Moreover, a local estimation method is introduced to adjust the weights of the quadratic polynomial regression function to improve model accuracy. Finally, the proposed method is applied to predict SVI values in a real wastewater treatment process (WWTP). The experimental results demonstrate that the proposed MLQPR method has faster testing speed and more accurate results than several existing methods.
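The local-quadratic idea can be sketched in one dimension as a Gaussian-weighted quadratic fit around each query point; the smooth curve below is a synthetic stand-in for (input, SVI) data, and the bandwidth is an assumed tuning parameter.

```python
import numpy as np

def local_quadratic(x, y, x0, bandwidth=0.3):
    # Weight points near the query x0, then fit a quadratic by weighted
    # least squares (np.polyfit applies w inside the squared residual,
    # so sqrt(weights) gives standard WLS).
    w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
    coef = np.polyfit(x, y, deg=2, w=np.sqrt(w))
    return np.polyval(coef, x0)

x = np.linspace(0.0, 2.0, 50)
y = np.sin(x)                                 # smooth target curve
print(local_quadratic(x, y, 1.0))  # close to sin(1.0) ≈ 0.841
```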
Abstract: A Mobile Ad-hoc NETwork (MANET) contains numerous mobile nodes and forms a structureless network associated with wireless links. Node movement is the key feature of MANETs; hence, rapid node movement leads to link failures. These link failures cause more data packet drops, which can produce long delays. As a result, accurately measuring link failure time is a key factor in a MANET. This paper presents a Fuzzy Linear Regression Method to measure Link Failure (FLRLF) and provide an optimal route in the MANET-Internet of Things (IoT). This work aims to predict link failure and improve routing efficiency in MANETs. The Fuzzy Linear Regression Method (FLRM) identifies long-lifespan links based on predicted link failure. Mobile node groups are built using the Received Signal Strength (RSS). The Hill Climbing (HC) method selects the Group Leader (GL) based on node mobility, node degree and node energy. Additionally, a Data Gathering node forwards information from the GL to the sink node through multiple GLs. The GL is identified by link lifespan and energy using the Particle Swarm Optimization (PSO) algorithm. The simulation results demonstrate that the FLRLF approach increases the GL lifespan and minimizes link failure time in the MANET.
Abstract: The deformation modulus of a rock mass is an input parameter to most rock engineering designs and constructions. Field tests for determining the deformation modulus are cumbersome, expensive and time-consuming. This has prompted the development of various regression equations to estimate the deformation modulus from the results of rock mass classifications, with the rock mass rating (RMR) being one of the most frequently used classifications. The regression equations are of different types, ranging from linear functions to nonlinear functions such as power and exponential. The Bayesian method has recently been developed to incorporate regression equations into a Bayesian framework to provide better estimates of geotechnical properties. Whether the Bayesian method improves the estimation of geotechnical properties in all circumstances remains an open question. Therefore, a comparative study was conducted to assess the performance of the regression and Bayesian methods when used to characterize the deformation modulus from the same set of RMR data obtained from two project sites. The study also investigated the performance of different types of regression equations in estimating the deformation modulus. Statistics, probability distributions and prediction indicators were used to assess the performance of the regression and Bayesian methods and of the different types of regression equations. It was found that power and exponential regression equations provide better estimates than linear regression equations. In addition, it was discovered that the ability of the Bayesian method to provide better estimates of the deformation modulus than the regression method depends on the quality and quantity of the input data as well as on the type of regression equation.
Abstract: This paper transforms fuzzy numbers into crisp numbers using the centroid method, so that the traditional linear regression model obtained from the fuzzy linear regression model can be studied. The model's inputs and outputs are fuzzy numbers, and the regression coefficients are crisp numbers. This paper considers parameter estimation and influence analysis based on data deletion. Through a worked example and comparison with other models, it can be concluded that the model in this paper is easy to apply and performs better.
Abstract: According to the appearance of an isosbestic point in the absorption spectra of the Ho/Y-Tribromoarsenazo (TBA) systems, the complexation reaction is considered to be M + nL = ML_n. Based on this, a method has been proposed for calculating the mole fraction of free complexing agent in the solutions from spectral data, and two linear regression formulas have been introduced to determine the composition, the molar absorptivity, the conditional stability constant of the complex and the concentration of the complexing agent. This method has been applied to the Ho-TBA and Y-TBA systems. Ho^(3+) and Y^(3+) react with TBA and form 1:2 complexes in HCl-NaAc buffer solution at pH 3.80. Their molar absorptivities were determined to be 1.03×10^8 and 1.10×10^8 cm^2·mol^(-1), and the conditional stability constants (log β_2) are 11.37 and 11.15, respectively. After considering the pH effect in TBA complexing, their stability constants (log β_2^(abs)) are 43.23 and 43.01, respectively. The new method is applicable to systems where the accurate concentration of the complexing agent cannot be known conveniently.
Abstract: A new method, the dual-series linear regression method, has been used to study the complexation equilibrium of praseodymium (Pr^(3+)) with tribromoarsenazo (TBA) without knowing the accurate concentration of the complexing agent TBA. In 1.2 mol/L HCl solution, Pr^(3+) reacts with TBA and forms a 1:3 complex; the conditional stability constant (lg β_3) of the complex was determined to be 15.47, and its molar absorptivity (ε_3^(630)) is 1.48×10^5 L·mol^(-1)·cm^(-1).
Abstract: Based on an in-depth matrix-theory analysis of the model structure of the influence coefficient method, this paper explains why the least-squares (LS) influence coefficient method yields unreasonable and unstable correction masses with larger mean squared error when correlated balancing planes are present in dynamic balancing. It also presents a new ridge regression method for solving the correction masses according to Tikhonov regularization theory, and describes why ridge regression eliminates this disadvantage of the LS method. Applying the new method to the dynamic balancing of a gas turbine, it is found that the method is superior to the LS method when the influence coefficient matrix is ill-conditioned: minimal correction masses and residual vibration are obtained in the dynamic balancing of rotors.
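The contrast can be sketched with an illustrative two-plane example (made-up numbers, not gas turbine data): for a nearly singular influence coefficient matrix A, the least-squares correction masses blow up, while the Tikhonov (ridge) solution m = (AᵀA + kI)⁻¹Aᵀ(−v) stays moderate.

```python
import numpy as np

A = np.array([[1.00, 0.99],
              [0.99, 0.98]])                 # nearly collinear balancing planes
v = np.array([2.00, 1.97])                   # measured vibration vector

# Least squares via the normal equations: ill-conditioning gives huge masses.
m_ls = np.linalg.solve(A.T @ A, -A.T @ v)

# Ridge (Tikhonov) solution with regularization parameter k.
k = 1e-2
m_ridge = np.linalg.solve(A.T @ A + k * np.eye(2), -A.T @ v)

print(np.abs(m_ls).max(), np.abs(m_ridge).max())  # ~100 vs ~1
```

The ridge masses still nearly cancel the vibration (A·m ≈ −v) while remaining physically reasonable, which is the point the abstract makes.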
Abstract: The development of defect prediction plays a significant role in improving software quality. Such predictions are used to identify defective modules before testing and to minimize time and cost. Software with defects negatively impacts operational costs and ultimately affects customer satisfaction. Numerous approaches exist to predict software defects, but timely and accurate prediction remains a major challenge. To improve the timeliness and accuracy of software defect prediction, a novel technique called Nonparametric Statistical feature scaled QuAdratic regressive convolution Deep nEural Network (SQADEN) is introduced. The proposed SQADEN technique mainly includes two major processes, namely metric (feature) selection and classification. First, SQADEN uses the nonparametric statistical Torgerson-Gower scaling technique to identify the relevant software metrics, measuring similarity with the Dice coefficient. The feature selection process is used to minimize the time complexity of software fault prediction. With the selected metrics, software faults are predicted with the help of Quadratic Censored regressive convolution deep neural network-based classification. The deep learning classifier analyzes the training and testing samples using the contingency correlation coefficient. The softstep activation function is used to provide the final fault prediction results. To minimize the error, the Nelder-Mead method is applied to solve non-linear least-squares problems. Finally, accurate classification results with minimum error are obtained at the output layer. Experimental evaluation is carried out with different quantitative metrics such as accuracy, precision, recall, F-measure, and time complexity. The analyzed results demonstrate the superior performance of the proposed SQADEN technique, with maximum accuracy, sensitivity and specificity (by 3%, 3%, 2% and 3%) and minimum time and space (by 13% and 15%) compared with two state-of-the-art methods.
Abstract: The purpose of this study was to examine the burnout levels of research assistants at Ondokuz Mayis University and to examine, in terms of validity and generalizability, the results of a multiple linear regression model based on Maslach Burnout Scale scores using the jackknife method. To do this, a questionnaire was given to 11 research assistants working at Ondokuz Mayis University, and the burnout scores from this questionnaire were taken as the dependent variable of the multiple linear regression model. Burnout was explained by the variables of age, weekly hours of classes taught, average monthly credit card debt, numbers of published articles and reports, gender, marital status, number of children and department. Dummy variables were assigned to gender, marital status, number of children and department, making these variables quantitative. The significance of the multiple linear regression model was examined through the backward elimination method. Then, for the five explanatory variables that influenced burnout, the standardized model coefficients, coefficients of determination and 95% confidence intervals of these values were estimated through the jackknife method, and the generalizability of the parameter estimates of these variables to the population was investigated.
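An illustrative jackknife sketch (synthetic data, not the survey; the study's model has more regressors): leave-one-out re-estimates of a regression slope yield a jackknife standard error and a 95% confidence interval.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)

def slope(x, y):
    return np.polyfit(x, y, 1)[0]

# Leave-one-out (jackknife) replicates of the slope estimate.
loo = np.array([slope(np.delete(x, i), np.delete(y, i)) for i in range(n)])
est = slope(x, y)
se = np.sqrt((n - 1) / n * np.sum((loo - loo.mean()) ** 2))  # jackknife SE
print(est - 1.96 * se, est + 1.96 * se)  # 95% interval for the slope (true value 2.0)
```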
Abstract: In this research, the result of cloud seeding over Yazd Province during the three months of February, March and April 1999 was evaluated using the historical regression method. The rain gauges in Yazd Province were selected as the target stations and the rain gauges of the neighboring provinces as the control stations. The rainfall averages for the three aforementioned months over 25 years (1973-1997) at all control and target stations were calculated. In the next step, the correlation between the rainfall at the control and target stations was estimated at about 75%, which indicates consistency good enough to use historical regression. Then, through the obtained linear regression equation between the control and target stations, the precipitation for February, March and April 1999 over the target region (Yazd Province) was estimated at about 27.57 mm, while the observed amount was 34.23 mm. The precipitation increase of around 19.5% over Yazd Province confirmed the success of this cloud seeding project.
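The historical regression step can be sketched as follows: regress target-area rainfall on control-area rainfall over the historical period, then predict the seeded period's "natural" rainfall. The rainfall values are synthetic, not the Yazd records.

```python
import numpy as np

rng = np.random.default_rng(3)
control = rng.uniform(20.0, 60.0, size=25)   # 25 historical seasons (mm)
target = 0.8 * control + 5.0 + rng.normal(scale=2.0, size=25)

b, a = np.polyfit(control, target, 1)        # target ≈ b*control + a
predicted = b * 40.0 + a                     # seeded-season control rainfall = 40 mm
print(predicted)  # near 0.8*40 + 5 = 37 mm
```

Comparing `predicted` with the observed seeded-season rainfall gives the estimated seeding effect, the 19.5% figure in the study.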
Abstract: In this paper, we propose a Fast Iteration Method for solving the mixture regression problem, which can be treated as model-based clustering. Compared to the EM algorithm, the proposed method is faster, more flexible and can solve mixture regression problems with different error distributions (e.g. Laplace and t distributions). Extensive numerical experiments show that our proposed method performs better on random simulations and real data.
Abstract: Taking the determination of trace iron by 1,10-phenanthroline spectrophotometry as an example, the application of the combinatorial measurement and regression analysis method in instrumental analysis is systematically introduced in terms of methodological principle, operating steps and data processing, including: establishing the best linear equation of the calibration curve, establishing the best linear equation of the measurand, and calculating the best value of a concentration. The results showed that for the mean of three determinations, s = 0 μg/mL and RSD = 0. Results of preliminary applications in basic instrumental analysis for atomic absorption spectrophotometry, ion-selective electrodes, coulometry and polarographic analysis are briefly introduced and compared with the results of normal measurements.
Abstract: This paper presents a technique for Medium Term Load Forecasting (MTLF) using a Particle Swarm Optimization (PSO) algorithm based on least squares regression methods to forecast the electric loads of the Jordanian grid for the year 2015. Linear, quadratic and exponential forecast models are examined in this study and compared with the Auto Regressive (AR) model. MTLF models are influenced by the weather, which should be considered when predicting future peak load demand in terms of months and weeks. The main contribution of this paper is an MTLF study for Jordan on a weekly and monthly basis using real data obtained from the National Electric Power Company (NEPCO). The study aims to develop practical models and algorithmic techniques for MTLF to be used by the operators of the Jordan power grid. The results are compared with the actual peak load data to attain minimum percentage error. The forecasted weekly and monthly peak loads obtained from these models are evaluated using the Least Square Error (LSE). Actual reported data from NEPCO are used to analyze the performance of the proposed approach, and the results are reported and compared with the results obtained from the PSO algorithm and the AR model.
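The model comparison in this abstract can be sketched by fitting linear, quadratic and exponential least-squares models to a load series and scoring them by squared error; the load values below are synthetic, not NEPCO data.

```python
import numpy as np

t = np.arange(1, 25, dtype=float)            # 24 months
load = 1200.0 * np.exp(0.02 * t)             # synthetic growing peak load (MW)

fits = {
    "linear": np.polyval(np.polyfit(t, load, 1), t),
    "quadratic": np.polyval(np.polyfit(t, load, 2), t),
}
# Exponential model load = A*exp(B*t), linearized as log(load) = log(A) + B*t.
B, logA = np.polyfit(t, np.log(load), 1)
fits["exponential"] = np.exp(logA) * np.exp(B * t)

lse = {name: np.sum((load - f) ** 2) for name, f in fits.items()}
print(min(lse, key=lse.get))  # exponential fits this series best
```

With real load data, the model with the smallest LSE would be selected in the same way.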
Abstract: Heteroscedasticity and multicollinearity are serious problems when they exist in econometric data. These problems arise from violating the assumptions of equal variance of the error terms and of independence between the explanatory variables of the model. Under these violations, the Ordinary Least Squares (OLS) estimator is no longer the best linear unbiased, efficient and consistent estimator. In practice, there are several structures of heteroscedasticity and several methods for detecting it. For better estimation results, the best heteroscedasticity detection method must be determined for any structure of heteroscedasticity in the presence of multicollinearity between the explanatory variables of the model. In this paper, we examine the effects of multicollinearity on the type I error rates of several methods of heteroscedasticity detection in the linear regression model, in order to determine the best detection method to use when both problems exist in the model. Nine heteroscedasticity detection methods were considered with seven heteroscedasticity structures. A simulation study was carried out via a Monte Carlo experiment on a multiple linear regression model with three explanatory variables. The experiment was conducted 1000 times with linear model parameters β0 = 4, β1 = 0.4, β2 = 1.5 and β3 = 3.6. Five levels of multicollinearity were combined with seven different sample sizes. The methods' performance was compared with the aid of a set confidence interval (C.I.) criterion. Results showed that whenever multicollinearity exists in the model with any form of heteroscedasticity structure, the Breusch-Godfrey (BG) test is the best method for determining the existence of heteroscedasticity at all chosen levels of significance.
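One detection method of the kind compared here, the Breusch-Pagan(-Godfrey) LM test, can be sketched as follows: regress the squared OLS residuals on the regressors and use LM = n·R², which is approximately χ²(k) under homoscedasticity. The three-regressor design and β values mirror the simulation; the data themselves are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400
x1 = rng.uniform(0.5, 2.0, size=n)
X = np.column_stack([np.ones(n), x1, rng.normal(size=(n, 2))])
beta = np.array([4.0, 0.4, 1.5, 3.6])        # parameters from the study
y = X @ beta + x1 * rng.normal(size=n)       # error s.d. grows with x1 (heteroscedastic)

resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]   # OLS residuals
u = resid ** 2
u_hat = X @ np.linalg.lstsq(X, u, rcond=None)[0]       # auxiliary regression
r2 = 1.0 - np.sum((u - u_hat) ** 2) / np.sum((u - u.mean()) ** 2)
lm = n * r2
print(lm)  # compare with the chi2(3) 5% critical value, 7.81
```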
Funding: Financial support provided by the Deanship of Scientific Research at King Saud University, Research Group No. RG-1441-502.
Abstract: The fast spread of coronavirus disease (COVID-19), caused by SARS-CoV-2, has become a pandemic and a serious threat to the world. As of May 30, 2020, the disease had infected more than 6 million people globally, with hundreds of thousands of deaths. Therefore, there is an urgent need to predict confirmed cases so as to analyze the impact of COVID-19 and assess readiness in healthcare systems. This study uses gradient boosting regression (GBR) to build a trained model to predict the daily total confirmed cases of COVID-19. The GBR method minimizes the loss function during training and creates a single strong learner from weak learners. Experiments are conducted on a dataset of daily confirmed COVID-19 cases from January 22, 2020, to May 30, 2020. The results are evaluated on a set of performance measures using 10-fold cross-validation to demonstrate the effectiveness of the GBR method. The results reveal that the GBR model achieves a root mean square error of 0.00686, the lowest among several comparative models.
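A minimal from-scratch sketch of gradient boosting for regression with squared loss: each stage fits a depth-1 "stump" to the current residuals and adds a shrunken step, building a strong learner from weak ones. The case series is a synthetic stand-in, and the stage count and learning rate are assumed hyperparameters, not the paper's configuration.

```python
import numpy as np

def fit_stump(x, r):
    # Best single-split constant predictor (regression stump) for residuals r.
    best = (np.inf, None)
    for s in np.unique(x)[:-1]:
        left, right = r[x <= s], r[x > s]
        sse = left.var() * left.size + right.var() * right.size
        if sse < best[0]:
            best = (sse, (s, left.mean(), right.mean()))
    return best[1]

def gbr_fit(x, y, n_stages=200, lr=0.1):
    pred = np.full_like(y, y.mean())
    stumps = []
    for _ in range(n_stages):
        s, lv, rv = fit_stump(x, y - pred)        # fit stump to current residuals
        pred += lr * np.where(x <= s, lv, rv)     # shrunken update
        stumps.append((s, lv, rv))
    return y.mean(), stumps

def gbr_predict(model, x, lr=0.1):
    base, stumps = model
    out = np.full(x.shape, base)
    for s, lv, rv in stumps:
        out += lr * np.where(x <= s, lv, rv)
    return out

days = np.arange(130.0)                           # span like Jan 22 to May 30
cases = 50.0 * days ** 1.5                        # synthetic smooth growth curve
model = gbr_fit(days, cases)
rmse = np.sqrt(np.mean((gbr_predict(model, days) - cases) ** 2))
print(rmse)  # training RMSE, small relative to the case scale
```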
Abstract: Compositional data, such as relative information, are a crucial aspect of machine learning and related fields. They are typically recorded as closed data, i.e., summing to a constant such as 100%. The statistical linear model is the most commonly used technique for identifying hidden relationships between underlying random variables of interest. However, data quality is a significant challenge in machine learning, especially when missing data are present. The linear regression model is a commonly used statistical modeling technique applied in various fields to find relationships between variables of interest. When estimating linear regression parameters, which are useful for tasks such as future prediction and partial-effects analysis of independent variables, maximum likelihood estimation (MLE) is the method of choice. However, many datasets contain missing observations, which can lead to costly and time-consuming data recovery. To address this issue, the expectation-maximization (EM) algorithm has been suggested for situations involving missing data. The EM algorithm iteratively finds the best estimates of the parameters of statistical models that depend on unobserved variables or data, via maximum likelihood or maximum a posteriori (MAP) estimation. Using the current estimate as input, the expectation (E) step constructs the expected log-likelihood function, and the maximization (M) step finds the parameters that maximize it. This study examined how well the EM algorithm performs on a simulated compositional dataset with missing observations, using both robust least squares and ordinary least squares regression techniques. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-Nearest Neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
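A simplified EM-style sketch for missing data (illustrative, not the study's compositional setup): with some x2 values missing, the E-step fills each gap with its conditional mean given x1, and the M-step re-estimates the mean and covariance. A full EM would also add the conditional variance to the second moments; this conditional-mean version is kept minimal.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
x1 = rng.normal(size=n)
x2 = 1.0 + 0.8 * x1 + rng.normal(scale=0.5, size=n)
miss = rng.random(n) < 0.3                         # 30% of x2 missing

# Start from mean imputation of the observed x2 values.
obs_mean = np.where(miss, np.nan, x2)
x2_work = np.where(miss, np.nanmean(obs_mean), x2)
for _ in range(50):
    mu = np.array([x1.mean(), x2_work.mean()])     # M-step: moments of completed data
    cov = np.cov(x1, x2_work)
    cond = mu[1] + cov[0, 1] / cov[0, 0] * (x1 - mu[0])   # E-step: E[x2 | x1]
    x2_work = np.where(miss, cond, x2)

slope = cov[0, 1] / cov[0, 0]
print(slope)  # close to the true regression slope 0.8
```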