BACKGROUND: The worldwide spread of severe acute respiratory syndrome coronavirus 2 has raised concern about the mortality rate caused by the infection. The determinants of mortality on a global scale cannot be fully understood due to a lack of information. AIM: To identify key factors that may explain the variability in case lethality across countries. METHODS: We identified 21 potential risk factors for the coronavirus disease 2019 (COVID-19) case fatality rate (CFR) for all countries with available data. We examined the univariate relationship of each variable with CFR, and the relationships among all independent variables, to identify candidate variables for our final multiple regression model. Multiple regression analysis was used to assess the strength of the relationships. RESULTS: The mean COVID-19 mortality was 1.52 ± 1.72%. There was a statistically significant inverse correlation of health expenditure and of the number of computed tomography scanners per 1 million with CFR, and a significant direct correlation of literacy and of air pollution with CFR. The final model can predict approximately 97% of the changes in CFR. CONCLUSION: The current study suggests some new predictors that may explain the mortality rate. Thus, it could help decision-makers develop health policies to fight COVID-19.
Machine learning (ML) has powerful nonlinear processing and multivariate learning capabilities, so it has been widely utilised in the fatigue field. However, most ML methods are inexplicable black-box models that are difficult to apply in engineering practice. Symbolic regression (SR) is an interpretable machine learning method for determining the optimal fitting equation for datasets. In this study, domain knowledge-guided SR was used to determine a new fatigue crack growth (FCG) rate model. Three terms of the variable subtree of ΔK, R-ratio, and ΔK_th were obtained by analysing eight traditional semi-empirical FCG rate models. Based on the FCG rate test data from other literature, the SR model was constructed using Al-7055-T7511. It was subsequently extended to other alloys (Ti-10V-2Fe-3Al, Ti-6Al-4V, Cr-Mo-V, LC9cs, Al-6013-T651, and Al-2324-T3) using multiple linear regression. Compared with the three semi-empirical FCG rate models, the SR model yielded higher prediction accuracy. This result demonstrates the potential of domain knowledge-guided SR for building the FCG rate model.
In this paper, we study the strong consistency and convergence rate of the modified partitioning estimate of the regression function for identically distributed ψ-mixing samples.
To analyze the factors affecting the leakage rate of a water distribution system, we built a macroscopic "leakage rate–leakage factors" (LRLF) model. In this model, we consider the pipe attributes (quality, diameter, age), maintenance cost, valve replacement cost, and annual average pressure. Based on variable selection and principal component analysis results, we extracted three main principal components: the pipe attribute principal component (PAPC), the operation management principal component, and the water pressure principal component. Of these, we found PAPC to have the most influence. Using principal component regression, we established an LRLF model with no detectable serial correlations. The adjusted R^2 and RMSE values of the model were 0.717 and 2.067, respectively. This model represents a potentially useful tool for controlling leakage rate from the macroscopic viewpoint.
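The principal component regression workflow described in this abstract can be sketched as follows. This is a minimal illustration only: the function names (`pcr_fit`, `pcr_predict`, `adjusted_r2_and_rmse`) are hypothetical, the predictors are standardized before the decomposition as an assumption, and nothing here reproduces the authors' data or software.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Regress y on the leading principal components of standardized X."""
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T
    scores = Z @ V
    # Ordinary least squares of y on an intercept plus the component scores.
    A = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return {"mean": mean, "std": std, "V": V, "coef": coef}

def pcr_predict(model, X):
    """Project new rows onto the stored components and apply the fit."""
    Z = (X - model["mean"]) / model["std"]
    return model["coef"][0] + (Z @ model["V"]) @ model["coef"][1:]

def adjusted_r2_and_rmse(y, y_hat, n_params):
    """Adjusted R^2 (n_params predictors, intercept excluded) and RMSE."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)
    return adj, np.sqrt(ss_res / n)
```

Keeping three components, as in the abstract, would correspond to `pcr_fit(X, y, 3)`; with all components retained the fit coincides with ordinary least squares on the original predictors.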
Regression models for survival time data involve estimation of the hazard rate as a function of predictor variables and associated slope parameters. An adaptive approach is formulated for such hazard regression modeling. The hazard rate is modeled using fractional polynomials, that is, linear combinations of products of power transforms of time together with other available predictors. These fractional polynomial models are restricted to generating positive-valued hazard rates and decreasing survival times. Exponentially distributed survival times are a special case. Parameters are estimated using maximum likelihood estimation allowing for right censored survival times. Models are evaluated and compared using likelihood cross-validation (LCV) scores. LCV scores and tolerance parameters are used to control an adaptive search through alternative fractional polynomial hazard rate models to identify effective models for the underlying survival time data. These methods are demonstrated using two different survival time data sets including survival times for lung cancer patients and for multiple myeloma patients. For the lung cancer data, the hazard rate depends distinctly on time. However, controlling for cell type provides a distinct improvement while the hazard rate depends only on cell type and no longer on time. Furthermore, Cox regression is unable to identify a cell type effect. For the multiple myeloma data, the hazard rate also depends distinctly on time. Moreover, consideration of hemoglobin at diagnosis provides a distinct improvement, the hazard rate still depends distinctly on time, and hemoglobin distinctly moderates the effect of time on the hazard rate. These results indicate that adaptive hazard rate modeling can provide unique insights into survival time data.
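The exponential special case mentioned in this abstract admits a closed-form maximum likelihood estimate under right censoring, which makes it a convenient illustration. The sketch below is an assumption-laden example, not the authors' software: the function names are hypothetical, the fold assignment (index modulo k) is arbitrary, and the LCV score is written in one common form, the held-out likelihood normalized by the number of observations.

```python
import math

def exponential_loglik(rate, times, events):
    """Log-likelihood of a constant hazard `rate` with right censoring.

    events[i] is 1 for an observed death and 0 for a censored time.
    An event contributes log(rate) - rate*t; a censored time contributes -rate*t.
    """
    return sum((math.log(rate) if d else 0.0) - rate * t
               for t, d in zip(times, events))

def exponential_mle(times, events):
    """Closed-form MLE: number of events divided by total follow-up time."""
    return sum(events) / sum(times)

def lcv_score(times, events, k=2):
    """k-fold likelihood cross-validation score (assumed form).

    Each fold is scored with the rate estimated from the other folds, and the
    summed held-out log-likelihood is normalized by the sample size.
    A training fold with no events yields rate 0 and fails on held-out events.
    """
    n = len(times)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held = [i for i in range(n) if i % k == fold]
        rate = exponential_mle([times[i] for i in train],
                               [events[i] for i in train])
        total += exponential_loglik(rate,
                                    [times[i] for i in held],
                                    [events[i] for i in held])
    return math.exp(total / n)
```

A larger LCV score indicates a model that predicts held-out survival times better, which is the sense in which such scores can rank alternative hazard models.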
Recurrent event time data and more general multiple event time data are commonly analyzed using extensions of Cox regression, or proportional hazards regression, as used with single event time data. These methods treat covariates, either time-invariant or time-varying, as having multiplicative effects while general dependence on time is left un-estimated. An adaptive approach is formulated for analyzing multiple event time data. Conditional hazard rates are modeled in terms of dependence on both time and covariates using fractional polynomials restricted so that the conditional hazard rates are positive-valued and so that excess time probability functions (generalizing survival functions for single event times) are decreasing. Maximum likelihood is used to estimate parameters adjusting for right censored event times. Likelihood cross-validation (LCV) scores are used to compare models. Adaptive searches through alternate conditional hazard rate models are controlled by LCV scores combined with tolerance parameters. These searches identify effective models for the underlying multiple event time data. Conditional hazard regression is demonstrated using data on times between tumor recurrence for bladder cancer patients. Analyses of theory-based models for these data using extensions of Cox regression provide conflicting results on effects to treatment group and the initial number of tumors. On the other hand, fractional polynomial analyses of these theory-based models provide consistent results identifying significant effects to treatment group and initial number of tumors using both model-based and robust empirical tests. Adaptive analyses further identify distinct moderation by group of the effect of tumor order and an additive effect to group after controlling for nonlinear effects to initial number of tumors and tumor order. Results of example analyses indicate that adaptive conditional hazard rate modeling can generate useful insights into multiple event time data.
During underground coal gasification (UCG), whereby coal is converted to syngas in situ, a cavity is formed in the coal seam. The cavity growth rate (CGR), or the moving rate of the gasification face, is affected by controllable factors (operation pressure, gasification time, geometry of the UCG panel) and uncontrollable factors (coal seam properties). The CGR is usually predicted by mathematical models and laboratory experiments, which are time-consuming, cumbersome and expensive. In this paper, a new simple model for CGR is developed using non-linear regression analysis, based on data from 11 UCG field trials. The empirical model compares satisfactorily with the Perkins model and can reliably predict CGR.
Tunnel boring machine (TBM) technology has been widely applied for underground construction worldwide; however, how to keep the TBM tunneling process safe and efficient remains a major concern. Advance rate is a key parameter of TBM operation and reflects the TBM-ground interaction, for which a reliable prediction helps optimize the TBM performance. Here, we develop a hybrid neural network model, called Attention-ResNet-LSTM, for accurate prediction of the TBM advance rate. A database including geological properties and TBM operational parameters from the Yangtze River Natural Gas Pipeline Project is used to train and test this deep learning model. The evolutionary polynomial regression method is adopted to aid the selection of input parameters. The results of numerical experiments show that our Attention-ResNet-LSTM model outperforms other commonly used intelligent models with a lower root mean square error and a lower mean absolute percentage error. Further, parametric analyses are conducted to explore the effects of the sequence length of historical data and the model architecture on the prediction accuracy. A correlation analysis between the input and output parameters is also implemented to provide guidance for adjusting relevant TBM operational parameters. The performance of our hybrid intelligent model is demonstrated in a case study of TBM tunneling through a complex ground with variable strata. Finally, data collected from the Baimang River Tunnel Project in Shenzhen, China, are used to further test the generalization of our model. The results indicate that, compared to the conventional ResNet-LSTM model, our model has a better predictive capability for scenarios with unknown datasets due to its self-adaptive characteristic.
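The two error measures used in this abstract to compare models, root mean square error and mean absolute percentage error, can be computed as in the sketch below. The function names and the sample numbers in the usage note are illustrative assumptions and are unrelated to the project data.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so advance rates of exactly
    zero (machine stopped) would have to be filtered out beforehand.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)
```

For example, hypothetical advance rates `[10, 20, 40]` predicted as `[11, 18, 40]` give an RMSE of about 1.29 and a MAPE of about 6.67%.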
Let (X,Y) be an R^d×R^1 valued random vector, let (X_1,Y_1),…,(X_n,Y_n) be a random sample drawn from (X,Y), and let E|Y|<∞. The regression function m(x)=E(Y|X=x) for x∈R^d is estimated by a kernel estimate m_n(x), where h_n is a positive number depending only on n and K is a given nonnegative function on R^d. In this paper, we study the L_p convergence rate of the kernel estimate m_n(x) of m(x) under suitable conditions, and improve and extend the results of Wei Lansheng.
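The defining formula of the kernel estimate did not survive in this abstract. A common form for such an estimate is the Nadaraya-Watson ratio, m_n(x) = Σ Y_i K((x − X_i)/h_n) / Σ K((x − X_i)/h_n), and the sketch below assumes that form. The Gaussian kernel, the function name, and the convention of returning 0 when all weights vanish are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def kernel_regression(x, X, Y, h, kernel=None):
    """Nadaraya-Watson estimate of m(x) = E(Y | X = x).

    x: query point of shape (d,); X: sample of shape (n, d); Y: shape (n,);
    h: positive bandwidth (h_n in the abstract).
    """
    if kernel is None:
        # Gaussian kernel: a nonnegative function on R^d (assumed choice).
        kernel = lambda u: np.exp(-0.5 * np.sum(u ** 2, axis=-1))
    weights = kernel((x - X) / h)
    total = weights.sum()
    # Assumed convention: the ratio 0/0 is read as 0.
    return float(weights @ Y / total) if total > 0 else 0.0
```

A smaller bandwidth makes the estimate follow the nearest observations more closely, while a larger one averages over more of the sample; convergence rate results of the kind studied here describe how h_n should shrink with n.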
This study is the first attempt to investigate the relationship between the annual GDP growth rate and money laundering in the Republic of Albania during the period 2007-2011. The main results of the study are as follows: there is a negative correlation between money laundering and the economic growth rate in Albania during the specified period; there is a negative correlation between money laundering and imports; and there are positive correlations between money laundering and government expenditure, as well as between money laundering and exports.
In this paper, we investigate the nonparametric regression model based on ρ-mixing errors, which are stochastically dominated by a nonnegative random variable. We obtain the convergence rate in pth mean for the weighted estimator of the unknown function g(x), which yields the convergence rate in probability. Moreover, an example of the nearest neighbor estimator is given and the convergence rates of the estimator are presented.
The spatiotemporal distribution characteristics of the regression rate are crucial aspects of research on the Hybrid Rocket Motor (HRM). This study presents a pioneering effort in achieving a comprehensive numerical simulation of fluid dynamics and heat transfer in both the fluid and solid regions throughout the entire operation of an HRM. To accomplish this, a dynamic grid technique that incorporates fluid–solid coupling is utilized. To validate the precision of the numerical simulations, a firing test is conducted, with embedded thermocouple probes being used to measure the inner temperature of the fuel grain. The temperature variations in the solid fuel obtained from both experiment and simulations show good agreement. The maximum combustion temperature and average thrust obtained from the simulations deviate from the experimental results by only 3.3% and 2.4%, respectively. Thus, the transient numerical simulations accurately capture the fluid–solid coupling characteristics and the transient regression rate. The dynamic simulation results for the inner flow field and solid region throughout the entire working stage reveal that the presence of vortices enhances the blending of combustion gases and improves the regression rate at both the front and rear ends of the fuel grain. In addition, the oscillations of the regression rate obtained in the simulation correspond well with the corrugated surface observed in the experiment. Furthermore, the zero-dimension regression rate formula and the formula describing the axial location dependence of the regression rate are fitted from the simulation results, with corresponding coefficients of determination (R^2) of 0.9765 and 0.9298, respectively. This research serves as a reference for predicting the performance of HRMs with gaseous oxygen and polyethylene, and presents a credible way of investigating the spatiotemporal distribution of the regression rate.
Internal solitary wave propagation over a submarine ridge results in energy dissipation, in which the hydrodynamic interaction between the wave and the ridge affects the marine environment. This study analyzes the effects of ridge height and potential energy during wave-ridge interaction with binary and cumulative logistic regression models. In testing the global null hypothesis, all p-values are < 0.001 for the three test statistics (likelihood ratio, score, and Wald). Comparing the two kinds of models, the test values obtained by the cumulative logistic regression models are better than those obtained by the binary logistic regression models. Although this study employs the cumulative logistic regression model, three probability functions, p^1, p^2 and p^3, are used to investigate the weighted influence of factors on wave reflection. Deviance and Pearson tests are applied to check the goodness-of-fit of the proposed model. The analytical results demonstrate that both ridge height (X1) and potential energy (X2) significantly affect (p < 0.0001) the amplitude-based reflected rate; the p-values for the deviance and Pearson tests are both > 0.05 (0.2839 and 0.3438, respectively). That is, the goodness-of-fit between ridge height (X1) and potential energy (X2) can further predict parameters under the scenario of the best parsimonious model. Investigation of six measures of predictive power (R^2, max-rescaled R^2, Somers' D, Gamma, Tau-a, and c) indicates that the predictive estimates of the proposed model are better than those for ridge height alone, and are very similar to those for the interaction of ridge height and potential energy. It can be concluded that the goodness-of-fit and predictive ability of the cumulative logistic regression model are better than those of the binary logistic regression model.
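A cumulative logistic (proportional odds) model assigns probabilities to ordered outcome categories by differencing cumulative probabilities that share one set of slopes. The sketch below assumes the parameterization logit P(Y ≤ j | x) = α_j + β·x; the function name, the coefficients, and the use of two predictors standing in for ridge height and potential energy are hypothetical and do not come from the study.

```python
import math

def cumulative_logit_probs(intercepts, slopes, x):
    """Category probabilities under a cumulative logit model.

    intercepts: increasing cut-point intercepts alpha_1 < ... < alpha_{J-1}
    slopes: coefficients shared by every cut point (proportional odds)
    x: predictor values, same length as slopes
    Returns J probabilities that sum to 1.
    """
    eta = sum(b * v for b, v in zip(slopes, x))
    # Cumulative probabilities P(Y <= j); the last category closes at 1.
    cumulative = [1.0 / (1.0 + math.exp(-(a + eta))) for a in intercepts]
    cumulative.append(1.0)
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs
```

A binary logistic model is the special case with a single intercept, which is why the two model families can be compared on the same data after collapsing the outcome categories.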
In this paper we consider the empirical Bayes (EB) estimation problem for estimable function of regression coefficient in a multiple linear regression model Y=Xβ+e, where e with given β has a multivariate standard normal distribution. We get the EB estimators by using kernel estimation of multivariate density function and its first order partial derivatives. It is shown that the convergence rates of the EB estimators are under the condition where an integer k > 1 . is an arbitrary small number and m is the dimension of the vector Y.
It is well known that the nonparametric estimation of the regression function is highly sensitive to the presence of even a small proportion of outliers in the data. To solve the problem of atypical observations when the covariates of the nonparametric component are functional, robust estimates for the regression parameter and regression operator are introduced. The main purpose of the paper is to consider data-driven methods of selecting the number of neighbors in order to make the proposed processes fully automatic. We use the k Nearest Neighbors (kNN) procedure to construct the kernel estimator of the proposed robust model. Under some regularity conditions, we state consistency results for kNN functional estimators, which are uniform in the number of neighbors (UINN). Furthermore, a simulation study and an empirical application to a real data analysis of octane gasoline predictions are carried out to illustrate the higher predictive performance and the usefulness of the kNN approach.
Deformation modulus of rock mass is one of the input parameters to most rock engineering designs and constructions. The field tests for determination of deformation modulus are cumbersome, expensive and time-consuming. This has prompted the development of various regression equations to estimate deformation modulus from results of rock mass classifications, with rock mass rating (RMR) being one of the frequently used classifications. The regression equations are of different types ranging from linear to nonlinear functions like power and exponential. Bayesian method has recently been developed to incorporate regression equations into a Bayesian framework to provide better estimates of geotechnical properties. The question of whether Bayesian method improves the estimation of geotechnical properties in all circumstances remains open. Therefore, a comparative study was conducted to assess the performances of regression and Bayesian methods when they are used to characterize deformation modulus from the same set of RMR data obtained from two project sites. The study also investigated the performance of different types of regression equations in estimation of the deformation modulus. Statistics, probability distributions and prediction indicators were used to assess the performances of regression and Bayesian methods and different types of regression equations. It was found that power and exponential types of regression equations provide a better estimate than linear regression equations. In addition, it was discovered that the ability of the Bayesian method to provide better estimates of deformation modulus than regression method depends on the quality and quantity of input data as well as the type of the regression equation.
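The three families of regression equations compared in this abstract (linear, power, and exponential) can all be fitted with ordinary least squares after a suitable log transform. The sketch below is illustrative only: the function names are hypothetical, the nonlinear forms are fitted on the log scale (an assumed simplification that weights errors differently from a direct nonlinear fit), and no RMR data from the study are used.

```python
import numpy as np

def fit_linear(x, y):
    """E = a + b * RMR."""
    b, a = np.polyfit(x, y, 1)
    return lambda t: a + b * t

def fit_power(x, y):
    """E = a * RMR**b, fitted as a straight line in log-log space."""
    b, log_a = np.polyfit(np.log(x), np.log(y), 1)
    return lambda t: np.exp(log_a) * t ** b

def fit_exponential(x, y):
    """E = a * exp(b * RMR), fitted as a straight line in semi-log space."""
    b, log_a = np.polyfit(x, np.log(y), 1)
    return lambda t: np.exp(log_a) * np.exp(b * t)

def rmse(y, y_hat):
    """Root mean square error on the original (untransformed) scale."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))
```

Comparing `rmse` across the three fitted functions on the same data gives one simple prediction indicator of the kind the abstract refers to; both log-based fits require strictly positive inputs.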
Purpose: To formulate and demonstrate methods for regression modeling of probabilities and dispersions for individual-patient longitudinal outcomes taking on discrete numeric values. Methods: Three alternatives for modeling of outcome probabilities are considered. Multinomial probabilities are based on different intercepts and slopes for probabilities of different outcome values. Ordinal probabilities are based on different intercepts and the same slope for probabilities of different outcome values. Censored Poisson probabilities are based on the same intercept and slope for probabilities of different outcome values. Parameters are estimated with extended linear mixed modeling maximizing a likelihood-like function based on the multivariate normal density that accounts for within-patient correlation. Formulas are provided for gradient vectors and Hessian matrices for estimating model parameters. The likelihood-like function is also used to compute cross-validation scores for alternative models and to control an adaptive modeling process for identifying possibly nonlinear functional relationships in predictors for probabilities and dispersions. Example analyses are provided of daily pain ratings for a cancer patient over a period of 97 days. Results: The censored Poisson approach is preferable for modeling these data, and presumably other data sets of this kind, because it generates a competitive model with fewer parameters in less time than the other two approaches. The generated probabilities for this model are distinctly nonlinear in time while the dispersions are distinctly nonconstant over time, demonstrating the need for adaptive modeling of such data. The analyses also address the dependence of these daily pain ratings on time and the daily numbers of pain flares. Probabilities and dispersions change differently over time for different numbers of pain flares. Conclusions: Adaptive modeling of daily pain ratings for individual cancer patients is an effective way to identify nonlinear relationships in time as well as in other predictors such as the number of pain flares.
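One way to read "censored Poisson probabilities" for an outcome with a fixed maximum value is a Poisson distribution whose upper tail is collected at the maximum, so that a single intercept and slope (through the Poisson mean) determine every outcome probability. The sketch below assumes that reading, with a log-linear mean; the function name, the coefficients in the usage note, and the 0 to 10 rating scale are illustrative assumptions rather than details confirmed by the abstract.

```python
import math

def censored_poisson_probs(mean, max_value):
    """Probabilities for outcomes 0..max_value under a right-censored Poisson.

    Values below max_value use the ordinary Poisson probability; the
    probability of max_value collects the whole upper tail P(Y >= max_value).
    """
    probs = [math.exp(-mean) * mean ** k / math.factorial(k)
             for k in range(max_value)]
    probs.append(1.0 - sum(probs))
    return probs
```

For instance, with a hypothetical intercept 0.2 and slope 0.01 on day 50, the mean is `math.exp(0.2 + 0.01 * 50)` and `censored_poisson_probs(mean, 10)` returns eleven probabilities for ratings 0 through 10.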
The kinetics equation of the deposition rate was implemented to help explain some of the mechanisms responsible for structures observed during the deposition of CoFeB films on polyester plastic. The plating rate of electroless CoFeB films is a function of the concentration of sodium tetrahydroborate, the pH of the plating bath, the plating temperature and the metallic ratio. The estimated regression coefficients, confidence intervals and residual errors were confirmed by a computer program. The optimal composition of the plating bath was obtained, and the dynamic electromagnetic parameters of the films were measured in the 2-10 GHz range. At 2 GHz, the permeability and magnetic loss of the electroless CoFeB films were 304 and 76.6, respectively, when the concentration of the reducer was 1 g·L^-1.
On the basis of experimental observations on animals, applications to clinical data on patients and theoretical statistical reasoning, the author developed a computer-assisted general mathematical model consisting of the 'probacent'-probability equation, Equation (1), and the death rate (mortality probability) equation, Equation (2), derivable from Equation (1), which may be applicable as a general approximation method to make useful predictions of probable outcomes in a variety of biomedical phenomena [1-4]. Equations (1) and (2) contain a constant, γ and c, respectively. In the previous studies, the author used the least maximum-difference principle to determine these constants that were expected to best fit reported data, minimizing the deviation. In this study, the author uses the method of computer-assisted least sum of squares to determine the constants γ and c in constructing the 'probacent'-related formulas best fitting the NCHS-reported data on survival probabilities and death rates in the US total adult population for 2001. The results of this study reveal that the method of computer-assisted mathematical analysis with the least sum of squares seems to be simpler, more accurate, more convenient and preferable to the previously used least maximum-difference principle, and better fits the NCHS-reported data on survival probabilities and death rates in the US total adult population. The computer program of curved regression for the 'probacent'-probability and death rate equations may be helpful in research in biomedicine.
Funding (fatigue crack growth rate model study): Supported by the Sichuan Provincial Science and Technology Program (Grant No. 2022YFH0075), the Opening Project of State Key Laboratory of Performance Monitoring and Protecting of Rail Transit Infrastructure (Grant No. HJGZ2021113), and the Independent Research Project of State Key Laboratory of Traction Power (Grant No. 2022TPL_T03).
Funding (ψ-mixing partitioning estimation study): The Science Research Foundation (041002F) of Hefei University of Technology.
Funding (water distribution leakage rate study): Supported by the Ministry of Science and Technology of China (No. 2014ZX07203-009), the Fundamental Research Funds for the Central Universities, and the Program for New Century Excellent Talents at the University of China.
Abstract: To analyze the factors affecting the leakage rate of a water distribution system, we built a macroscopic "leakage rate–leakage factors" (LRLF) model. In this model, we consider the pipe attributes (quality, diameter, age), maintenance cost, valve replacement cost, and annual average pressure. Based on variable selection and principal component analysis results, we extracted three main principal components: the pipe attribute principal component (PAPC), the operation management principal component, and the water pressure principal component. Of these, we found PAPC to have the most influence. Using principal component regression, we established an LRLF model with no detectable serial correlations. The adjusted R^2 and RMSE values of the model were 0.717 and 2.067, respectively. This model represents a potentially useful tool for controlling leakage rate from the macroscopic viewpoint.
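The principal component regression step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the standardization of predictors, and the RMSE definition (square root of the mean squared residual) are assumptions made here for the example.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression: standardize, project, then OLS.

    All names here are hypothetical; the abstract does not give code.
    """
    mu, sigma = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mu) / sigma
    # Rows of Vt are the principal directions of the standardized predictors.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T
    scores = Z @ V
    # Ordinary least squares of y on an intercept plus the retained scores.
    A = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return {"mu": mu, "sigma": sigma, "V": V, "coef": coef}

def pcr_predict(model, X):
    Z = (X - model["mu"]) / model["sigma"]
    return model["coef"][0] + (Z @ model["V"]) @ model["coef"][1:]

def adjusted_r2_and_rmse(y, y_hat, n_params):
    """n_params counts the regression slopes, excluding the intercept."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)
    rmse = np.sqrt(ss_res / n)  # assumed definition of RMSE
    return adj, rmse
```

With three retained components, as in the abstract, one would call `pcr_fit(X, y, 3)` and pass `n_params=3` to `adjusted_r2_and_rmse`.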
Abstract: Regression models for survival time data involve estimation of the hazard rate as a function of predictor variables and associated slope parameters. An adaptive approach is formulated for such hazard regression modeling. The hazard rate is modeled using fractional polynomials, that is, linear combinations of products of power transforms of time together with other available predictors. These fractional polynomial models are restricted to generating positive-valued hazard rates and decreasing survival times. Exponentially distributed survival times are a special case. Parameters are estimated using maximum likelihood estimation allowing for right censored survival times. Models are evaluated and compared using likelihood cross-validation (LCV) scores. LCV scores and tolerance parameters are used to control an adaptive search through alternative fractional polynomial hazard rate models to identify effective models for the underlying survival time data. These methods are demonstrated using two different survival time data sets including survival times for lung cancer patients and for multiple myeloma patients. For the lung cancer data, the hazard rate depends distinctly on time. However, controlling for cell type provides a distinct improvement while the hazard rate depends only on cell type and no longer on time. Furthermore, Cox regression is unable to identify a cell type effect. For the multiple myeloma data, the hazard rate also depends distinctly on time. Moreover, consideration of hemoglobin at diagnosis provides a distinct improvement, the hazard rate still depends distinctly on time, and hemoglobin distinctly moderates the effect of time on the hazard rate. These results indicate that adaptive hazard rate modeling can provide unique insights into survival time data.
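The exponential special case mentioned in this abstract has a closed-form maximum likelihood estimate under right censoring, which makes a compact illustration. The sketch below covers only that constant-hazard case, not the adaptive fractional polynomial search; the function names are hypothetical.

```python
import math

def exponential_hazard_mle(times, events):
    """MLE of a constant hazard rate from right-censored data.

    times  : observed follow-up times (event time or censoring time)
    events : 1 if the event was observed, 0 if right censored
    """
    total_time = sum(times)
    if total_time <= 0:
        raise ValueError("total follow-up time must be positive")
    # Observed events divided by total time at risk.
    return sum(events) / total_time

def log_likelihood(rate, times, events):
    """Log-likelihood for a positive constant hazard `rate`."""
    # An observed event contributes log h(t) = log(rate); every subject
    # contributes log S(t) = -rate * t.
    return sum(e * math.log(rate) - rate * t for t, e in zip(times, events))
```

For example, with follow-up times `[2.0, 3.0, 5.0, 10.0]` and event indicators `[1, 1, 0, 1]`, the estimate is three events over 20 time units, and the log-likelihood at that rate is higher than at nearby rates.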
Abstract: Recurrent event time data and more general multiple event time data are commonly analyzed using extensions of Cox regression, or proportional hazards regression, as used with single event time data. These methods treat covariates, either time-invariant or time-varying, as having multiplicative effects while general dependence on time is left un-estimated. An adaptive approach is formulated for analyzing multiple event time data. Conditional hazard rates are modeled in terms of dependence on both time and covariates using fractional polynomials restricted so that the conditional hazard rates are positive-valued and so that excess time probability functions (generalizing survival functions for single event times) are decreasing. Maximum likelihood is used to estimate parameters adjusting for right censored event times. Likelihood cross-validation (LCV) scores are used to compare models. Adaptive searches through alternate conditional hazard rate models are controlled by LCV scores combined with tolerance parameters. These searches identify effective models for the underlying multiple event time data. Conditional hazard regression is demonstrated using data on times between tumor recurrence for bladder cancer patients. Analyses of theory-based models for these data using extensions of Cox regression provide conflicting results on effects to treatment group and the initial number of tumors. On the other hand, fractional polynomial analyses of these theory-based models provide consistent results identifying significant effects to treatment group and initial number of tumors using both model-based and robust empirical tests. Adaptive analyses further identify distinct moderation by group of the effect of tumor order and an additive effect to group after controlling for nonlinear effects to initial number of tumors and tumor order.
Results of example analyses indicate that adaptive conditional hazard rate modeling can generate useful insights into multiple event time data.
Abstract: During underground coal gasification (UCG), whereby coal is converted to syngas in situ, a cavity is formed in the coal seam. The cavity growth rate (CGR), or the moving rate of the gasification face, is affected by controllable (operation pressure, gasification time, geometry of UCG panel) and uncontrollable (coal seam properties) factors. The CGR is usually predicted by mathematical models and laboratory experiments, which are time-consuming, cumbersome and expensive. In this paper, a new simple model for CGR is developed using non-linear regression analysis, based on data from 11 UCG field trials. The empirical model compares satisfactorily with the Perkins model and can reliably predict CGR.
Funding: The research was supported by the National Natural Science Foundation of China (Grant No. 52008307) and the Shanghai Science and Technology Innovation Program (Grant No. 19DZ1201004). The third author would like to acknowledge the funding by the China Postdoctoral Science Foundation (Grant No. 2023M732670).
Abstract: The technology of tunnel boring machine (TBM) has been widely applied for underground construction worldwide; however, how to keep the TBM tunneling process safe and efficient remains a major concern. Advance rate is a key parameter of TBM operation and reflects the TBM-ground interaction, for which a reliable prediction helps optimize the TBM performance. Here, we develop a hybrid neural network model, called Attention-ResNet-LSTM, for accurate prediction of the TBM advance rate. A database including geological properties and TBM operational parameters from the Yangtze River Natural Gas Pipeline Project is used to train and test this deep learning model. The evolutionary polynomial regression method is adopted to aid the selection of input parameters. The results of numerical experiments show that our Attention-ResNet-LSTM model outperforms other commonly used intelligent models with a lower root mean square error and a lower mean absolute percentage error. Further, parametric analyses are conducted to explore the effects of the sequence length of historical data and the model architecture on the prediction accuracy. A correlation analysis between the input and output parameters is also implemented to provide guidance for adjusting relevant TBM operational parameters. The performance of our hybrid intelligent model is demonstrated in a case study of TBM tunneling through a complex ground with variable strata. Finally, data collected from the Baimang River Tunnel Project in Shenzhen, China, are used to further test the generalization of our model. The results indicate that, compared to the conventional ResNet-LSTM model, our model has a better predictive capability for scenarios with unknown datasets due to its self-adaptive characteristic.
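The two error measures used in this abstract to compare models, root mean square error and mean absolute percentage error, can be computed as in the sketch below. The function names and the choice to reject zero-valued targets in the percentage error are assumptions for illustration; the abstract does not describe how the authors handled such cases.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if np.any(y_true == 0):
        # The ratio is undefined for zero targets (assumed handling).
        raise ValueError("MAPE is undefined when a true value is zero")
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))
```

Because the percentage error is scale-free, it is a natural companion to RMSE when advance rates vary strongly between strata.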
Abstract: Let (X,Y) be an R^d×R^1 valued random vector, let (X_1,Y_1),…,(X_n,Y_n) be a random sample drawn from (X,Y), and let E|Y|<∞. The regression function m(x)=E(Y|X=x) for x∈R^d is estimated by the kernel estimate m_n(x), where h_n is a positive number depending upon n only and K is a given nonnegative function on R^d. In this paper, we study the L_p convergence rate of the kernel estimate m_n(x) of m(x) under suitable conditions, and improve and extend the results of Wei Lansheng.
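The defining formula of the estimate is not reproduced in this record. As an illustration only, the sketch below assumes the standard kernel (Nadaraya-Watson) form, m_n(x) = Σ_i Y_i K((X_i − x)/h_n) / Σ_j K((X_j − x)/h_n), with a Gaussian kernel; both the form and the kernel are assumptions rather than details confirmed by the abstract, and the function name is hypothetical.

```python
import numpy as np

def kernel_regression(x, X, Y, h):
    """Kernel estimate of m(x) = E(Y | X = x) with bandwidth h > 0.

    X : sample of shape (n,) or (n, d);  Y : responses of shape (n,)
    Uses a Gaussian kernel, one example of a nonnegative K on R^d.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]                      # treat as n points in R^1
    x = np.asarray(x, dtype=float).reshape(-1)
    Y = np.asarray(Y, dtype=float)
    u = (X - x) / h
    weights = np.exp(-0.5 * np.sum(u ** 2, axis=1))
    total = weights.sum()
    if total == 0.0:
        return float("nan")                 # assumed convention for 0/0
    return float(weights @ Y / total)
```

The bandwidth `h` plays the role of h_n: smaller values make the estimate more local, and the convergence rate results concern how h_n should shrink as n grows.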
Abstract: This study is the first attempt to investigate the relationship between the annual GDP growth rate and money laundering in the Republic of Albania during the period 2007-2011. The main results of the study are as follows: there is a negative correlation between money laundering and the economic growth rate in Albania during the specified period; there is a negative correlation between money laundering and imports; and there is a positive correlation between money laundering and government expenditure, as well as a positive correlation between money laundering and exports.
Funding: Supported by National Natural Science Foundation of China (11426032, 11501005); Natural Science Foundation of Anhui Province (1408085QA02, 1508085QA01, 1508085J06); Provincial Natural Science Research Project of Anhui Colleges (KJ2014A010, KJ2014A020, KJ2015A065); Higher Education Talent Revitalization Project of Anhui Province (2013SQRL005ZD); Quality Engineering Project of Anhui Province (2015jyxm054, 2015jyxm057); Students Science Research Training Program of Anhui University (KYXL2014016, KYXL2014013); Applied Teaching Model Curriculum of Anhui University (XJYYKC1401, ZLTS2015052, ZLTS2015053); Doctoral Research Start-up Funds Projects of Anhui University.
Abstract: In this paper, we investigate the nonparametric regression model based on ρ-mixing errors, which are stochastically dominated by a nonnegative random variable. We obtain the convergence rate for the weighted estimator of the unknown function g(x) in pth-mean, which yields the convergence rate in probability. Moreover, an example of the nearest neighbor estimator is also illustrated and the convergence rates of the estimator are presented.
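A nearest neighbor estimator is one instance of a weighted estimator of the form g_n(x) = Σ_i W_ni(x) Y_i. The sketch below assumes a one-dimensional fixed design and uniform weights 1/k on the k nearest design points; these choices and the function names are illustrative assumptions, since the abstract does not state the exact weights used.

```python
import numpy as np

def knn_weights(x, design, k):
    """Weights W_ni(x): 1/k on the k design points closest to x, else 0."""
    design = np.asarray(design, dtype=float)
    if not 1 <= k <= len(design):
        raise ValueError("k must be between 1 and the number of design points")
    # A stable sort breaks distance ties in favor of the earlier index.
    order = np.argsort(np.abs(design - x), kind="stable")
    w = np.zeros(len(design))
    w[order[:k]] = 1.0 / k
    return w

def knn_estimate(x, design, y, k):
    """Weighted estimator g_n(x) = sum_i W_ni(x) * Y_i."""
    return float(knn_weights(x, design, k) @ np.asarray(y, dtype=float))
```

The weights are nonnegative and sum to one, which is the kind of structure typically required of W_ni(x) in convergence results for weighted estimators.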
Funding: Supported by the National Natural Science Foundation of China (No. U20B2034).
Abstract: The spatiotemporal distribution characteristics of the regression rate are crucial aspects of the research on the Hybrid Rocket Motor (HRM). This study presents a pioneering effort in achieving a comprehensive numerical simulation of fluid dynamics and heat transfer in both the fluid and solid regions throughout the entire operation of an HRM. To accomplish this, a dynamic grid technique that incorporates fluid–solid coupling is utilized. To validate the precision of the numerical simulations, a firing test is conducted, with embedded thermocouple probes being used to measure the inner temperature of the fuel grain. The temperature variations in the solid fuel obtained from both experiment and simulations show good agreement. The maximum combustion temperature and average thrust obtained from the simulations are found to deviate from the experimental results by only 3.3% and 2.4%, respectively. Thus, it can be demonstrated that transient numerical simulations accurately capture the fluid–solid coupling characteristics and transient regression rate. The dynamic simulation results of the inner flow field and solid region throughout the entire working stage reveal that the presence of vortices enhances the blending of combustion gases and improves the regression rate at both the front and rear ends of the fuel grain. In addition, oscillations of the regression rate obtained in the simulation also correspond well with the corrugated surface observed in the experiment. Furthermore, the zero-dimension regression rate formula and the formula describing the axial location dependence of the regression rate are fitted from the simulation results, with corresponding coefficients of determination (R^2) of 0.9765 and 0.9298, respectively. This research serves as a reference for predicting the performance of HRMs with gaseous oxygen and polyethylene, and presents a credible way of investigating the spatiotemporal distribution of the regression rate.
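The abstract does not give the fitted formulas. As a hedged illustration of how such a fit and its coefficient of determination might be obtained, the sketch below assumes the classical power-law form r = a·G^n relating regression rate r to oxidizer mass flux G; this form, the log-log least squares approach, and all names are assumptions, not the authors' stated method.

```python
import numpy as np

def fit_power_law(g_ox, r_dot):
    """Fit r_dot = a * g_ox**n by least squares in log-log space.

    Both inputs must be strictly positive. Returns (a, n).
    """
    log_g = np.log(np.asarray(g_ox, dtype=float))
    log_r = np.log(np.asarray(r_dot, dtype=float))
    n, log_a = np.polyfit(log_g, log_r, 1)  # slope first, then intercept
    return float(np.exp(log_a)), float(n)

def r_squared(y, y_hat):
    """Coefficient of determination R^2 of predictions y_hat for data y."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

Note that fitting in log space minimizes relative rather than absolute errors, so the R^2 computed on the original scale can differ from the one obtained on the log scale.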
Funding: This paper was financially supported by NSC96-2628-E-366-004-MY2 and NSC96-2628-E-132-001-MY2.
Abstract: Internal solitary wave propagation over a submarine ridge results in energy dissipation, in which the hydrodynamic interaction between a wave and ridge affects the marine environment. This study analyzes the effects of ridge height and potential energy during wave-ridge interaction with binary and cumulative logistic regression models. In testing the global null hypothesis, all values are p < 0.001 with three statistical methods: Likelihood Ratio, Score, and Wald. When the two kinds of models are compared, the test values obtained by the cumulative logistic regression models are better than those obtained by the binary logistic regression models. Although this study employed a cumulative logistic regression model, three probability functions, p^1, p^2 and p^3, are utilized for investigating the weighted influence of factors on wave reflection. Deviance and Pearson tests are applied to check the goodness-of-fit of the proposed model. The analytical results demonstrated that both ridge height (X1) and potential energy (X2) significantly impact (p < 0.0001) the amplitude-based reflected rate; the p-values for the deviance and Pearson tests are both > 0.05 (0.2839 and 0.3438, respectively). That is, the goodness-of-fit between ridge height (X1) and potential energy (X2) can further predict parameters under the scenario of the best parsimonious model. Investigation of six measures of predictive power (R^2, max-rescaled R^2, Somers' D, Gamma, Tau-a, and c) indicates that these predictive estimates of the proposed model have better predictive ability than ridge height alone, and are very similar to the interaction of ridge height and potential energy. It can be concluded that the goodness-of-fit and prediction ability of the cumulative logistic regression model are better than those of the binary logistic regression model.
Abstract: In this paper we consider the empirical Bayes (EB) estimation problem for an estimable function of the regression coefficient in a multiple linear regression model Y=Xβ+e, where e with given β has a multivariate standard normal distribution. We get the EB estimators by using kernel estimation of the multivariate density function and its first order partial derivatives. It is shown that the convergence rates of the EB estimators are under the condition where an integer k > 1, is an arbitrary small number and m is the dimension of the vector Y.
Abstract: It is well known that the nonparametric estimation of the regression function is highly sensitive to the presence of even a small proportion of outliers in the data. To solve the problem of atypical observations when the covariates of the nonparametric component are functional, robust estimates for the regression parameter and regression operator are introduced. The main purpose of the paper is to consider data-driven methods of selecting the number of neighbors in order to make the proposed processes fully automatic. We use the k Nearest Neighbors procedure (kNN) to construct the kernel estimator of the proposed robust model. Under some regularity conditions, we state consistency results for kNN functional estimators, which are uniform in the number of neighbors (UINN). Furthermore, a simulation study and an empirical application to a real data analysis of octane gasoline predictions are carried out to illustrate the higher predictive performance and the usefulness of the kNN approach.
Abstract: The deformation modulus of rock mass is one of the input parameters to most rock engineering designs and constructions. The field tests for determination of deformation modulus are cumbersome, expensive and time-consuming. This has prompted the development of various regression equations to estimate deformation modulus from results of rock mass classifications, with rock mass rating (RMR) being one of the frequently used classifications. The regression equations are of different types ranging from linear to nonlinear functions like power and exponential. The Bayesian method has recently been developed to incorporate regression equations into a Bayesian framework to provide better estimates of geotechnical properties. The question of whether the Bayesian method improves the estimation of geotechnical properties in all circumstances remains open. Therefore, a comparative study was conducted to assess the performances of regression and Bayesian methods when they are used to characterize deformation modulus from the same set of RMR data obtained from two project sites. The study also investigated the performance of different types of regression equations in estimation of the deformation modulus. Statistics, probability distributions and prediction indicators were used to assess the performances of regression and Bayesian methods and different types of regression equations. It was found that power and exponential types of regression equations provide a better estimate than linear regression equations. In addition, it was discovered that the ability of the Bayesian method to provide better estimates of deformation modulus than the regression method depends on the quality and quantity of input data as well as the type of the regression equation.
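The comparison of linear, power, and exponential regression equations described in this abstract can be sketched as below. The specific equation forms, the use of log-transformed least squares for the nonlinear types, and the function names are assumptions for illustration; the abstract does not list the equations or data the study used.

```python
import numpy as np

def fit_linear(x, y):
    """y = a + b*x, ordinary least squares."""
    b, a = np.polyfit(x, y, 1)
    return lambda t: a + b * np.asarray(t, dtype=float)

def fit_power(x, y):
    """y = a * x**b, fitted on the log-log scale (x, y > 0)."""
    b, log_a = np.polyfit(np.log(x), np.log(y), 1)
    return lambda t: np.exp(log_a) * np.asarray(t, dtype=float) ** b

def fit_exponential(x, y):
    """y = a * exp(b*x), fitted on the semi-log scale (y > 0)."""
    b, log_a = np.polyfit(x, np.log(y), 1)
    return lambda t: np.exp(log_a + b * np.asarray(t, dtype=float))

def rmse(y, y_hat):
    """Root mean square error, one possible prediction indicator."""
    y = np.asarray(y, dtype=float)
    return float(np.sqrt(np.mean((y - np.asarray(y_hat)) ** 2)))
```

Given paired arrays of RMR values and measured moduli, each fitting function returns a predictor that can be scored with `rmse` on the same data, which mirrors the kind of side-by-side comparison the study reports.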
Abstract: Purpose: To formulate and demonstrate methods for regression modeling of probabilities and dispersions for individual-patient longitudinal outcomes taking on discrete numeric values. Methods: Three alternatives for modeling of outcome probabilities are considered. Multinomial probabilities are based on different intercepts and slopes for probabilities of different outcome values. Ordinal probabilities are based on different intercepts and the same slope for probabilities of different outcome values. Censored Poisson probabilities are based on the same intercept and slope for probabilities of different outcome values. Parameters are estimated with extended linear mixed modeling maximizing a likelihood-like function based on the multivariate normal density that accounts for within-patient correlation. Formulas are provided for gradient vectors and Hessian matrices for estimating model parameters. The likelihood-like function is also used to compute cross-validation scores for alternative models and to control an adaptive modeling process for identifying possibly nonlinear functional relationships in predictors for probabilities and dispersions. Example analyses are provided of daily pain ratings for a cancer patient over a period of 97 days. Results: The censored Poisson approach is preferable for modeling these data, and presumably other data sets of this kind, because it generates a competitive model with fewer parameters in less time than the other two approaches. The generated probabilities for this model are distinctly nonlinear in time while the dispersions are distinctly nonconstant over time, demonstrating the need for adaptive modeling of such data. The analyses also address the dependence of these daily pain ratings on time and the daily numbers of pain flares. Probabilities and dispersions change differently over time for different numbers of pain flares.
Conclusions: Adaptive modeling of daily pain ratings for individual cancer patients is an effective way to identify nonlinear relationships in time as well as in other predictors such as the number of pain flares.
Funding: The National Natural Science Foundation of China (No. 50371029).
Abstract: The kinetic equation of the deposition rate was used to help explain some of the mechanisms responsible for structures observed during the deposition of CoFeB films on polyester plastic. The plating rate of electroless CoFeB films is a function of the concentration of sodium tetrahydroborate, the pH of the plating bath, the plating temperature, and the metallic ratio. The estimated regression coefficients and their confidence intervals, and the residual errors and their confidence intervals, were confirmed by computer program. The optimal composition of the plating bath was obtained, and the dynamic electromagnetic parameters of the films were measured in the 2-10 GHz range. At 2 GHz, the permeability and magnetic loss of the electroless CoFeB films were 304 and 76.6, respectively, when the concentration of the reducer was 1 g·L^-1.
Abstract: On the basis of experimental observations on animals, applications to clinical data on patients and theoretical statistical reasoning, the author developed a computer-assisted general mathematical model of the 'probacent'-probability equation, Equation (1), and the death rate (mortality probability) equation, Equation (2), derivable from Equation (1), that may be applicable as a general approximation method to make useful predictions of probable outcomes in a variety of biomedical phenomena [1-4]. Equations (1) and (2) contain a constant, γ and c, respectively. In the previous studies, the author used the least maximum-difference principle to determine these constants that were expected to best fit reported data, minimizing the deviation. In this study, the author uses the method of computer-assisted least sum of squares to determine the constants γ and c in constructing the 'probacent'-related formulas best fitting the NCHS-reported data on survival probabilities and death rates in the US total adult population for 2001. The results of this study reveal that the method of computer-assisted mathematical analysis with the least sum of squares seems to be simpler, more accurate, more convenient, and preferable to the previously used least maximum-difference principle, and to fit better the NCHS-reported data on survival probabilities and death rates in the US total adult population. The computer program of curved regression for the 'probacent'-probability and death rate equations may be helpful in research in biomedicine.