The global coronavirus disease 2019 (COVID-19) pandemic has significantly affected tourism, especially in Spain, which was among the first countries to be affected and is among the world's biggest tourist destinations. Stock market values respond to the evolution of the pandemic, especially in the case of tourism companies. Quantifying this relationship therefore makes it possible to predict the effect of the pandemic on shares in the tourism sector, improving the response of policymakers and investors to the crisis. Accordingly, a dynamic regression model was developed to predict the medium-term behavior of shares in the Spanish tourism sector according to the evolution of the COVID-19 pandemic. The results confirm that both the number of deaths and the number of cases are good predictors of abnormal stock prices in the tourism sector.
Remaining useful life (RUL) prediction is one of the most crucial elements in prognostics and health management (PHM). To address imperfect prior information, this paper proposes an RUL prediction method based on a nonlinear random coefficient regression (RCR) model that fuses failure time data. First, some properties of parameter estimation under the nonlinear RCR model are given. Based on these properties, the failure time data can reasonably be fused as prior information. Specifically, the fixed parameters are calculated from the field degradation data of the evaluated equipment, and the prior information on the random coefficient is estimated by fusing the failure time data of congeneric equipment. Then, the prior information on the random coefficient is updated online under the Bayesian framework, and the probability density function (PDF) of the RUL is derived with the failure threshold taken into account. Finally, two case studies are used for experimental verification. Compared with the traditional Bayesian method, the proposed method effectively reduces the influence of imperfect prior information and improves the accuracy of RUL prediction.
In the era of big data, traditional regression models cannot handle uncertain big data efficiently and accurately. To make up for this deficiency, this paper proposes a quantum fuzzy regression model, which uses fuzzy theory to describe the uncertainty in big data sets and uses quantum computing to exponentially improve the efficiency of data set preprocessing and parameter estimation. Data envelopment analysis (DEA) is used to calculate the degree of importance of each data point, while the Harrow, Hassidim and Lloyd (HHL) algorithm and quantum swap circuits are used to improve the efficiency of high-dimensional data matrix calculation. Applying the quantum fuzzy regression model to small-scale financial data shows that its accuracy is greatly improved compared with the quantum regression model. Moreover, owing to the introduction of quantum computing, the speed of processing high-dimensional data matrices improves exponentially compared with the fuzzy regression model. The proposed model combines the advantages of fuzzy theory and quantum computing: it can efficiently process high-dimensional data matrices and complete parameter estimation using quantum computing while retaining the uncertainty in big data. It is thus a new model for efficient and accurate big data processing in uncertain environments.
Cyber losses, measured as the number of records breached in cyber incidents, commonly feature a significant portion of zeros together with distinct characteristics for mid-range and large losses, which makes it hard to model the whole range of losses using a standard loss distribution. We tackle this modeling problem by proposing a three-component spliced regression model that can simultaneously model zero, moderate and large losses and account for heterogeneous effects in the mixture components. To apply the proposed model to the Privacy Rights Clearinghouse (PRC) data breach chronology, we segment geographical groups using unsupervised cluster analysis and use a covariate-dependent probability to model zero losses, finite mixture distributions for the moderate body, and an extreme value distribution for large losses to capture the heavy-tailed nature of the loss data. Parameters and coefficients are estimated using the Expectation-Maximization (EM) algorithm. Combining this with our frequency model for data breaches (a generalized linear mixed model), we investigate aggregate loss distributions and discuss applications to cyber insurance pricing and risk management.
Social networks are the mainstream medium of information dissemination today, and it is particularly important to predict their propagation patterns accurately. In this paper, we introduce a social network propagation model that integrates multiple linear regression with an infectious disease model. First, we propose features that affect social network communication along three dimensions. Then, we predict node influence via multiple linear regression. Lastly, we use the node influence as the state transition of the infectious disease model to predict the trend of information dissemination in social networks. Experimental results on a real social network dataset show that the model's predictions are consistent with the actual information dissemination trends.
For the composition analysis and identification of ancient glass products, L1 regularization, K-Means cluster analysis, the elbow rule and other methods were combined to build logistic regression, cluster analysis, hyper-parameter test and other models. SPSS, Python and other tools were used to obtain the classification rules for glass products under different fluxes, the sub-classification under different chemical compositions, and the test and rationality analysis of the hyper-parameter K. The research can provide theoretical support for the protection and restoration of ancient glass relics.
Compositional data, which carry relative information, are a crucial aspect of machine learning and related fields. They are typically recorded as closed data, that is, data that sum to a constant such as 100%. The linear regression model is among the most widely used statistical techniques for identifying relationships between variables of interest. When estimating linear regression parameters, which are useful for tasks such as future prediction and partial effects analysis of independent variables, maximum likelihood estimation (MLE) is the method of choice. However, data quality is a significant challenge in machine learning, especially when data are missing: many datasets contain missing observations, and recovering them can be costly and time-consuming. To address this issue, the expectation-maximization (EM) algorithm has been suggested for situations involving missing data. The EM algorithm iteratively finds maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models that depend on unobserved variables or data. Using the current parameter estimate as input, the expectation (E) step constructs the expected log-likelihood function. The maximization (M) step then finds the parameters that maximize the expected log-likelihood determined in the E step. This study examined how well the EM algorithm performed on a simulated compositional dataset with missing observations, using both robust least squares and ordinary least squares regression. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-Nearest Neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
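The E and M steps described in the abstract above can be illustrated on a deliberately simple case. The sketch below is not the study's procedure: it assumes an ordinary linear model with Gaussian noise in which only some responses are missing (marked as NaN), and the function name `em_linear_regression` and its parameters are hypothetical.

```python
import numpy as np

def em_linear_regression(X, y, n_iter=200):
    """EM estimates for y = X @ beta + Gaussian noise when some y are NaN."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    miss = np.isnan(y)
    n = len(y)
    beta = np.zeros(X.shape[1])      # crude starting values (assumption)
    sigma2 = np.nanvar(y)
    for _ in range(n_iter):
        # E step: replace each missing response by its conditional mean.
        y_filled = np.where(miss, X @ beta, y)
        # M step: least squares on the completed responses.
        beta_new, *_ = np.linalg.lstsq(X, y_filled, rcond=None)
        resid = y_filled - X @ beta_new
        # Each missing response also contributes its conditional variance.
        sigma2 = (resid @ resid + miss.sum() * sigma2) / n
        beta = beta_new
    return beta, sigma2
```

In this simple setting the iteration converges to the least-squares fit on the rows with observed responses, which makes the sketch easy to check against a complete-case fit.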
In this paper, based on the theory of parameter estimation, we give a selection method that we consider very reasonable in the sense that the resulting parameter estimates have good properties. Moreover, we offer a method for calculating the selection statistic and an applied example.
In this paper we apply nonlinear time series analysis to small-time scale traffic measurement data. A prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, a local support vector machine prediction method is used to predict the traffic measurement data, and a BIC-based neighbouring point selection method is used to choose the number of nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method with optimized neighbouring points can effectively predict the small-time scale traffic measurement data and can reproduce the statistical features of real traffic measurements.
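The phase-space reconstruction step mentioned in the abstract above can be sketched as a delay embedding followed by a nearest-neighbour lookup. This is a minimal illustration, not the paper's method: the embedding dimension `dim`, the delay `tau` and the neighbour count `k` are taken as given inputs here, whereas the paper selects them with a prediction-based method and a BIC-based rule, and the local support vector machine itself is not implemented. Both function names are hypothetical.

```python
import numpy as np

def delay_embed(series, dim, tau=1):
    """Rows are delay vectors [x[t], x[t+tau], ..., x[t+(dim-1)*tau]]."""
    x = np.asarray(series, dtype=float)
    n_vectors = len(x) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return np.column_stack([x[k * tau : k * tau + n_vectors] for k in range(dim)])

def nearest_neighbours(vectors, query, k):
    """Indices of the k delay vectors closest to `query` (Euclidean distance)."""
    dist = np.linalg.norm(vectors - query, axis=1)
    return np.argsort(dist)[:k]
```

A local model would then be fitted only on the rows returned by `nearest_neighbours`, using the value that follows each neighbouring delay vector as the regression target.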
Background: Evaluating the growth performance of pigs in real time is laborious and expensive, so mathematical models based on easily accessible variables have been developed. Multiple regression (MR) is the most widely used tool for building prediction models in swine nutrition, while artificial neural network (ANN) models are reported to predict more accurately than MR models. Therefore, this study evaluated the potential of ANN models for predicting the growth performance of pigs and compared them with MR models. Results: Body weight (BW), net energy (NE) intake, standardized ileal digestible lysine (SID Lys) intake, and their quadratic terms were selected from 10 candidate variables as input variables to predict average daily gain (ADG) and feed-to-gain ratio (F/G). In the training phase, MR models showed high accuracy in both ADG and F/G prediction (R² = 0.929 for ADG, R² = 0.886 for F/G), while ANN models with 4 and 6 neurons and a radial basis activation function yielded the best performance in ADG and F/G prediction (R² = 0.964 for ADG, R² = 0.932 for F/G). In the testing phase, these ANN models showed better accuracy than the MR models in ADG prediction (CCC: 0.976 vs. 0.861; R²: 0.951 vs. 0.584) and F/G prediction (CCC: 0.952 vs. 0.900; R²: 0.905 vs. 0.821). Meanwhile, over-fitting occurred in the MR models but not in the ANN models. On validation data from the animal trial, ANN models were superior to MR models in both ADG and F/G prediction (P < 0.01). Moreover, the growth stage had a significant effect on the prediction accuracy of the models. Conclusion: Body weight, NE intake and SID Lys intake can be used as input variables to predict the growth performance of growing-finishing pigs, and trained ANN models are more flexible and accurate than MR models. It is therefore promising to use ANN models in related swine nutrition studies in the future.
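The two accuracy measures reported in the abstract above, R² and the concordance correlation coefficient (CCC), can be computed as follows. This is a sketch under assumptions: the abstract does not say which definition of R² was used, so the coefficient-of-determination form is assumed here, and CCC follows Lin's formula with population (1/n) moments. The function names are hypothetical.

```python
import numpy as np

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    ss_res = np.sum((o - p) ** 2)
    ss_tot = np.sum((o - o.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def concordance_ccc(observed, predicted):
    """Lin's concordance correlation coefficient with 1/n moments."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    cov = np.mean((o - o.mean()) * (p - p.mean()))
    return 2.0 * cov / (o.var() + p.var() + (o.mean() - p.mean()) ** 2)
```

Unlike the Pearson correlation, CCC penalizes systematic bias: predictions that are perfectly correlated with the observations but shifted by a constant score below 1.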
In this paper, a class of functional-coefficient regression models is proposed, and an estimation procedure based on locally weighted least squares is suggested. This class of models, with the proposed estimation method, is a powerful means for exploratory data analysis.
Wavelets are applied to detect the jumps in a heteroscedastic regression model. It is shown that the wavelet coefficients of the data have significantly large absolute values across fine scale levels near the jump points. A procedure is then developed to estimate the jump locations and jump heights. All estimators are proved to be consistent.
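The idea that wavelet coefficients are large in absolute value near a jump can be illustrated with a Haar-type contrast. The sketch below is not the paper's estimator: it assumes a single jump, uses one fixed window width `h` instead of several scale levels, and the function names are hypothetical.

```python
import numpy as np

def haar_contrast(x, h):
    """Mean of the next h points minus mean of the previous h points."""
    x = np.asarray(x, dtype=float)
    c = np.concatenate([[0.0], np.cumsum(x)])
    i = np.arange(h, len(x) - h + 1)   # positions with full windows on both sides
    right = (c[i + h] - c[i]) / h
    left = (c[i] - c[i - h]) / h
    return i, right - left

def detect_jump(x, h):
    """Position and signed height of the largest Haar-type contrast."""
    positions, contrast = haar_contrast(x, h)
    k = np.argmax(np.abs(contrast))
    return int(positions[k]), float(contrast[k])
```

The returned position is the index of the first observation after the jump, and the contrast at that position serves as a rough estimate of the jump height.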
In this article, a procedure is defined for estimating the coefficient functions in functional-coefficient regression models in which different coefficient functions have different smoothing variables. In the first step, initial estimates of the coefficient functions are obtained by the local linear technique and the averaging method. In the second step, based on the initial estimates, efficient estimates of the coefficient functions are proposed by a one-step back-fitting procedure. The efficient estimators share the same asymptotic normality as the local linear estimators for functional-coefficient models with a single smoothing variable in all functions. Two simulated examples show that the procedure is effective.
A geometric framework is proposed for semiparametric nonlinear regression models based on the concept of the least favorable curve introduced by Severini and Wong (1992). The authors use this framework to derive three kinds of improved approximate confidence regions for the parameter and parameter subsets in terms of curvatures. The results obtained by Hamilton et al. (1982), Hamilton (1986) and Wei (1994) are extended to semiparametric nonlinear regression models.
Internal solitary wave propagation over a submarine ridge results in energy dissipation, and the hydrodynamic interaction between the wave and the ridge affects the marine environment. This study analyzes the effects of ridge height and potential energy during wave-ridge interaction with binary and cumulative logistic regression models. In testing the global null hypothesis, all p-values are < 0.001 under three statistical tests: likelihood ratio, score, and Wald. Comparing the two kinds of models, the test values obtained by the cumulative logistic regression models are better than those obtained by the binary logistic regression models. Although this study employed a cumulative logistic regression model, three probability functions, p^1, p^2 and p^3, are used to investigate the weighted influence of the factors on wave reflection. Deviance and Pearson tests are applied to check the goodness-of-fit of the proposed model. The analytical results demonstrate that both ridge height (X1) and potential energy (X2) significantly affect (p < 0.0001) the amplitude-based reflected rate; the p-values for the deviance and Pearson tests are both > 0.05 (0.2839 and 0.3438, respectively). That is, the model with ridge height (X1) and potential energy (X2) fits the data adequately and can be used for prediction as the most parsimonious model. Six measures of predictive power (R^2, max-rescaled R^2, Somers' D, Gamma, Tau-a, and c) indicate that the proposed model has better predictive ability than ridge height alone and is very similar to the model with the interaction of ridge height and potential energy. It can be concluded that the goodness-of-fit and predictive ability of the cumulative logistic regression model are better than those of the binary logistic regression model.
Recently, many regression models have been presented for predicting the mechanical parameters of rocks from rock index properties. Although statistical analysis is a common method for developing regression models, selecting a suitable transformation of the independent variables in a regression model is still difficult. In this paper, a genetic algorithm (GA) is employed as a heuristic search method for selecting the best transformation of the independent variables (some index properties of rocks) in regression models for predicting uniaxial compressive strength (UCS) and modulus of elasticity (E). First, multiple linear regression (MLR) analysis was performed on a data set to establish predictive models. Then, two GA models were developed in which the root mean squared error (RMSE) was defined as the fitness function. The results show that the GA models are more precise than the MLR models and can explain the relation between the intrinsic strength/elasticity properties and the index properties of rocks with a simple formulation and acceptable accuracy.
The analysis of numerous experimental equations published in the literature reveals a wide scatter in the predictions for the static recrystallization kinetics of steels. The powers of the deformation variables, strain and strain rate, as well as the power of the grain size, vary among these equations. These differences are highlighted, and the typical values are compared between torsion and compression tests. Potential errors in physical simulation testing are discussed.
Combining a linear regression and a temperature budget formula, a multivariate regression model is proposed to parameterize and estimate sea surface temperature (SST) cooling induced by tropical cyclones (TCs). Three major dynamic and thermodynamic processes governing the TC-induced SST cooling (SSTC), namely vertical mixing, upwelling and heat flux, are parameterized empirically using a combination of multiple atmospheric and oceanic variables: sea surface height (SSH), wind speed, wind curl, TC translation speed and surface net heat flux. The regression model fits reasonably well with 10-year statistical observations/reanalysis data obtained from 100 selected TCs in the northwestern Pacific during 2001–2010, with an averaged fitting error of 0.07 and a mean absolute error of 0.72°C between diagnostic and observed SST cooling. The results reveal that vertical mixing is overall the predominant process producing ocean SST cooling, accounting for 55% of the total cooling. Upwelling accounts for 18% of the total cooling, and its maximum occurs near the TC center, associated with TC-induced Ekman pumping. The surface heat flux accounts for 26% of the total cooling, and its contribution increases towards the tropics and the continental shelf. The ocean thermal structure, represented by the SSH in the regression model, plays an important role in modulating the SST cooling pattern. The concept of the regression model can be applied in TC weather prediction models to improve SST parameterization schemes.
In this paper we consider the empirical Bayes (EB) estimation problem for an estimable function of the regression coefficient in a multiple linear regression model Y = Xβ + e, where e, given β, has a multivariate standard normal distribution. We obtain the EB estimators by using kernel estimation of the multivariate density function and its first-order partial derivatives. It is shown that the convergence rates of the EB estimators are … under the condition …, where k > 1 is an integer, … is an arbitrarily small number and m is the dimension of the vector Y.
Drug use (DU), particularly injecting drug use (IDU), has been the main route of transmission and spread of Human Immunodeficiency Virus (HIV)/Acquired Immune Deficiency Syndrome (AIDS) among injecting drug users (IDUs)[1]. Previous studies have shown that sharing needles or cotton during drug injection is a major risk factor for HIV/AIDS transmission at the personal level[2-4]. Because HIV/AIDS is a social behavioral issue, its risk factors extend far beyond the personal level. Therefore, studies on HIV/AIDS related risk factors should focus not only on individual factors but also on the association between HIV/AIDS cases and macroscopic factors, such as economic status, transportation and health care services[1]. The impact of macroscopic factors on HIV/AIDS status might be either positive or negative, potentially reflected in promoting, delaying or detecting HIV/AIDS epidemics.
Funding (remaining useful life prediction study): Supported by the National Natural Science Foundation of China (61703410, 61873175, 62073336, 61873273, 61773386, 61922089).
Funding (quantum fuzzy regression study): This work is supported by the National Natural Science Foundation of China (No. 62076042); the Key Research and Development Project of Sichuan Province (Nos. 2021YFSY0012, 2020YFG0307, 2021YFG0332); the Science and Technology Innovation Project of Sichuan (No. 2020017); the Key Research and Development Project of Chengdu (No. 2019-YF05-02028-GX); the Innovation Team of Quantum Security Communication of Sichuan Province (No. 17TD0009); and the Academic and Technical Leaders Training Funding Support Projects of Sichuan Province (No. 2016120080102643).
Funding (social network propagation study): This work was supported by the 2021 Project of the "14th Five-Year Plan" of Shaanxi Education Science, "Research on the Application of Educational Data Mining in Applied Undergraduate Teaching: Taking the Course of 'Computer Application Technology' as an Example" (SGH21Y0403); the Teaching Reform and Research Projects for Practical Teaching in 2022, "Research on Practical Teaching of Applied Undergraduate Projects Based on 'Combination of Courses and Certificates': Taking Computer Application Technology Courses as an Example" (SJJG02012); and the 11th batch of the Teaching Reform Research Project of Xi'an Jiaotong University City College, "Project-Driven Cultivation and Research on Information Literacy of Applied Undergraduate Students in the Information Times: Taking Computer Application Technology Course Teaching as an Example" (111001).
Funding: Supported by the Natural Science Foundation of Anhui Education Committee.
Abstract: In this paper, based on the theory of parameter estimation, we give a selection method that we consider reasonable in the sense that it leads to parameter estimates with good properties. Moreover, we offer a method for calculating the selection statistic and an applied example.
Funding: Project supported by the National Natural Science Foundation of China (Grant No 60573065), the Natural Science Foundation of Shandong Province, China (Grant No Y2007G33), and the Key Subject Research Foundation of Shandong Province, China (Grant No XTD0708).
Abstract: In this paper we apply nonlinear time series analysis to small-time scale traffic measurement data. A prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, the local support vector machine prediction method is used to predict the traffic measurement data, and a BIC-based neighbouring point selection method is used to choose the number of nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method with an optimized number of neighbouring points can effectively predict the small-time scale traffic measurement data and can reproduce the statistical features of real traffic measurements.
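The phase-space reconstruction and local prediction described in the abstract above can be sketched as follows. Note the substitution: the paper fits a local support vector machine to the nearest neighbours, whereas this sketch simply averages the neighbours' successors. The function names, the delay `tau`, and the toy series are assumptions for illustration.

```python
def delay_embed(series, dim, tau=1):
    """Delay vectors [x_t, x_(t-tau), ..., x_(t-(dim-1)*tau)] for every valid t."""
    start = (dim - 1) * tau
    return [[series[t - j * tau] for j in range(dim)]
            for t in range(start, len(series))]


def local_predict(series, dim, k, tau=1):
    """Predict the next value from the k nearest neighbours of the last delay vector.

    A local averaging model stands in here for the paper's local SVM regression.
    """
    start = (dim - 1) * tau
    vectors = delay_embed(series, dim, tau)
    query = vectors[-1]
    candidates = []
    for i, vector in enumerate(vectors[:-1]):
        distance = sum((a - b) ** 2 for a, b in zip(vector, query))
        successor = series[start + i + 1]  # value that followed this vector
        candidates.append((distance, successor))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(successor for _, successor in nearest) / len(nearest)


# Hypothetical periodic toy series standing in for traffic measurements.
toy_series = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1]
prediction = local_predict(toy_series, dim=2, k=2)
```

In the paper the number of neighbours `k` is chosen by a BIC-based criterion; here it is simply passed in by hand.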
Funding: Funded by the National Natural Science Foundation of China (32072764, 31702121), the 2115 Talent Development Program of China Agricultural University, and the National Key Research and Development Program of China (2019YFD1002605).
Abstract: Background: Evaluating the growth performance of pigs in real time is laborious and expensive, so mathematical models based on easily accessible variables have been developed. Multiple regression (MR) is the most widely used tool for building prediction models in swine nutrition, while artificial neural network (ANN) models are reported to predict more accurately than MR models. Therefore, the potential of ANN models in predicting the growth performance of pigs was evaluated and compared with MR models in this study. Results: Body weight (BW), net energy (NE) intake, standardized ileal digestible lysine (SID Lys) intake, and their quadratic terms were selected from 10 candidate variables as input variables to predict ADG and F/G. In the training phase, MR models showed high accuracy in both ADG and F/G prediction (R^2 = 0.929 for ADG and 0.886 for F/G), while ANN models with 4 and 6 neurons and a radial basis activation function yielded the best performance in ADG and F/G prediction (R^2 = 0.964 for ADG and 0.932 for F/G). In the testing phase, these ANN models showed better accuracy than the MR models in ADG prediction (CCC: 0.976 vs. 0.861; R^2: 0.951 vs. 0.584) and F/G prediction (CCC: 0.952 vs. 0.900; R^2: 0.905 vs. 0.821). Meanwhile, over-fitting occurred in the MR models but not in the ANN models. On validation data from the animal trial, ANN models were superior to MR models in both ADG and F/G prediction (P < 0.01). Moreover, growth stage had a significant effect on the prediction accuracy of the models. Conclusion: Body weight, NE intake, and SID Lys intake can be used as input variables to predict the growth performance of growing-finishing pigs, and trained ANN models are more flexible and accurate than MR models. Therefore, ANN models are promising for related swine nutrition studies in the future.
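The concordance correlation coefficient (CCC) reported in the abstract above can be computed as in the following sketch, which uses Lin's standard definition with population (divide-by-n) moments. The function name and the toy numbers are assumptions, not data from the study.

```python
def concordance_cc(observed, predicted):
    """Lin's concordance correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    # The squared mean difference penalises systematic bias in the predictions.
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


# Hypothetical example: predictions are perfectly correlated but biased by +1.
ccc_biased = concordance_cc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
ccc_perfect = concordance_cc([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

Unlike the Pearson correlation, which equals 1 in both cases, the CCC falls to 4/7 for the biased predictions, which is why it is a useful complement to R^2 when comparing prediction models.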
Abstract: In this paper, a class of functional-coefficient regression models is proposed and an estimation procedure based on locally weighted least squares is suggested. This class of models, with the proposed estimation method, is a powerful means for exploratory data analysis.
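For orientation, a functional-coefficient regression model and a locally weighted least-squares criterion of the kind the abstract above refers to are commonly written as below. The notation (smoothing variable $U$, kernel $K$, bandwidth $h$) and the choice of a local linear fit are assumptions; the abstract does not state the paper's exact formulation.

```latex
% Functional-coefficient regression model (notation assumed for illustration)
Y = \sum_{j=1}^{p} a_j(U)\, X_j + \varepsilon,
\qquad E(\varepsilon \mid U, X_1, \dots, X_p) = 0 .

% Local linear weighted least squares at a point u_0, with K_h(\cdot) = K(\cdot/h)/h
\{(\hat a_j, \hat b_j)\}_{j=1}^{p}
  = \arg\min_{\{(a_j,\, b_j)\}}
    \sum_{i=1}^{n}
    \Bigl[\, Y_i - \sum_{j=1}^{p} \bigl\{ a_j + b_j (U_i - u_0) \bigr\} X_{ij} \Bigr]^2
    K_h(U_i - u_0),
\qquad \hat a_j(u_0) = \hat a_j .
```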
Abstract: Wavelets are applied to detect the jumps in a heteroscedastic regression model. It is shown that the wavelet coefficients of the data have significantly large absolute values across fine scale levels near the jump points. A procedure is then developed to estimate the jumps and jump heights. All estimators are proved to be consistent.
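The idea that wavelet coefficients are large near jump points can be illustrated with a Haar-type contrast, as in the sketch below. This is a simplified, noise-free illustration and not the paper's estimator; the window half-width `m`, the function names, and the toy signal are assumptions.

```python
def haar_contrast(y, t, m):
    """Mean of the m points from index t on, minus the mean of the m points before t.

    Up to a normalising constant this is a Haar wavelet coefficient at scale 2*m.
    """
    right = sum(y[t:t + m]) / m
    left = sum(y[t - m:t]) / m
    return right - left


def detect_jump(y, m):
    """Return (location, height) of the largest Haar-type contrast in y."""
    candidates = range(m, len(y) - m + 1)
    location = max(candidates, key=lambda t: abs(haar_contrast(y, t, m)))
    return location, haar_contrast(y, location, m)


# Hypothetical noise-free signal with a single jump of height 3 at index 10.
signal = [0.0] * 10 + [3.0] * 10
location, height = detect_jump(signal, m=4)
```

With noisy, heteroscedastic data the contrast would have to be compared against a threshold that reflects the local noise level, which is the part the paper's theory addresses.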
Abstract: In this article, a procedure is defined for estimating the coefficient functions in functional-coefficient regression models in which different coefficient functions have different smoothing variables. In the first step, initial estimates of the coefficient functions are obtained by the local linear technique and the averaging method. In the second step, based on the initial estimates, efficient estimates of the coefficient functions are obtained by a one-step back-fitting procedure. The efficient estimators share the same asymptotic normality as the local linear estimators for functional-coefficient models with a single smoothing variable in all coefficient functions. Two simulated examples show that the procedure is effective.
Abstract: A geometric framework is proposed for semiparametric nonlinear regression models based on the concept of the least favorable curve, introduced by Severini and Wong (1992). The authors use this framework to derive three kinds of improved approximate confidence regions for the parameter and parameter subsets in terms of curvatures. The results obtained by Hamilton et al. (1982), Hamilton (1986) and Wei (1994) are extended to semiparametric nonlinear regression models.
Funding: This paper was financially supported by NSC96-2628-E-366-004-MY2 and NSC96-2628-E-132-001-MY2.
Abstract: Internal solitary wave propagation over a submarine ridge results in energy dissipation, and the hydrodynamic interaction between the wave and the ridge affects the marine environment. This study analyzes the effects of ridge height and potential energy during wave-ridge interaction with binary and cumulative logistic regression models. In testing the global null hypothesis, all three statistical tests (likelihood ratio, score, and Wald) give p < 0.001. Comparing the two kinds of models, the test values obtained with the cumulative logistic regression models are better than those obtained with the binary logistic regression models. With the cumulative logistic regression model, three probability functions, p^1, p^2 and p^3, are used to investigate the weighted influence of the factors on wave reflection. Deviance and Pearson tests are applied to check the goodness-of-fit of the proposed model. The analytical results demonstrate that both ridge height (X1) and potential energy (X2) significantly affect (p < 0.0001) the amplitude-based reflected rate; the p-values for the deviance and Pearson tests are both > 0.05 (0.2839 and 0.3438, respectively). That is, the model with ridge height (X1) and potential energy (X2) fits the data adequately and can be used for prediction as the best parsimonious model. Investigation of six measures of predictive power (R^2, max-rescaled R^2, Somers' D, Gamma, Tau-a, and c) indicates that the proposed model has better predictive ability than the model with ridge height alone and is very similar to the model with the interaction of ridge height and potential energy. It can be concluded that the goodness-of-fit and prediction ability of the cumulative logistic regression model are better than those of the binary logistic regression model.
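As a point of reference for the abstract above, a cumulative (proportional-odds) logistic model with the two predictors can be written in the generic textbook form below. This is not the paper's fitted model: the assumption of three ordered response categories (suggested by the three probability functions), the intercepts $\alpha_j$, the slopes $\beta_1, \beta_2$, and the sign convention are all illustrative.

```latex
% Cumulative logit model with ridge height X_1 and potential energy X_2
\log\frac{P(Y \le j \mid X_1, X_2)}{1 - P(Y \le j \mid X_1, X_2)}
  = \alpha_j + \beta_1 X_1 + \beta_2 X_2,
\qquad j = 1, 2, \quad \alpha_1 \le \alpha_2 .

% Category probabilities, with F(z) = 1/(1 + e^{-z}) and \eta = \beta_1 X_1 + \beta_2 X_2
p^{1} = F(\alpha_1 + \eta), \qquad
p^{2} = F(\alpha_2 + \eta) - F(\alpha_1 + \eta), \qquad
p^{3} = 1 - F(\alpha_2 + \eta).
```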
Abstract: Recently, many regression models have been presented for predicting the mechanical parameters of rocks from rock index properties. Although statistical analysis is a common method for developing regression models, selecting a suitable transformation of the independent variables in a regression model is still difficult. In this paper, a genetic algorithm (GA) has been employed as a heuristic search method for selecting the best transformation of the independent variables (some index properties of rocks) in regression models for predicting uniaxial compressive strength (UCS) and modulus of elasticity (E). First, multiple linear regression (MLR) analysis was performed on a data set to establish predictive models. Then, two GA models were developed in which root mean squared error (RMSE) was defined as the fitness function. The results show that the GA models are more precise than the MLR models and are able to explain the relation between the intrinsic strength/elasticity properties and the index properties of rocks with a simple formulation and acceptable accuracy.
Abstract: The analysis of numerous experimental equations published in the literature reveals a wide scatter in the predictions for the static recrystallization kinetics of steels. The powers of the deformation variables, strain and strain rate, as well as the power of the grain size, vary in these equations. These differences are highlighted and the typical values are compared between torsion and compression tests. Potential errors in physical simulation testing are discussed.
Funding: The Major National Basic Research Development Program of China under contract No. 2016YFA0202704; the National Natural Science Foundation of China under contract Nos 41476008 and 41576018; the Basic Fund of Chinese Academy of Meteorological Sciences under contract No. 2017Z017; the Strategic Priority Research Program of the Chinese Academy of Sciences under contract No. XDA11010303.
Abstract: Combining a linear regression and a temperature budget formula, a multivariate regression model is proposed to parameterize and estimate sea surface temperature (SST) cooling induced by tropical cyclones (TCs). Three major dynamic and thermodynamic processes governing the TC-induced SST cooling (SSTC), namely vertical mixing, upwelling and heat flux, are parameterized empirically using a combination of multiple atmospheric and oceanic variables: sea surface height (SSH), wind speed, wind curl, TC translation speed and surface net heat flux. The regression model fits reasonably well with 10-year statistical observations/reanalysis data obtained from 100 selected TCs in the northwestern Pacific during 2001–2010, with an averaged fitting error of 0.07 and a mean absolute error of 0.72°C between diagnostic and observed SST cooling. The results reveal that vertical mixing is overall the predominant process producing ocean SST cooling, accounting for 55% of the total cooling. Upwelling accounts for 18% of the total cooling and its maximum occurs near the TC center, associated with TC-induced Ekman pumping. The surface heat flux accounts for 26% of the total cooling, and its contribution increases towards the tropics and the continental shelf. The ocean thermal structure, represented by the SSH in the regression model, plays an important role in modulating the SST cooling pattern. The concept of the regression model can be applied in TC weather prediction models to improve SST parameterization schemes.
Abstract: In this paper we consider the empirical Bayes (EB) estimation problem for an estimable function of the regression coefficient in a multiple linear regression model Y = Xβ + e, where e, given β, has a multivariate standard normal distribution. We obtain the EB estimators by using kernel estimation of the multivariate density function and its first-order partial derivatives. It is shown that the convergence rates of the EB estimators are … under the condition …, where k > 1 is an integer, … is an arbitrarily small number, and m is the dimension of the vector Y.
Funding: Supported by the National Scientific Research Mega-Project under the 12th Five-Year Plan of China (2012ZX10001001).
Abstract: Drug use (DU), particularly injecting drug use (IDU), has been the main route of transmission and spread of Human Immunodeficiency Virus (HIV)/Acquired Immune Deficiency Syndrome (AIDS) among injecting drug users (IDUs)[1]. Previous studies have shown that sharing needles or cotton during drug injection is a major risk factor for HIV/AIDS transmission at the personal level[2-4]. Because HIV/AIDS is a social and behavioral issue, its risk factors extend well beyond the personal level. Therefore, studies on HIV/AIDS-related risk factors should focus not only on individual factors but also on the association between HIV/AIDS cases and macroscopic factors, such as economic status, transportation, and health care services[1]. The impact of macroscopic factors on HIV/AIDS status may be either positive or negative, potentially reflected in promoting, delaying, or detecting HIV/AIDS epidemics.