Stable water isotopes are natural tracers for quantifying the contribution of moisture recycling to local precipitation, i.e., the moisture recycling ratio, but different isotope-based models usually lead to different results, which affects the accuracy of local moisture recycling estimates. In this study, a total of 18 stations from four typical areas in China were selected to compare the performance of isotope-based linear and Bayesian mixing models and to determine the local moisture recycling ratio. Among the three vapor sources (advection, transpiration, and surface evaporation), advected vapor usually played the dominant role, and the contribution of surface evaporation was less than that of transpiration. When abnormal values were ignored, the arithmetic averages of the differences between the isotope-based linear and Bayesian mixing models were 0.9% for transpiration, 0.2% for surface evaporation, and −1.1% for advection, and the corresponding medians were 0.5%, 0.2%, and −0.8%. The importance of transpiration was slightly lower in most cases when the Bayesian mixing model was applied, and the contribution of advection was relatively larger. The Bayesian mixing model was found to perform better in determining a feasible solution, since the linear model sometimes produced negative contribution ratios. A sensitivity test with two isotope scenarios indicated that the Bayesian model had relatively low sensitivity to changes in the isotope input, and that it is important to accurately estimate the isotopic composition of precipitation vapor. Overall, the Bayesian mixing model is recommended over the linear model. These findings are useful for understanding the performance of isotope-based linear and Bayesian mixing models under various climatic conditions.
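The three-source apportionment described above can be written as a small linear system: two isotope mass balances (δ¹⁸O and δ²H) plus closure of the fractions. The sketch below uses entirely hypothetical end-member values, and illustrates the abstract's point that the linear model can return a negative (unphysical) contribution, which a Bayesian mixing model would avoid.

```python
import numpy as np

# Hypothetical end-member isotope values (per mil) for the three vapor
# sources: advection, transpiration, surface evaporation (not real data).
A = np.array([
    [-12.0, -20.0, -24.0],    # delta-18O of each source
    [-85.0, -140.0, -185.0],  # delta-2H of each source
    [1.0, 1.0, 1.0],          # closure: the three fractions sum to 1
])
b = np.array([-14.5, -100.0, 1.0])  # observed precipitation vapor + closure

f = np.linalg.solve(A, b)  # fractions: advection, transpiration, evaporation
print(np.round(f, 3))      # the evaporation fraction comes out negative here
```

With these made-up numbers the closed-form solution is (0.625, 0.5, −0.125): the fractions satisfy both mass balances and the closure exactly, yet one is negative, which is precisely the failure mode the Bayesian formulation is designed to rule out.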
In this paper, the nonlinear free vibration behaviors of a piezoelectric semiconductor (PS) doubly-curved shell resting on a Pasternak foundation are studied within the framework of the nonlinear drift-diffusion (NLDD) model and first-order shear deformation theory. The nonlinear constitutive relations are presented, and the strain energy, kinetic energy, and virtual work of the PS doubly-curved shell are derived. Based on Hamilton's principle and the condition of charge continuity, the nonlinear governing equations are obtained and then solved by an efficient iteration method. Several numerical examples show the effects of the nonlinear drift current, elastic foundation parameters, and geometric parameters on the nonlinear vibration frequency and the damping characteristics of the PS doubly-curved shell. The main innovations are that the difference between the linearized drift-diffusion (LDD) model and the NLDD model is revealed, and that an effective method is proposed for selecting a proper initial electron concentration for the LDD model.
In the assessment of car insurance claims, the claim rate presents a highly skewed probability distribution, which is typically modeled using the Tweedie distribution. The traditional approach to fitting a Tweedie regression model involves training on a centralized dataset; when the data are provided by multiple parties, training a privacy-preserving Tweedie regression model without exchanging raw data becomes a challenge. To address this issue, this study introduces a novel vertical federated learning-based Tweedie regression algorithm for multi-party auto insurance rate setting across data silos. The algorithm keeps sensitive data local and uses privacy-preserving techniques to compute the intersection of the entities held by the two data-owning parties. After the shared entities are determined, the participants train the model locally on the shared-entity data to obtain the intermediate parameters of the local generalized linear model. Homomorphic encryption is introduced to exchange and update these intermediate parameters so as to collaboratively complete the joint training of the car insurance rate-setting model. Performance tests on two publicly available datasets show that the proposed federated Tweedie regression algorithm can effectively build Tweedie regression models that leverage the value of data from both parties without exchanging data. The assessment results of the scheme approach those of a Tweedie regression model learned from centralized data, and outperform a Tweedie regression model learned independently by a single party.
Social networks are the mainstream medium of current information dissemination, and it is particularly important to accurately predict their propagation laws. In this paper, we introduce a social network propagation model integrating multiple linear regression and an infectious disease model. First, we proposed features that affect social network propagation along three dimensions. Then, we predicted node influence via multiple linear regression. Finally, we used the node influence as the state-transition rate of the infectious disease model to predict the trend of information dissemination in social networks. Experimental results on a real social network dataset showed that the predictions of the model are consistent with the actual information dissemination trends.
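As a toy illustration of the pipeline described above (regression-predicted node influence feeding the transition rate of an epidemic-style model), the following sketch uses made-up features and coefficients; it is not the paper's fitted model.

```python
import numpy as np

# Hypothetical node features along three dimensions and illustrative
# regression coefficients (not fitted values from the study).
X = np.array([[0.8, 0.3, 0.5],
              [0.2, 0.9, 0.4],
              [0.6, 0.6, 0.7]])
coef = np.array([0.5, 0.3, 0.2])
influence = X @ coef                 # multiple linear regression prediction

beta = influence.mean()              # use mean influence as the S->I rate
S, I = 0.99, 0.01                    # susceptible / informed fractions
curve = []
for _ in range(60):                  # discrete-time SI-style propagation
    new_inf = beta * S * I
    S, I = S - new_inf, I + new_inf
    curve.append(I)
print(round(curve[-1], 3))           # fraction informed at the end
```

The informed fraction follows the familiar S-shaped diffusion curve; swapping in a recovery term would turn this into a full SIR-type model as used in the paper.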
Weighted total least squares (WTLS) has been regarded as the standard tool for the errors-in-variables (EIV) model, in which all elements of the observation vector and the coefficient matrix are contaminated with random errors. However, in many geodetic applications, some elements are error-free and some random observations appear repeatedly in different positions of the augmented coefficient matrix; this is called the linear structured EIV (LSEIV) model. Two kinds of methods are proposed for the LSEIV model, based on functional and stochastic modifications. On the one hand, the functional part of the LSEIV model is modified into the errors-in-observations (EIO) model. On the other hand, the stochastic model is modified by applying the Moore-Penrose inverse of the cofactor matrix. The algorithms are derived through the Lagrange multiplier method and linear approximation. The estimation principles and iterative formulas of the parameters are proven to be consistent. The first-order approximate variance-covariance matrix (VCM) of the parameters is also derived. A numerical example compares the performance of the three proposed algorithms with the structured TLS (STLS) approach. Afterwards, the least squares (LS), total least squares (TLS), and linear structured weighted total least squares (LSWTLS) solutions are compared, and the accuracy evaluation formula is shown to be feasible and effective. Finally, LSWTLS is applied to deformation analysis, where it yields better results than the traditional LS and TLS estimations.
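For readers unfamiliar with the TLS family, a minimal (unweighted, unstructured) total least squares solve via the SVD of the augmented matrix [A | b] is sketched below on synthetic data; the WTLS and structured variants in the paper generalize this by weighting the errors and by respecting the fixed and repeated elements.

```python
import numpy as np

rng = np.random.default_rng(0)
x_true = np.array([2.0, -1.0])
A_clean = rng.normal(size=(200, 2))
A = A_clean + 0.01 * rng.normal(size=A_clean.shape)  # noisy coefficient matrix
b = A_clean @ x_true + 0.01 * rng.normal(size=200)   # noisy observations

# Ordinary LS treats A as exact; TLS treats both A and b as noisy.
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]

# TLS solution from the smallest right singular vector of [A | b].
_, _, Vt = np.linalg.svd(np.column_stack([A, b]))
v = Vt[-1]
x_tls = -v[:2] / v[2]
print(np.round(x_ls, 3), np.round(x_tls, 3))
```

With noise this small both solutions sit close to the true parameters; the TLS correction becomes important as the coefficient-matrix noise grows relative to the signal.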
The diameter distribution function (DDF) is a crucial tool for accurately predicting stand carbon storage (CS). A key open issue, however, is how to construct a high-precision DDF based on stand factors, site quality, and an aridity index to predict stand CS in multi-species mixed forests with complex structures. This study used data from 70 survey plots of mixed broadleaf Populus davidiana and Betula platyphylla forests in the Mulan Rangeland State Forest, Hebei Province, China, to construct DDFs based on maximum likelihood estimation (MLE) and the finite mixture model (FMM). Ordinary least squares (OLS), linear seemingly unrelated regression (LSUR), and a back-propagation neural network (BPNN) were used to investigate the influences of stand factors, site quality, and the aridity index on the shape and scale parameters of the DDF and on the predicted stand CS of mixed broadleaf forests. The results showed that the FMM accurately described the stand-level diameter distribution of the mixed P. davidiana and B. platyphylla forests, whereas the Weibull function constructed by MLE was more accurate for species-level diameter distributions. The combined variable of quadratic mean diameter (Dq), stand basal area (BA), and site quality improved the accuracy of the shape parameter models of the FMM; the combined variable of Dq, BA, and the De Martonne aridity index improved the accuracy of the scale parameter models. Compared with OLS and LSUR, the BPNN had higher accuracy in the re-parameterization of the FMM. OLS, LSUR, and BPNN overestimated the CS of P. davidiana but underestimated the CS of B. platyphylla in the large diameter classes (DBH ≥ 18 cm). The BPNN accurately estimated both stand- and species-level CS, but was more suitable for estimating stand-level CS, thereby providing a scientific basis for optimizing stand structure and assessing carbon sequestration capacity in mixed broadleaf forests.
Effort estimation plays a crucial role in software development projects, aiding resource allocation, project planning, and risk management. Traditional estimation techniques often struggle to provide accurate estimates due to the complex nature of software projects. In recent years, machine learning approaches have shown promise in improving the accuracy of effort estimation models. This study proposes a hybrid model that combines Long Short-Term Memory (LSTM) and Random Forest (RF) algorithms to enhance software effort estimation, taking advantage of the strengths of both. To evaluate its performance, an extensive set of software development projects is used as the experimental dataset. The experimental results demonstrate that the proposed hybrid model outperforms traditional estimation techniques in accuracy and reliability. The integration of LSTM and RF enables the model to capture both temporal dependencies and non-linear interactions in the software development data, allowing project managers and stakeholders to make more precise predictions of the effort needed for upcoming projects.
Adaptive fractional polynomial modeling of general correlated outcomes is formulated to address nonlinearity in means, variances/dispersions, and correlations. Means and variances/dispersions are modeled using general...Adaptive fractional polynomial modeling of general correlated outcomes is formulated to address nonlinearity in means, variances/dispersions, and correlations. Means and variances/dispersions are modeled using generalized linear models in fixed effects/coefficients. Correlations are modeled using random effects/coefficients. Nonlinearity is addressed using power transforms of primary (untransformed) predictors. Parameter estimation is based on extended linear mixed modeling generalizing both generalized estimating equations and linear mixed modeling. Models are evaluated using likelihood cross-validation (LCV) scores and are generated adaptively using a heuristic search controlled by LCV scores. Cases covered include linear, Poisson, logistic, exponential, and discrete regression of correlated continuous, count/rate, dichotomous, positive continuous, and discrete numeric outcomes treated as normally, Poisson, Bernoulli, exponentially, and discrete numerically distributed, respectively. Example analyses are also generated for these five cases to compare adaptive random effects/coefficients modeling of correlated outcomes to previously developed adaptive modeling based on directly specified covariance structures. Adaptive random effects/coefficients modeling substantially outperforms direct covariance modeling in the linear, exponential, and discrete regression example analyses. It generates equivalent results in the logistic regression example analyses and it is substantially outperformed in the Poisson regression case. Random effects/coefficients modeling of correlated outcomes can provide substantial improvements in model selection compared to directly specified covariance modeling. 
However, directly specified covariance modeling can generate competitive or substantially better results in some cases while usually requiring less computation time.
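A minimal sketch of the fractional-polynomial idea (power transforms of an untransformed predictor, with the power chosen by fit quality) is shown below on synthetic data; the adaptive search described above scores candidates with likelihood cross-validation rather than the raw residual sum of squares used here.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0.5, 5.0, size=200)
y = 3.0 / x + rng.normal(scale=0.1, size=200)   # true power is -1

# Standard fractional-polynomial power set; p = 0 denotes log(x).
powers = [-2, -1, -0.5, 0, 0.5, 1, 2, 3]

def fit_rss(p):
    z = np.log(x) if p == 0 else x ** p
    X = np.column_stack([np.ones_like(z), z])   # intercept + transformed predictor
    _, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
    return rss[0]

best = min(powers, key=fit_rss)
print(best)   # the true power for this synthetic data is -1
```

The adaptive procedure extends this one-dimensional search heuristically across multiple predictors and across the mean, dispersion, and correlation components of the model.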
Sunshine duration (S) based empirical equations have been employed in this study to estimate the daily global solar radiation on a horizontal surface (G) for six meteorological stations in Burundi. Those equations include the Ångström-Prescott linear model and four of its derivatives, i.e., logarithmic, exponential, power, and quadratic functions. Monthly mean values of daily global solar radiation and sunshine duration data over periods of 20 to 23 years, from the Geographical Institute of Burundi (IGEBU), have been used. For each of the six stations, ten single or double linear regressions have been developed from the above five functions to relate, in terms of monthly mean values, the daily clearness index (G/G_0) to each of two kinds of relative sunshine duration (RSD): S/S_0 and S/S'_0. In those ratios, G_0, S_0, and S'_0 stand for the extraterrestrial daily solar radiation on a horizontal surface, the day length, and the modified day length accounting for the natural horizon of the site, respectively. According to the calculated mean values of the clearness index and the RSD, each station experiences a high number of fairly clear (or partially cloudy) days. Estimated values of the dependent variable (y) in each developed linear regression have been compared to measured values in terms of the coefficients of correlation (R) and determination (R²), the mean bias error (MBE), the root mean square error (RMSE), and the t-statistic. Mean values of these statistical indicators have been used to rank, in decreasing order of performance, first the ten developed equations per station over the six stations, and second the six stations over the ten equations.
The values of those indicators obtained for all sixty developed equations lead to assert that any of the sixty developed linear regressions (and thus the equations in terms of G/G_0 and the RSD) fits the measured data very adequately and may be used to estimate the monthly average daily global solar radiation from sunshine duration at the relevant station. It is also found that one of the two RSD formulations is slightly more advantageous than the other for estimating the monthly average daily clearness index. Moreover, the values of the statistical indicators in this study match adequately those reported in other works on the same kinds of empirical equations.
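The Ångström-Prescott regression and the error statistics used above can be sketched as follows; the twelve monthly values below are invented for illustration, not Burundi data.

```python
import numpy as np

# Hypothetical monthly means of relative sunshine duration S/S0 and
# clearness index G/G0 (illustrative values only).
rsd = np.array([0.45, 0.50, 0.55, 0.48, 0.60, 0.52,
                0.58, 0.62, 0.47, 0.53, 0.57, 0.49])
kt = 0.25 + 0.50 * rsd + np.random.default_rng(2).normal(scale=0.01, size=12)

# Angstrom-Prescott: G/G0 = a + b * (S/S0), fitted by linear regression.
b, a = np.polyfit(rsd, kt, 1)            # polyfit returns slope first
pred = a + b * rsd
mbe = np.mean(pred - kt)                 # mean bias error
rmse = np.sqrt(np.mean((pred - kt) ** 2))
print(round(a, 3), round(b, 3), round(rmse, 4))
```

The MBE of an intercept-containing least-squares fit is zero on the training data by construction, which is why the ranking in such studies leans on RMSE, R, and the t-statistic computed on independent or pooled data.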
A Bianchi type-V space-time is considered with a linear equation of state in the scalar-tensor theory of gravitation proposed by Brans and Dicke. We use the assumption of a constant deceleration parameter and a power-law relation between the scalar field φ and the scale factor R to find the solutions. Some physical and kinematical properties of the model are also discussed.
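For reference, the constant deceleration parameter assumption mentioned above fixes the scale factor up to integration constants; a standard sketch (with c_1, c_2 as integration constants) is:

```latex
q \;=\; -\frac{R\,\ddot{R}}{\dot{R}^{2}} \;=\; \text{const.}
\qquad\Longrightarrow\qquad
R(t) \;=\; \bigl(c_{1}\,t + c_{2}\bigr)^{\frac{1}{1+q}}, \qquad q \neq -1,
```

with exponential expansion in the limiting case q = -1. Combined with the power-law ansatz φ ∝ R^n, this determines the Brans-Dicke field once the exponent n is fixed.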
The dynamic viscoelastic properties of asphalt AC-20 and its composites with organo-montmorillonite clay (OMMt) and SBS were modeled using the empirical Havriliak-Negami (HN) model, based on linear viscoelastic (LVE) theory. The HN parameters α, β, G_0, G_∞, and τ_HN were determined by solving the HN equation across various temperatures and frequencies. The HN model successfully predicted the rheological behavior of the asphalt and its blends within the temperature range of 25˚C - 40˚C. However, deviations occurred between 40˚C and 75˚C, where the glass transition temperatures T_g of the asphalt components and the SBS polymer are located, rendering the HN model ineffective for predicting the dynamic viscoelastic properties of composites containing OMMt under these conditions. Yet the prediction error of the HN model dropped to 2.28% - 2.81% for asphalt and its mixtures at 100˚C, a temperature exceeding the T_g values of both polymer and asphalt, where the mixtures exhibited liquid-like behavior. The exponent α and the relaxation time increased with temperature in all systems. Incorporating OMMt clay into the asphalt blends significantly enhanced the relaxation dynamics of the resulting composites.
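One common parameterization of the HN model for the complex modulus is sketched below; the parameter values are illustrative, not the fitted AC-20 values, and here G_0 is taken as the low-frequency limit and G_∞ as the high-frequency (glassy) limit of this form.

```python
import numpy as np

# Havriliak-Negami: G*(w) = Ginf + (G0 - Ginf) / (1 + (i*w*tau)^alpha)^beta
G0, Ginf = 1.0e3, 1.0e9          # Pa, illustrative limiting moduli
tau, alpha, beta = 1.0e-2, 0.8, 0.6

w = np.logspace(-2, 4, 13)       # angular frequency, rad/s
G_star = Ginf + (G0 - Ginf) / (1.0 + (1j * w * tau) ** alpha) ** beta
storage, loss = G_star.real, G_star.imag
print(storage[0] < storage[-1])  # modulus stiffens toward high frequency
```

The fitting step in the paper amounts to minimizing the misfit between such a curve and the measured storage and loss moduli over frequency at each temperature.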
Compositional data, i.e., relative information recorded as closed data summing to a constant such as 100%, is a crucial data type in machine learning and related fields. The linear regression model is the most widely used statistical technique for identifying relationships between underlying random variables of interest, and maximum likelihood estimation (MLE) is the method of choice for estimating its parameters, which are useful for tasks such as future prediction and analyzing the partial effects of independent variables. However, data quality is a significant challenge, especially when observations are missing, and recovering missing data can be costly and time-consuming. The expectation-maximization (EM) algorithm has been suggested for such situations: it iteratively finds maximum likelihood (or maximum a posteriori, MAP) estimates of parameters in statistical models that depend on unobserved variables or data. Using the current parameter estimate, the expectation (E) step constructs the expected log-likelihood function; the maximization (M) step then finds the parameters that maximize this expected log-likelihood. This study examined how well the EM algorithm performs on a synthetic compositional dataset with missing observations, using both ordinary least squares and robust least squares regression. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-nearest neighbor (k-NN) imputation and mean imputation, in terms of Aitchison distances and covariance.
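A minimal numerical sketch of the E- and M-steps for missing values in a single predictor, under a bivariate normal working model (synthetic data, not the study's compositional dataset), is:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x = rng.normal(2.0, 1.0, n)
y = 1.0 + 1.5 * x + rng.normal(scale=0.5, size=n)
miss = rng.random(n) < 0.3                 # 30% of x is missing at random
obs = ~miss

# EM for a bivariate normal with missing x: the E-step replaces each
# missing x by E[x | y] (and x^2 by its conditional second moment);
# the M-step re-estimates the means and covariances.
mx, my = x[obs].mean(), y.mean()
sxx, syy = x[obs].var(), y.var()
sxy = np.cov(x[obs], y[obs], bias=True)[0, 1]
for _ in range(100):
    slope = sxy / syy
    cond_mean = mx + slope * (y - my)      # E[x | y]
    cond_var = sxx - slope * sxy           # Var[x | y]
    ex = np.where(miss, cond_mean, x)
    ex2 = np.where(miss, cond_mean**2 + cond_var, x**2)
    mx = ex.mean()                         # M-step updates
    sxx = ex2.mean() - mx**2
    sxy = (ex * y).mean() - mx * my
beta = sxy / sxx                           # regression slope of y on x
print(round(beta, 2))
```

Because y is fully observed, its moments need no updating; the loop converges quickly, and the recovered slope should be close to the true value of 1.5 despite 30% missingness.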
In this paper, based on the theory of parameter estimation, we give a selection method that is reasonable in the sense of a desirable property of the parameter estimates. Moreover, we offer a method for calculating the selection statistic, together with an applied example.
The construction method of the background value in the original multi-variable grey model (MGM(1,m)) is improved at the source of its construction errors. The MGM(1,m) with optimized background value is used to eliminate random fluctuations or errors in the observational data of all variables, and a combined prediction model incorporating multiple linear regression is established to improve the simulation and prediction accuracy of the combined model. Finally, a combined model of the MGM(1,2) with optimized background value and binary linear regression is constructed as an example. The results show that the model performs well in both simulation and prediction.
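The role of the background value can be seen in a minimal single-variable GM(1,1) sketch (the MGM(1,m) of the paper couples m such series); the data below are illustrative near-exponential values.

```python
import numpy as np

x0 = np.array([10.0, 10.8, 11.9, 13.1, 14.4])   # raw series (illustrative)
x1 = np.cumsum(x0)                               # accumulated generating operation

# Conventional background value: mean of consecutive accumulated terms.
# This trapezoidal construction is the error source the paper optimizes.
z = 0.5 * (x1[1:] + x1[:-1])
B = np.column_stack([-z, np.ones_like(z)])
a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]  # grey parameters

# Time-response function, then inverse AGO restores the fitted series.
k = np.arange(len(x0))
x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
x0_hat = np.concatenate([[x0[0]], np.diff(x1_hat)])
print(np.round(x0_hat, 2))
```

For near-exponential data like this the trapezoidal background value already fits within a fraction of a percent; the optimized background value matters most when the series grows quickly, where the trapezoid over- or under-shoots the integral of x1.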
The impact of nonlinear stability and instability on the validity of the tangent linear model (TLM) is investigated by comparing its results with those produced by the nonlinear model (NLM) with identical initial perturbations. The evolutions of different initial perturbations superposed on nonlinearly stable and unstable basic flows are examined using two-dimensional quasi-geostrophic models with double periodic-boundary and rigid-boundary conditions. The results indicate that the valid time period of the TLM, during which it can approximate the NLM to a given accuracy, varies with the magnitude of the perturbations and with the nonlinear stability or instability of the basic flow. The larger the magnitude of the perturbation, the shorter the valid time period; the more nonlinearly unstable the basic flow, the shorter the valid time period of the TLM. With the double-periodic condition, the valid period of the TLM is shorter than with the rigid-boundary condition. Key words: nonlinear stability and instability; tangent linear model (TLM); validity. This work was supported by the National Key Basic Research Project "Research on the Formation Mechanism and Prediction Theory of Severe Synoptic Disasters in China" (No. G1998040910) and the National Natural Science Foundation of China (Nos. 49775262 and 49823002).
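The notion of a TLM valid time period can be illustrated with a one-dimensional toy model: integrate a nonlinear equation and its tangent linear model along the basic trajectory, and compare the perturbation evolutions for a small and a large initial perturbation. This is a schematic sketch, not the quasi-geostrophic setup of the paper.

```python
import numpy as np

DT, STEPS = 0.01, 300

def nlm(x0):
    """Nonlinear model dx/dt = x(1 - x), forward Euler."""
    traj = [x0]
    for _ in range(STEPS):
        traj.append(traj[-1] + DT * traj[-1] * (1 - traj[-1]))
    return np.array(traj)

base = nlm(0.1)                           # basic trajectory
errs = []
for delta in (1e-4, 0.05):
    pert = nlm(0.1 + delta) - base        # nonlinear perturbation evolution
    d, tlm = delta, [delta]
    for xb in base[:-1]:
        d = d + DT * (1 - 2 * xb) * d     # TLM linearized about the basic flow
        tlm.append(d)
    errs.append(np.max(np.abs(pert - np.array(tlm))))
print(errs)  # the TLM error grows sharply with perturbation magnitude
```

The gap between the two error values mirrors the paper's finding: the larger the initial perturbation, the sooner the tangent linear approximation breaks down.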
In order to design a nonlinear controller for small-scale autonomous helicopters, the dynamic characteristics of a model helicopter are investigated, and an integrated nonlinear model of a small-scale helicopter for hovering control is presented. It is proved that the nonlinear system of the small-scale helicopter can be transformed into a linear system using the dynamic feedback linearization technique. Finally, simulations are carried out to validate the nonlinear controller.
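The feedback linearization idea can be seen on a scalar toy system (purely illustrative; the helicopter model in the paper is multivariable and uses dynamic feedback): choosing the input to cancel the nonlinearity leaves exactly linear closed-loop dynamics.

```python
# Toy system x' = -x^3 + u. Choosing u = x^3 + v cancels the cubic term,
# so the closed loop behaves as the linear system x' = v.
def simulate(x0, v_gain=-2.0, dt=0.001, steps=5000):
    x = x0
    for _ in range(steps):
        v = v_gain * x                # linear control law for the linearized system
        u = x ** 3 + v                # feedback-linearizing input
        x = x + dt * (-x ** 3 + u)    # plant step; net dynamics: x' = -2x
    return x

x_final = simulate(2.0)
print(abs(x_final) < 1e-3)            # state driven to the origin
```

Because the cancellation is exact, the closed loop decays like e^(-2t) regardless of the cubic nonlinearity; in practice, model mismatch makes the cancellation imperfect, which is one motivation for the robustness analyses in work like this.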
This paper presents a novel adaptive nonlinear model predictive control design for trajectory tracking of flexible-link manipulators, consisting of feedback linearization, linear model predictive control, and unscented Kalman filtering. Reducing the nonlinear system to a linear one by feedback linearization simplifies the optimization problem of the model predictive controller significantly; however, the reduced system is no longer linear in the presence of parameter uncertainties, which can potentially lead to undesired dynamical behavior. An unscented Kalman filter is used to approximate the dynamics of the prediction model by online parameter estimation, which leads to an adaptation of the optimization problem in each time step and thus to a better prediction and an improved input action. Finally, a detailed fuzzy-arithmetic analysis is performed to quantify the effect of the uncertainties on the control structure and to derive robustness assessments. The control structure is applied to a serial manipulator with two flexible links containing uncertain model parameters and acting in three-dimensional space.
A contact model describing the contact mechanics between the stator and the slider of a standing-wave linear ultrasonic motor is presented. The proposed model starts from the assumption that the vibration characteristics of the stator are not affected by the contact process. A modified friction model is used to analyze the contact problems. First, the dynamic normal contact force, interface friction force, and steady-state characteristics are analyzed. Second, the influences of the contact-layer material, the dynamic characteristics of the stator, and the pre-load on motor performance are simulated. Finally, to validate the contact model, a linear ultrasonic motor based on in-plane modes is used as an example. The corresponding results show that simulations of motor performance based on the proposed contact mechanism are in good agreement with experimental results. The model is helpful for understanding the operating principle of standing-wave linear motors and thus contributes to the design of this type of motor.
This paper proposes a new method for model predictive control (MPC) of nonlinear systems to calculate the stability region and a feasible initial control profile/sequence, both of which are important for implementing MPC. Unlike many existing methods, this paper distinguishes the stability region from the conservative terminal region. With global linearization, linear differential inclusion (LDI), and linear matrix inequality (LMI) techniques, a nonlinear system is transformed into a convex set of linear systems; the vertices of the set are then used off-line to design the controller, to estimate the stability region, and to determine a feasible initial control profile/sequence. The advantages of the proposed method are demonstrated by simulation.
Funding: This study was supported by the National Natural Science Foundation of China (42261008, 41971034) and the Natural Science Foundation of Gansu Province, China (22JR5RA074).
Funding: Project supported by the National Natural Science Foundation of China (Nos. 12172236, 12202289, and U21A20430) and the Science and Technology Research Project of the Hebei Education Department of China (No. QN2022083).
Funding: This research was funded by the National Natural Science Foundation of China (No. 62272124), the National Key Research and Development Program of China (No. 2022YFB2701401), the Guizhou Province Science and Technology Plan Project (Qiankehe Platform Talent [2020]5017), the Research Project of Guizhou University for Talent Introduction (No. [2020]61), the Cultivation Project of Guizhou University (No. [2019]56), and the Open Fund of the Key Laboratory of Advanced Manufacturing Technology, Ministry of Education (GZUAMT2021KF[01]).
基金Supported by the National Natural Science Foundation of China (60404012, 60674064), UK EPSRC (GR/N13319 and GR/R10875), the National High Technology Research and Development Program of China (2007AA04Z193), New Star of Science and Technology of Beijing City (2006A62), and IBM China Research Lab 2007 UR-Program.
基金This work was supported by the 2021 Project of the“14th Five-Year Plan”of Shaanxi Education Science“Research on the Application of Educational Data Mining in Applied Undergraduate Teaching-Taking the Course of‘Computer Application Technology’as an Example”(SGH21Y0403),the Teaching Reform and Research Projects for Practical Teaching in 2022“Research on Practical Teaching of Applied Undergraduate Projects Based on the‘Combination of Courses and Certificates’-Taking Computer Application Technology Courses as an Example”(SJJG02012),and the 11th batch of Teaching Reform Research Projects of Xi’an Jiaotong University City College“Project-Driven Cultivation and Research on Information Literacy of Applied Undergraduate Students in the Information Times-Taking Computer Application Technology Course Teaching as an Example”(111001).
文摘Social networks are the mainstream medium of current information dissemination,and it is particularly important to accurately predict their propagation laws.In this paper,we introduce a social network propagation model integrating multiple linear regression and an infectious disease model.First,we propose features that affect social network communication along three dimensions.Then,we predict node influence via multiple linear regression.Finally,we use the node influence as the state-transition rate of the infectious disease model to predict the trend of information dissemination in social networks.Experimental results on a real social network dataset show that the predictions of the model are consistent with the actual information dissemination trends.
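A minimal sketch of this idea, feeding a regression-predicted influence into an epidemic-style state transition, might look as follows (the feature values, weights, and rates are illustrative assumptions, not values from the paper):

```python
def predict_influence(features, weights, bias):
    """Multiple linear regression: influence = w . x + b."""
    return bias + sum(w * x for w, x in zip(weights, features))

def sir_step(s, i, r, beta, gamma):
    """One discrete SIR update; beta is driven by the predicted node influence."""
    n = s + i + r
    new_inf = beta * s * i / n   # newly infected (informed) this step
    rec = gamma * i              # recovered (lost interest) this step
    return s - new_inf, i + new_inf - rec, r + rec

# Illustrative run: influence from three assumed features, clipped to [0, 1],
# scales the infection (propagation) rate of the epidemic model.
beta = max(0.0, min(1.0, predict_influence([0.4, 0.7, 0.2], [0.5, 0.3, 0.2], 0.05)))
s, i, r = 990.0, 10.0, 0.0
for _ in range(30):
    s, i, r = sir_step(s, i, r, beta, gamma=0.1)
```

The total population is conserved at every step, and the informed/recovered compartments trace the dissemination trend over time.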
基金the financial support of the National Natural Science Foundation of China(Grant No.42074016,42104025,42274057and 41704007)Hunan Provincial Natural Science Foundation of China(Grant No.2021JJ30244)Scientific Research Fund of Hunan Provincial Education Department(Grant No.22B0496)。
文摘Weighted total least squares(WTLS)has been regarded as the standard tool for the errors-in-variables(EIV)model in which all the elements in the observation vector and the coefficient matrix are contaminated with random errors.However,in many geodetic applications,some elements are error-free and some random observations appear repeatedly in different positions in the augmented coefficient matrix.This is called the linear structured EIV(LSEIV)model.Two kinds of methods are proposed for the LSEIV model,based on functional and stochastic modifications.On the one hand,the functional part of the LSEIV model is modified into the errors-in-observations(EIO)model.On the other hand,the stochastic model is modified by applying the Moore-Penrose inverse of the cofactor matrix.The algorithms are derived through the Lagrange multipliers method and linear approximation.The estimation principles and iterative formulas of the parameters are proven to be consistent.The first-order approximate variance-covariance matrix(VCM)of the parameters is also derived.A numerical example is given to compare the performance of the proposed algorithms with the STLS approach.Afterwards,the least squares(LS),total least squares(TLS)and linear structured weighted total least squares(LSWTLS)solutions are compared,and the accuracy evaluation formula is proven to be feasible and effective.Finally,the LSWTLS is applied to the field of deformation analysis,which yields a better result than the traditional LS and TLS estimations.
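For intuition, the simplest unweighted, unstructured TLS case, fitting a straight line with errors in both coordinates, reduces to orthogonal regression with a closed-form slope; the sketch below shows that reduced case only, not the paper's structured LSWTLS algorithm (and it assumes the cross-covariance Sxy is nonzero):

```python
import math

def tls_line(xs, ys):
    """Total least squares (orthogonal regression) fit of y = a*x + c.

    Unlike ordinary least squares, the residuals perpendicular to the line
    are minimized, so errors in both x and y are accounted for. This is
    the elementary unstructured EIV case.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # assumed nonzero
    a = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return a, my - a * mx
```

On noise-free collinear data the TLS and LS solutions coincide; they diverge once both coordinates carry noise, which is the situation the EIV model addresses.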
基金funded by the National Key Research and Development Program of China(No.2022YFD2200503-02).
文摘The diameter distribution function(DDF)is a crucial tool for accurately predicting stand carbon storage(CS).The current key issue,however,is how to construct a high-precision DDF based on stand factors,site quality,and aridity index to predict stand CS in multi-species mixed forests with complex structures.This study used data from 70 survey plots for mixed broadleaf Populus davidiana and Betula platyphylla forests in the Mulan Rangeland State Forest,Hebei Province,China,to construct the DDF based on maximum likelihood estimation(MLE)and the finite mixture model(FMM).Ordinary least squares(OLS),linear seemingly unrelated regression(LSUR),and back propagation neural network(BPNN)were used to investigate the influences of stand factors,site quality,and aridity index on the shape and scale parameters of the DDF and to predict the stand CS of mixed broadleaf forests.The results showed that the FMM accurately described the stand-level diameter distribution of the mixed P.davidiana and B.platyphylla forests,whereas the Weibull function constructed by MLE was more accurate in describing the species-level diameter distribution.The combined variable of quadratic mean diameter(Dq),stand basal area(BA),and site quality improved the accuracy of the shape parameter models of the FMM;the combined variable of Dq,BA,and the De Martonne aridity index improved the accuracy of the scale parameter models.Compared to OLS and LSUR,the BPNN had higher accuracy in the re-parameterization process of the FMM.OLS,LSUR,and BPNN overestimated the CS of P.davidiana but underestimated the CS of B.platyphylla in the large diameter classes(DBH≥18 cm).BPNN accurately estimated stand- and species-level CS,but was more suitable for estimating stand-level CS,thereby providing a scientific basis for the optimization of stand structure and assessment of carbon sequestration capacity in mixed broadleaf forests.
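The species-level Weibull step can be illustrated with the standard maximum-likelihood fixed-point iteration for the two-parameter Weibull; this is a generic textbook sketch (with an assumed damping factor for stability), not the paper's FMM code:

```python
import math

def weibull_mle(data, iters=200):
    """Two-parameter Weibull MLE via the classic fixed-point iteration.

    Solves  1/k = sum(x^k ln x)/sum(x^k) - mean(ln x)  for the shape k,
    then recovers the scale parameter in closed form. A 0.5 damping
    factor (an assumption here) steadies the iteration.
    """
    logs = [math.log(x) for x in data]
    mean_log = sum(logs) / len(data)
    k = 1.0
    for _ in range(iters):
        xk = [x ** k for x in data]
        weighted = sum(v * l for v, l in zip(xk, logs))
        k_new = 1.0 / (weighted / sum(xk) - mean_log)
        k = 0.5 * k + 0.5 * k_new  # damped update
    scale = (sum(x ** k for x in data) / len(data)) ** (1.0 / k)
    return k, scale
```

In a DDF context, `data` would be the diameters (DBH) of one species in a plot, and the fitted shape/scale parameters are what the re-parameterization models (OLS, LSUR, BPNN) then predict from stand factors.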
文摘Effort estimation plays a crucial role in software development projects,aiding in resource allocation,project planning,and risk management.Traditional estimation techniques often struggle to provide accurate estimates due to the complex nature of software projects.In recent years,machine learning approaches have shown promise in improving the accuracy of effort estimation models.This study proposes a hybrid model that combines Long Short-Term Memory(LSTM)and Random Forest(RF)algorithms to enhance software effort estimation.The proposed hybrid model takes advantage of the strengths of both LSTM and RF algorithms.To evaluate the performance of the hybrid model,an extensive set of software development projects is used as the experimental dataset.The experimental results demonstrate that the proposed hybrid model outperforms traditional estimation techniques in terms of accuracy and reliability.The integration of LSTM and RF enables the model to efficiently capture temporal dependencies and non-linear interactions in the software development data.The hybrid model enhances estimation accuracy,enabling project managers and stakeholders to make more precise predictions of effort needed for upcoming software projects.
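Independent of the specific learners, such hybrids are typically combined and scored with simple relative-error machinery; a sketch of a convex prediction blend and the widely used MMRE metric (the 0.5 blend weight is an arbitrary assumption, not the paper's tuning):

```python
def blend(pred_a, pred_b, w=0.5):
    """Convex combination of two models' effort predictions,
    e.g. pred_a from an LSTM and pred_b from a random forest."""
    return [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]

def mmre(actual, predicted):
    """Mean Magnitude of Relative Error, a standard effort-estimation
    metric: mean of |actual - predicted| / actual over all projects."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)
```

Lower MMRE means better estimates; a perfect predictor scores 0. In practice the blend weight would be chosen on a validation split rather than fixed.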
文摘Adaptive fractional polynomial modeling of general correlated outcomes is formulated to address nonlinearity in means, variances/dispersions, and correlations. Means and variances/dispersions are modeled using generalized linear models in fixed effects/coefficients. Correlations are modeled using random effects/coefficients. Nonlinearity is addressed using power transforms of primary (untransformed) predictors. Parameter estimation is based on extended linear mixed modeling generalizing both generalized estimating equations and linear mixed modeling. Models are evaluated using likelihood cross-validation (LCV) scores and are generated adaptively using a heuristic search controlled by LCV scores. Cases covered include linear, Poisson, logistic, exponential, and discrete regression of correlated continuous, count/rate, dichotomous, positive continuous, and discrete numeric outcomes treated as normally, Poisson, Bernoulli, exponentially, and discrete numerically distributed, respectively. Example analyses are also generated for these five cases to compare adaptive random effects/coefficients modeling of correlated outcomes to previously developed adaptive modeling based on directly specified covariance structures. Adaptive random effects/coefficients modeling substantially outperforms direct covariance modeling in the linear, exponential, and discrete regression example analyses. It generates equivalent results in the logistic regression example analyses and it is substantially outperformed in the Poisson regression case. Random effects/coefficients modeling of correlated outcomes can provide substantial improvements in model selection compared to directly specified covariance modeling. However, directly specified covariance modeling can generate competitive or substantially better results in some cases while usually requiring less computation time.
文摘Sunshine duration (S) based empirical equations have been employed in this study to estimate the daily global solar radiation on a horizontal surface (G) for six meteorological stations in Burundi. Those equations include the Ångström-Prescott linear model and four of its derivatives, i.e. logarithmic, exponential, power and quadratic functions. Monthly mean values of daily global solar radiation and sunshine duration data for a period of 20 to 23 years, from the Geographical Institute of Burundi (IGEBU), have been used. For each of the six stations, ten single or double linear regressions have been developed from the above-said five functions, to relate, in terms of monthly mean values, the daily clearness index (G/G0) to each of two kinds of relative sunshine duration (RSD): S/S0 and S/S0′. In those ratios, G0, S0 and S0′ stand for the extraterrestrial daily solar radiation on a horizontal surface, the day length, and the modified day length taking into account the natural site's horizon, respectively. According to the calculated mean values of the clearness index and the RSD, each station experiences a high number of fairly clear (or partially cloudy) days. Estimated values of the dependent variable (y) in each developed linear regression have been compared to measured values in terms of the coefficients of correlation (R) and of determination (R²), the mean bias error (MBE), the root mean square error (RMSE) and the t-statistics. Mean values of these statistical indicators have been used to rank, in decreasing order of performance, firstly the ten developed equations per station on account of the overall six stations, and secondly the six stations on account of the overall ten equations. The obtained values of those indicators lay within acceptable ranges for all the developed sixty equations.
These results lead to assert that any of the sixty developed linear regressions (and thus equations in terms of G/G0 and RSD) fits the measured data very adequately, and may be used to estimate the monthly average daily global solar radiation from sunshine duration for the relevant station. It is also found that using the RSD based on the modified day length (S/S0′) is slightly more advantageous than using S/S0 for estimating the monthly average daily clearness index. Moreover, values of the statistical indicators of this study match adequately data from other works on the same kinds of empirical equations.
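The Ångström-Prescott regression itself is a one-line model, K = G/G0 = a + b·(S/S0); a sketch of fitting it by ordinary least squares and using it for estimation (all numbers below are synthetic, not the study's coefficients):

```python
def fit_angstrom_prescott(rsd, clearness):
    """OLS fit of the clearness index K = G/G0 against relative sunshine
    duration S/S0, i.e. K = a + b*(S/S0). Returns (a, b)."""
    n = len(rsd)
    mx = sum(rsd) / n
    my = sum(clearness) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(rsd, clearness))
         / sum((x - mx) ** 2 for x in rsd))
    return my - b * mx, b

def estimate_radiation(g0, s, s0, a, b):
    """Monthly mean daily global radiation on a horizontal surface
    from sunshine duration, given fitted coefficients a and b."""
    return g0 * (a + b * s / s0)
```

The logarithmic, exponential, power, and quadratic variants mentioned in the abstract replace the linear term b·(S/S0) with the corresponding transformed term, but are fitted the same way.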
文摘A Bianchi type-V space-time is considered with a linear equation of state in the scalar-tensor theory of gravitation proposed by Brans and Dicke. We use the assumption of a constant deceleration parameter and a power-law relation between the scalar field φ and the scale factor R to find the solutions. Some physical and kinematical properties of the model are also discussed.
文摘The dynamic viscoelastic properties of asphalt AC-20 and its composites with organic montmorillonite clay (OMMt) and SBS were modeled using the empirical Havriliak-Negami (HN) model, based on linear viscoelastic theory (LVE). The HN parameters α, β, G0, G∞ and τHN were determined by solving the HN equation across various temperatures and frequencies. The HN model successfully predicted the rheological behavior of the asphalt and its blends within the temperature range of 25˚C - 40˚C. However, deviations occurred between 40˚C - 75˚C, where the glass transition temperatures Tg of the asphalt components and the SBS polymer are located, rendering the HN model ineffective for predicting the dynamic viscoelastic properties of composites containing OMMt under these conditions. Yet, the prediction error of the HN model dropped to 2.28% - 2.81% for asphalt and its mixtures at 100˚C, a temperature exceeding the Tg values of both polymer and asphalt, where the mixtures exhibited a liquid-like behavior. The exponent α and the relaxation time increased with temperature across all systems. Incorporating OMMt clay into the asphalt blends significantly enhanced the relaxation dynamics of the resulting composites.
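The HN equation for the complex modulus can be evaluated directly with complex arithmetic; a sketch in one common modulus form, with the correct low- and high-frequency limits (the parameter values in the test are illustrative, not the fitted asphalt values):

```python
def hn_modulus(omega, g0, g_inf, tau, alpha, beta):
    """Havriliak-Negami complex shear modulus:

        G*(w) = G_inf - (G_inf - G0) / (1 + (i*w*tau)^alpha)^beta

    with 0 < alpha, beta <= 1. Limits: G* -> G0 (rubbery/terminal value)
    as w -> 0 and G* -> G_inf (glassy value) as w -> inf. The real part
    is the storage modulus G' and the imaginary part the loss modulus G''.
    """
    return g_inf - (g_inf - g0) / (1 + (1j * omega * tau) ** alpha) ** beta
```

Setting alpha = beta = 1 recovers the single Debye/Maxwell-type relaxation; alpha and beta broaden and skew the relaxation spectrum, which is what lets the model track asphalt blends over a range of temperatures and frequencies.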
文摘Compositional data, such as relative information, is a crucial aspect of machine learning and other related fields. It is typically recorded as closed data, i.e. data that sums to a constant such as 100%. The linear regression model is a commonly used statistical technique for identifying hidden relationships between underlying random variables of interest, with applications such as future prediction and partial-effects analysis of independent variables. When estimating linear regression parameters, maximum likelihood estimation (MLE) is the method of choice. However, data quality is a significant challenge in machine learning, especially when missing data is present: many datasets contain missing observations, and recovering them can be costly and time-consuming. To address this issue, the expectation-maximization (EM) algorithm has been suggested as a solution for situations involving missing data. The EM algorithm iteratively finds maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models that depend on unobserved variables or data. Using the current estimate as input, the expectation (E) step constructs the expected log-likelihood function; the maximization (M) step then finds the parameters that maximize the expected log-likelihood determined in the E step. This study examined how well the EM algorithm performed on a synthetic compositional dataset with missing observations, using both the robust least squares and ordinary least squares regression techniques. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-Nearest Neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
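The E/M alternation can be shown on the simplest possible case, a no-intercept regression y = b·x with some responses missing; this is a toy sketch of the mechanism, far simpler than the compositional setting studied here:

```python
def em_slope(xs, ys, iters=50):
    """EM for the model y = b*x when some responses are missing
    (None entries in ys).

    E-step: replace each missing y with its expectation b*x under the
    current fit.  M-step: refit b by least squares on the completed data.
    """
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    b = sum(x * y for x, y in observed) / sum(x * x for x, _ in observed)
    for _ in range(iters):
        # E-step: impute missing responses from the current parameter.
        filled = [y if y is not None else b * x for x, y in zip(xs, ys)]
        # M-step: maximize the (Gaussian) likelihood on the completed data.
        b = sum(x * y for x, y in zip(xs, filled)) / sum(x * x for x in xs)
    return b
```

In this degenerate case the imputed points lie exactly on the current fit, so EM converges to the estimate based on the observed pairs; with more parameters (or missing covariates, as in the study) the alternation genuinely changes the solution.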
基金Supported by the Natural Science Foundation of Anhui Education Committee
文摘In this paper, based on the theory of parameter estimation, we give a selection method and argue that, in the sense of a desirable property of the parameter estimation, it is very reasonable. Moreover, we offer a calculation method for the selection statistic and an applied example.
基金supported by the National Natural Science Foundation of China(71071077)the Ministry of Education Key Project of National Educational Science Planning(DFA090215)+1 种基金China Postdoctoral Science Foundation(20100481137)Funding of Jiangsu Innovation Program for Graduate Education(CXZZ11-0226)
文摘The construction method of the background value in the original multi-variable grey model (MGM(1,m)) is improved by tracing the source of its construction errors. The MGM(1,m) with optimized background value is used to eliminate the random fluctuations or errors in the observational data of all variables, and a combined prediction model with multiple linear regression is established in order to improve the simulation and prediction accuracy of the combined model. Finally, a combined model of the MGM(1,2) with optimized background value and binary linear regression is constructed as an example. The results show that the model performs well in both simulation and prediction.
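The background-value idea can be sketched on the single-variable GM(1,1), of which MGM(1,m) is the multivariable extension; the weight `lam` below generalizes the classical 0.5 midpoint, and tuning it is one simple way to "optimize" the background value (a generic sketch, not the paper's construction):

```python
import math

def gm11(x0, lam=0.5):
    """GM(1,1) grey model with weighted background value
    z(k) = lam*x1(k) + (1-lam)*x1(k-1); lam = 0.5 is the classical choice.
    Returns the fitted series of the same length as x0."""
    n = len(x0)
    x1 = [sum(x0[:i + 1]) for i in range(n)]          # accumulated series (1-AGO)
    z = [lam * x1[k] + (1 - lam) * x1[k - 1] for k in range(1, n)]
    # Least squares for the grey equation x0(k) = -a*z(k) + b, k = 2..n.
    m = n - 1
    szz = sum(v * v for v in z)
    sz = sum(z)
    sy = sum(x0[1:])
    szy = sum(v * y for v, y in zip(z, x0[1:]))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    # Time-response function of the whitened equation, then restore by differencing.
    fit = [x0[0]]
    for k in range(1, n):
        x1_k = (x0[0] - b / a) * math.exp(-a * k) + b / a
        x1_prev = (x0[0] - b / a) * math.exp(-a * (k - 1)) + b / a
        fit.append(x1_k - x1_prev)
    return fit
```

For near-exponential data the fit is close; the residual error comes largely from the background-value approximation, which is exactly the term the paper's optimization targets before combining with linear regression.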
文摘The impact of nonlinear stability and instability on the validity of the tangent linear model (TLM) is investigated by comparing its results with those produced by the nonlinear model (NLM) with identical initial perturbations. The evolutions of different initial perturbations superposed on nonlinearly stable and unstable basic flows are examined using two-dimensional quasi-geostrophic models with a double-periodic boundary condition and a rigid boundary condition. The results indicate that the valid time period of the TLM, during which the TLM can be utilized to approximate the NLM with a given accuracy, varies with the magnitude of the perturbations and the nonlinear stability and instability of the basic flows. The larger the magnitude of the perturbation, the shorter the valid time period. The more nonlinearly unstable the basic flow, the shorter the valid time period of the TLM. With the double-periodic condition, the valid period of the TLM is shorter than that with the rigid-boundary condition. Key words: Nonlinear stability and instability; Tangent linear model (TLM); Validity. This work was supported by the National Key Basic Research Project "Research on the Formation Mechanism and Prediction Theory of Severe Synoptic Disasters in China" (No.G1998040910) and the National Natural Science Foundation of China (No.49775262 and No.49823002).
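The validity-time idea can be demonstrated on a scalar toy system rather than a quasi-geostrophic flow (an illustrative sketch only, with assumed dynamics dx/dt = x − x³): propagate a perturbation with the full nonlinear map and with its tangent linear map along the base trajectory, and compare:

```python
def nlm_traj(x0, steps, dt=0.05):
    """Nonlinear model: forward-Euler steps of dx/dt = x - x^3."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + dt * (x - x ** 3))
    return xs

def tlm_perturbation(base, delta0, dt=0.05):
    """Tangent linear model along the base trajectory: the exact
    linearization of each Euler step, d' = d + dt*(1 - 3*x^2)*d."""
    d = delta0
    for x in base[:-1]:
        d = d + dt * (1 - 3 * x ** 2) * d
    return d

def tlm_rel_error(x0, delta0, steps=40):
    """Relative error of the TLM-propagated perturbation versus the
    true nonlinear difference after `steps` steps."""
    base = nlm_traj(x0, steps)
    true_diff = nlm_traj(x0 + delta0, steps)[-1] - base[-1]
    lin_diff = tlm_perturbation(base, delta0)
    return abs(lin_diff - true_diff) / abs(true_diff)
```

The error grows with the perturbation magnitude, mirroring the abstract's finding that larger perturbations shorten the TLM's valid time period.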
基金supported by the National Natural Science Foundation of China (No.60975023)
文摘In order to design a nonlinear controller for small-scale autonomous helicopters, the dynamic characteristics of a model helicopter are investigated, and an integrated nonlinear model of a small-scale helicopter for hovering control is presented. It is proved that the nonlinear system of the small-scale helicopter can be transformed to a linear system using the dynamic feedback linearization technique. Finally, simulations are carried out to validate the nonlinear controller.
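For a scalar illustration of dynamic feedback linearization (not the helicopter model itself; the plant below is an assumed toy system): with dynamics dx/dt = x² + u, the law u = −x² + v cancels the nonlinearity exactly, leaving the linear system dx/dt = v, which a simple proportional law stabilizes:

```python
def simulate_feedback_linearization(x0=2.0, k=2.0, dt=0.01, steps=500):
    """Nonlinear plant dx/dt = x^2 + u with linearizing feedback
    u = -x^2 + v and outer linear controller v = -k*x.
    The closed loop is exactly dx/dt = -k*x, so x decays exponentially."""
    x = x0
    for _ in range(steps):
        v = -k * x          # controller designed for the linearized system
        u = -x * x + v      # feedback linearization: cancel the x^2 term
        x = x + dt * (x * x + u)
    return x
```

Without the cancelling term the plant dx/dt = x² escapes to infinity from any positive initial state, which is why the transformation to a linear system is the key step before applying linear control design.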
文摘This paper presents a novel adaptive nonlinear model predictive control design for trajectory tracking of flexible-link manipulators consisting of feedback linearization, linear model predictive control, and unscented Kalman filtering. Reducing the nonlinear system to a linear system by feedback linearization simplifies the optimization problem of the model predictive controller significantly, which, however, is no longer linear in the presence of parameter uncertainties and can potentially lead to an undesired dynamical behaviour. An unscented Kalman filter is used to approximate the dynamics of the prediction model by an online parameter estimation, which leads to an adaptation of the optimization problem in each time step and thus to a better prediction and an improved input action. Finally, a detailed fuzzy-arithmetic analysis is performed in order to quantify the effect of the uncertainties on the control structure and to derive robustness assessments. The control structure is applied to a serial manipulator with two flexible links containing uncertain model parameters and acting in three-dimensional space.
基金Funded by the National Basic Research Program (973 program) (No. 2011CB707602)the Digital Manufacturing Equipment and Technology National Key Laboratory,Huazhong University of Science and Technology (No. DMETKF2009002)National Sciences Foundation-Guangdong Natural Science Foundation,China (No.U0934004)
文摘A contact model for describing the contact mechanics between the stator and slider of a standing wave linear ultrasonic motor is presented. The proposed model starts from the assumption that the vibration characteristics of the stator are not affected by the contact process. A modified friction model was used to analyze the contact problems. Firstly, the dynamic normal contact force, interface friction force, and steady-state characteristics were analyzed. Secondly, the influences of the contact layer material, the dynamic characteristics of the stator, and the pre-load on motor performance were simulated. Finally, to validate the contact model, a linear ultrasonic motor based on in-plane modes was used as an example. The corresponding results show that the simulated motor performances based on the proposed contact mechanism are in good agreement with experimental results. This model is helpful for understanding the operation principle of the standing wave linear motor and thus contributes to the design of these types of motor.
基金This work was supported by an Overseas Research Students Award to Xiao-Bing Hu.
文摘This paper proposes a new method for model predictive control (MPC) of nonlinear systems to calculate stability region and feasible initial control profile/sequence, which are important to the implementations of MPC. Different from many existing methods, this paper distinguishes stability region from conservative terminal region. With global linearization, linear differential inclusion (LDI) and linear matrix inequality (LMI) techniques, a nonlinear system is transformed into a convex set of linear systems, and then the vertices of the set are used off-line to design the controller, to estimate stability region, and also to determine a feasible initial control profile/sequence. The advantages of the proposed method are demonstrated by simulation study.