Adaptive fractional polynomial modeling of general correlated outcomes is formulated to address nonlinearity in means, variances/dispersions, and correlations. Means and variances/dispersions are modeled using generalized linear models in fixed effects/coefficients. Correlations are modeled using random effects/coefficients. Nonlinearity is addressed using power transforms of primary (untransformed) predictors. Parameter estimation is based on extended linear mixed modeling generalizing both generalized estimating equations and linear mixed modeling. Models are evaluated using likelihood cross-validation (LCV) scores and are generated adaptively using a heuristic search controlled by LCV scores. Cases covered include linear, Poisson, logistic, exponential, and discrete regression of correlated continuous, count/rate, dichotomous, positive continuous, and discrete numeric outcomes treated as normally, Poisson, Bernoulli, exponentially, and discrete numerically distributed, respectively. Example analyses are also generated for these five cases to compare adaptive random effects/coefficients modeling of correlated outcomes to previously developed adaptive modeling based on directly specified covariance structures. Adaptive random effects/coefficients modeling substantially outperforms direct covariance modeling in the linear, exponential, and discrete regression example analyses. It generates equivalent results in the logistic regression example analyses and is substantially outperformed in the Poisson regression case. Random effects/coefficients modeling of correlated outcomes can provide substantial improvements in model selection compared to directly specified covariance modeling. However, directly specified covariance modeling can generate competitive or substantially better results in some cases while usually requiring less computation time.
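As a rough illustration of choosing a power transform by likelihood cross-validation, the sketch below fits y = a + b·x^p by ordinary least squares for a small grid of candidate powers and scores each one with a k-fold, LCV-style score (the per-observation geometric mean of the normal likelihood of the held-out folds). It is a simplification under stated assumptions: outcomes are treated as independent and normally distributed with constant variance, so the variance/dispersion modeling, the random effects, and the heuristic search described in the abstract are all omitted. The function names, the power grid, the fold assignment, and the toy data are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, ML variance estimate)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    sigma2 = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys)) / n
    return a, b, sigma2

def lcv_score(x, y, power, k=5):
    """k-fold LCV-style score for the model y ~ a + b * x**power (assumed form)."""
    z = [xi ** power for xi in x]
    n = len(x)
    total_loglik = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        a, b, s2 = fit_line([z[i] for i in train], [y[i] for i in train])
        s2 = max(s2, 1e-12)  # guard against a zero variance estimate
        for i in test:
            resid = y[i] - a - b * z[i]
            total_loglik += -0.5 * math.log(2 * math.pi * s2) - resid ** 2 / (2 * s2)
    # Geometric mean of held-out likelihoods, normalized by the number of observations.
    return math.exp(total_loglik / n)

def best_power(x, y, powers=(-1.0, -0.5, 0.5, 1.0, 2.0, 3.0)):
    """Return the candidate power with the largest LCV-style score."""
    return max(powers, key=lambda p: lcv_score(x, y, p))

# Toy data (not from the paper): a square-root trend plus a small deterministic wiggle.
x = [0.5 + 0.25 * i for i in range(40)]
y = [2.0 + 3.0 * math.sqrt(xi) + 0.05 * math.sin(7.0 * i) for i, xi in enumerate(x)]
print(best_power(x, y))
```

Larger scores indicate better held-out fit, so comparing `lcv_score` across candidate powers plays the role of the model comparison step; an adaptive search would explore powers iteratively rather than over a fixed grid.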
In this paper, a model averaging method is proposed for varying-coefficient models with response missing at random by establishing a weight selection criterion based on cross-validation. Under certain regularity conditions, it is proved that the proposed method is asymptotically optimal in the sense of achieving the minimum squared error.
Estimating the parameters of the mixed additive and multiplicative (MAM) random error model with the weighted least squares iterative algorithm requires deriving a complex weight array. To avoid this, we introduce a derivative-free cat swarm optimization for parameter estimation. We embed the Powell method, which uses conjugate direction acceleration and does not require derivatives of the objective function, into the original cat swarm optimization to speed up its convergence and improve its search accuracy. We use ordinary least squares, weighted least squares, the original cat swarm optimization, the particle swarm algorithm, and the improved cat swarm optimization to estimate the parameters of the straight-line fitting MAM model, which has lower nonlinearity, and the DEM MAM model, which has higher nonlinearity. The experimental results show that the improved cat swarm optimization has faster convergence, higher search accuracy, and better stability than the original cat swarm optimization and the particle swarm algorithm. At the same time, the improved cat swarm optimization obtains results consistent with the weighted least squares method using the objective function only, avoiding repeated derivation of the complex weight array. The method provides a new approach for theoretical research on parameter estimation of MAM error models.
Remaining useful life (RUL) prediction is one of the most crucial elements in prognostics and health management (PHM). To address imperfect prior information, this paper proposes an RUL prediction method based on a nonlinear random coefficient regression (RCR) model that fuses failure time data. First, some useful properties of parameter estimation based on the nonlinear RCR model are given. Based on these properties, the failure time data can reasonably be fused as prior information. Specifically, the fixed parameters are calculated from the field degradation data of the evaluated equipment, and the prior information of the random coefficient is estimated by fusing the failure time data of similar equipment. Then, the prior information of the random coefficient is updated online under the Bayesian framework, and the probability density function (PDF) of the RUL is derived taking the failure threshold constraint into account. Finally, two case studies are used for experimental verification. Compared with the traditional Bayesian method, the proposed method can effectively reduce the influence of imperfect prior information and improve the accuracy of RUL prediction.
BACKGROUND Hypertension is a major risk factor for cardiovascular disease and stroke, and its prevalence is increasing worldwide. Health education interventions based on the health belief model (HBM) can improve the knowledge, attitudes, and behaviors of patients with hypertension and help them control their blood pressure. AIM To evaluate the effects of health education interventions based on the HBM in patients with hypertension in China. METHODS Between 2021 and 2023, 140 patients with hypertension were randomly assigned to either the intervention or control group. The intervention group received health education based on the HBM, including lectures, brochures, videos, and counseling sessions, whereas the control group received routine care. Outcomes were measured at baseline, three months, and six months after the intervention and included blood pressure, medication adherence, self-efficacy, and perceived benefits, barriers, susceptibility, and severity. RESULTS The intervention group had significantly lower systolic blood pressure [mean difference (MD): -8.2 mmHg, P < 0.001] and diastolic blood pressure (MD: -5.1 mmHg, P = 0.002) compared to the control group at six months. The intervention group also had higher medication adherence (MD: 1.8, P < 0.001), self-efficacy (MD: 12.4, P < 0.001), perceived benefits (MD: 3.2, P < 0.001), lower perceived barriers (MD: -2.6, P = 0.001), higher perceived susceptibility (MD: 2.8, P = 0.002), and higher perceived severity (MD: 3.1, P < 0.001) than the control group at six months. CONCLUSION Health education interventions based on the HBM effectively improve blood pressure control and health beliefs in patients with hypertension and should be implemented in clinical practice and community settings.
Objective Body fluid mixtures are complex biological samples that frequently occur at crime scenes and can provide important clues for criminal case analysis. DNA methylation assays have been applied to the identification of human body fluids and have exhibited excellent performance in predicting single-source body fluids. The present study aims to develop a methylation SNaPshot multiplex system for body fluid identification and to accurately predict mixture samples. In addition, the value of DNA methylation in the prediction of body fluid mixtures was further explored. Methods In the present study, 420 samples of body fluid mixtures and 250 samples of single body fluids were tested using an optimized multiplex methylation system. Each kind of body fluid sample presented a specific methylation profile across the 10 markers. Results Significant differences in methylation levels were observed between the mixtures and single body fluids. For all kinds of mixtures, Spearman's correlation analysis revealed a significantly strong correlation between the methylation levels and component proportions (1:20, 1:10, 1:5, 1:1, 5:1, 10:1 and 20:1). Two random forest classification models were trained, one for the prediction of mixture types and one for the prediction of the mixture proportion of 2 components, based on the methylation levels of the 10 markers. For the mixture prediction, Model-1 presented outstanding prediction accuracy, which reached 99.3% in 427 training samples and 100% in 243 independent test samples. For the mixture proportion prediction, Model-2 demonstrated an accuracy of 98.8% in 252 training samples and 98.2% in 168 independent test samples. The total prediction accuracy reached 99.3% for body fluid mixtures and 98.6% for the mixture proportions. Conclusion These results indicate the excellent capability and value of the multiplex methylation system in the identification of forensic body fluid mixtures.
Driven piles are used in many geological environments as a practical and convenient structural component. Hence, determining the drivability of piles is of great importance in complex geotechnical applications. Conventional methods of predicting pile drivability often rely on simplified physical models or empirical formulas, which may lack accuracy or applicability in complex geological conditions. Therefore, this study presents a practical machine learning approach, namely a Random Forest (RF) optimized by Bayesian Optimization (BO) and Particle Swarm Optimization (PSO), which not only enhances prediction accuracy but also better adapts to varying geological environments when predicting the drivability parameters of piles (i.e., maximum compressive stress, maximum tensile stress, and blows per foot). In addition, support vector regression, extreme gradient boosting, k nearest neighbor, and decision tree models are applied for comparison purposes. To train and test these models, among the 4072 datasets collected with 17 model inputs, 3258 datasets were randomly selected for training, and the remaining 814 datasets were used for model testing. Lastly, the results of these models were compared and evaluated using two performance indices, i.e., the root mean square error (RMSE) and the coefficient of determination (R^2). The results indicate that the optimized RF model achieved lower RMSE than the other prediction models in predicting the three parameters, specifically 0.044, 0.438, and 0.146, and higher R^2 values than the other implemented techniques, specifically 0.966, 0.884, and 0.977. In addition, the sensitivity and uncertainty of the optimized RF model were analyzed using Sobol sensitivity analysis and Monte Carlo (MC) simulation. It can be concluded that the optimized RF model could be used to predict the performance of the pile, and it may provide a useful reference for solving some problems under similar engineering conditions.
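The two performance indices named in this abstract, RMSE and R^2, can be sketched as follows. The function names and the toy numbers are illustrative assumptions and are not taken from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy values, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(rmse(observed, predicted))       # 0.5
print(r_squared(observed, predicted))  # 0.8
```

A lower RMSE and an R^2 closer to 1 both indicate a better fit, which is the sense in which the abstract ranks the optimized RF model above the comparison models.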
BACKGROUND Liver cancer is one of the most prevalent malignant tumors worldwide, and its early detection and treatment are crucial for enhancing patient survival rates and quality of life. However, the early symptoms of liver cancer are often not obvious, resulting in a late-stage diagnosis in many patients, which significantly reduces the effectiveness of treatment. Developing a highly targeted, widely applicable, and practical risk prediction model for liver cancer is crucial for enhancing the early diagnosis and long-term survival rates among affected individuals. AIM To develop a liver cancer risk prediction model by employing machine learning techniques, and subsequently assess its performance. METHODS In this study, a total of 550 patients were enrolled, with 190 hepatocellular carcinoma (HCC) and 195 cirrhosis patients serving as the training cohort, and 83 HCC and 82 cirrhosis patients forming the validation cohort. Logistic regression (LR), support vector machine (SVM), random forest (RF), and least absolute shrinkage and selection operator (LASSO) regression models were developed in the training cohort. Model performance was assessed in the validation cohort. Additionally, this study conducted a comparative evaluation of the diagnostic efficacy between the ASAP model and the model developed in this study using receiver operating characteristic curve, calibration curve, and decision curve analysis (DCA) to determine the optimal predictive model for assessing liver cancer risk. RESULTS Six variables including age, white blood cell, red blood cell, platelet counts, alpha-fetoprotein and protein induced by vitamin K absence or antagonist II levels were used to develop LR, SVM, RF, and LASSO regression models. The RF model exhibited superior discrimination, and the area under the curve of the training and validation sets was 0.969 and 0.858, respectively. These values significantly surpassed those of the LR (0.850 and 0.827), SVM (0.860 and 0.803), LASSO regression (0.845 and 0.831), and ASAP (0.866 and 0.813) models. Furthermore, calibration and DCA indicated that the RF model exhibited robust calibration and clinical validity. CONCLUSION The RF model demonstrated excellent prediction capabilities for HCC and can facilitate early diagnosis of HCC in clinical practice.
BACKGROUND Lymph node ratio (LNR) has been demonstrated to play a crucial role in the prognosis of many tumors. However, research concerning the prognostic value of LNR in postoperative gastric neuroendocrine neoplasm (NEN) patients is limited. AIM To explore the prognostic value of LNR in postoperative gastric NEN patients and to combine LNR with other factors to develop prognostic models. METHODS A total of 286 patients from the Surveillance, Epidemiology, and End Results database were divided into the training set and validation set at a ratio of 8:2. Ninety-two patients from the First Affiliated Hospital of Soochow University in China were designated as a test set. Cox regression analysis was used to explore the relationship between LNR and disease-specific survival (DSS) of gastric NEN patients. The random survival forest (RSF) algorithm and Cox proportional hazards (CoxPH) analysis were each applied to develop models to predict DSS, and the models were compared with the 8th edition American Joint Committee on Cancer (AJCC) tumor-node-metastasis (TNM) staging. RESULTS Multivariate analyses indicated that LNR was an independent prognostic factor for postoperative gastric NEN patients and that a higher LNR was accompanied by a higher risk of death. The RSF model exhibited the best performance in predicting DSS, with a C-index in the test set of 0.769 [95% confidence interval (CI): 0.691-0.846], outperforming the CoxPH model (0.744, 95% CI: 0.665-0.822) and the 8th edition AJCC TNM staging (0.723, 95% CI: 0.613-0.833). The calibration curves and decision curve analysis (DCA) demonstrated that the RSF model had good calibration and clinical benefits. Furthermore, the RSF model could perform risk stratification and individual prognosis prediction effectively. CONCLUSION A higher LNR indicated a lower DSS in postoperative gastric NEN patients. The RSF model outperformed the CoxPH model and the 8th edition AJCC TNM staging in the test set, showing potential in clinical practice.
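To make the two quantities in this abstract concrete, the sketch below computes a lymph node ratio and Harrell's concordance index (C-index) for right-censored survival data. The abstract does not spell out its LNR definition; the code assumes the conventional one (positive nodes divided by examined nodes). The function names and the four toy patients are hypothetical, and the LNR itself is used as the risk score purely for illustration, in place of a fitted RSF or CoxPH model.

```python
def lymph_node_ratio(positive_nodes, examined_nodes):
    """Assumed definition: LNR = positive lymph nodes / examined lymph nodes."""
    if examined_nodes <= 0:
        raise ValueError("examined_nodes must be positive")
    return positive_nodes / examined_nodes

def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when subject i has the shorter follow-up time
    and an observed event. The pair is concordant when subject i also has the
    higher predicted risk; ties in risk count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Toy cohort, not data from the study: (positive nodes, examined nodes).
nodes = [(6, 10), (2, 10), (4, 10), (1, 10)]
times = [5, 8, 12, 20]   # follow-up time in months
events = [1, 1, 0, 1]    # 1 = disease-specific death observed, 0 = censored
risk = [lymph_node_ratio(p, e) for p, e in nodes]
print(concordance_index(times, events, risk))  # 0.8
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk against survival time, which is the scale on which the abstract compares the RSF, CoxPH, and TNM staging models.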
Machine learning is currently one of the research hotspots in the field of landslide prediction. To clarify and evaluate the differences in characteristics and prediction performance of different machine learning models, Conghua District, the most landslide-prone district in Guangzhou, was selected for landslide susceptibility evaluation. The evaluation factors were selected using correlation analysis and the variance inflation factor method. Landslide models were constructed by applying four machine learning methods, namely Logistic Regression (LR), Random Forest (RF), Support Vector Machines (SVM), and Extreme Gradient Boosting (XGB). Comparative analysis and evaluation of the models were conducted through statistical indices and receiver operating characteristic (ROC) curves. The results showed that the LR, RF, SVM, and XGB models have good predictive performance for landslide susceptibility, with area under the curve (AUC) values of 0.752, 0.965, 0.996, and 0.998, respectively. The XGB model had the highest predictive ability, followed by the RF model, SVM model, and LR model. The frequency ratio (FR) accuracy of the LR, RF, SVM, and XGB models was 0.775, 0.842, 0.759, and 0.822, respectively. The RF and XGB models were superior to the LR and SVM models, indicating that ensemble algorithms have better predictive ability than single classification algorithms in regional landslide classification problems.
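The AUC values quoted in this abstract can be read as the probability that a randomly chosen landslide cell receives a higher susceptibility score than a randomly chosen non-landslide cell. A minimal sketch of that pairwise definition follows; the function name and the six toy scores are assumptions for illustration, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative), ties counted as 0.5."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy susceptibility scores: 1 = landslide cell, 0 = non-landslide cell.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(roc_auc(labels, scores))
```

In this toy case eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9. The pairwise loop is quadratic in the sample size; a rank-based formula gives the same value more efficiently on large rasters.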
We estimate tree heights using polarimetric interferometric synthetic aperture radar (PolInSAR) data constructed from dual-polarization (dual-pol) SAR data and the random volume over the ground (RVoG) model. Considering the Sentinel-1 SAR dual-pol configuration (S_VV, vertically transmitted and vertically received, and S_VH, vertically transmitted and horizontally received), one notes that S_HH, the horizontally transmitted and horizontally received scattering element, is unavailable. The S_HH data were constructed using the S_VH data, and polarimetric SAR (PolSAR) data were obtained. The proposed approach was first verified in simulation with satisfactory results. It was next applied to construct PolInSAR data from a pair of dual-pol Sentinel-1A data at Duke Forest, North Carolina, USA. According to local observations and forest descriptions, the range of estimated tree heights was overall reasonable. Comparing the heights with the ICESat-2 tree heights at 23 sampling locations, relative errors of 5 points were within ±30%. Errors of 8 points ranged from 30% to 40%, but errors of the remaining 10 points were >40%. The results are encouraging, as further error reduction is possible. For instance, the construction of PolSAR data should not be limited to using S_VH, and a combination of S_VH and S_VV should be explored. Also, an ensemble of tree heights derived from multiple PolInSAR data can be considered, since tree heights do not vary much over a time frame of months or one season.
Tree height (H) in a natural stand or forest plantation is a fundamental variable in management, and the use of mathematical expressions that estimate H as a function of diameter at breast height (d) or variables at the stand level is a valuable support tool in forest inventories. The objective was to fit and propose a generalized H-d model for Pinus montezumae and Pinus pseudostrobus established in forest plantations of Nuevo San Juan Parangaricutiro, Michoacan, Mexico. Using nonlinear least squares (NLS), 10 generalized H-d models were fitted to 883 and 1226 pairs of H-d data from Pinus montezumae and Pinus pseudostrobus, respectively. The best model was refitted with the maximum likelihood mixed effects model (MEM) approach by including the site as a classification variable and a known variance structure. The Wang and Tang equation was selected as the best model with NLS; the MEM with an additive effect on two of its parameters and an exponential variance function improved the fit statistics for Pinus montezumae and Pinus pseudostrobus, respectively. The model validation showed equality of means among the estimates for both species and an independent subsample. The calibration of the MEM at the plot level was efficient and might increase the applicability of these results. The inclusion of dominant height in the MEM approach helped to reduce bias in the estimates and also to better explain the variability among plots.
The use of hidden conditional random fields (HCRFs) for tone modeling is explored. Tone recognition performance is improved using HCRFs by taking advantage of intra-syllable dynamic, inter-syllable dynamic, and duration features. For integrating the tone model into continuous speech recognition, discriminative model weight training (DMWT) is proposed. Acoustic and tone scores are scaled by model weights discriminatively trained with the minimum phone error (MPE) criterion. Two schemes of weight training are evaluated, and a smoothing technique is used to make training robust to overtraining. Experiments show that the accuracies of tone recognition and large vocabulary continuous speech recognition (LVCSR) can be improved by the HCRF-based tone model. Compared with the global weight scheme, continuous speech recognition can be further improved by the discriminatively trained weight combinations.
In this paper, a compound binomial model with a constant dividend barrier and random income is considered. Two types of individual claims, main claims and by-claims, are defined, where every by-claim is induced by a main claim and may be delayed for one time period with a certain probability. The premium income is assumed to follow another binomial process to capture the uncertainty of customers' arrivals and payments. A system of difference equations with certain boundary conditions for the expected present value of total dividend payments prior to ruin is derived and solved. Explicit results are obtained when the claim sizes are K_n distributed or the claim size distributions have finite support. Numerical results are also provided to illustrate the impact of the delay of by-claims on the expected present value of dividends.
This work aimed to generate landslide susceptibility maps for the Three Gorges Reservoir (TGR) area, China, by using different machine learning models. Three advanced machine learning methods, namely, gradient boosting decision tree (GBDT), random forest (RF), and information value (InV) models, were used, and their performances were assessed and compared. In total, 202 landslides were mapped by using a series of field surveys, aerial photographs, and reviews of historical and bibliographical data. Nine causative factors were then considered in landslide susceptibility map generation by using the GBDT, RF, and InV models. All of the maps of the causative factors were resampled to a resolution of 28.5 m. Of the 486289 pixels in the area, 28526 pixels were landslide pixels, and 457763 pixels were non-landslide pixels. Finally, landslide susceptibility maps were generated by using the three machine learning models, and their performances were assessed through receiver operating characteristic (ROC) curves, the sensitivity, specificity, overall accuracy (OA), and kappa coefficient (KAPPA). The results showed that the GBDT, RF, and InV models overall produced reasonably accurate landslide susceptibility maps. Among these three methods, the GBDT method outperformed the other two, and it can provide strong technical support for producing landslide susceptibility maps in the TGR area.
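The pixel-level indices listed in this abstract (sensitivity, specificity, overall accuracy, and the kappa coefficient) all follow from a 2×2 confusion matrix. The sketch below uses the standard definitions; the function name and the counts are made up for illustration and are not results from the study.

```python
def binary_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, overall accuracy, and Cohen's kappa from a 2x2 table."""
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)   # landslide pixels correctly flagged
    specificity = tn / (tn + fp)   # non-landslide pixels correctly cleared
    overall_accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (overall_accuracy - expected) / (1.0 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "overall_accuracy": overall_accuracy,
        "kappa": kappa,
    }

# Hypothetical counts: 50 landslide pixels and 50 non-landslide pixels.
metrics = binary_metrics(tp=40, fn=10, fp=20, tn=30)
for name, value in metrics.items():
    print(name, round(value, 3))
```

Because landslide pixels are a small share of the study area (28526 of 486289), overall accuracy alone can look high for a model that rarely predicts landslides, which is one reason kappa and the sensitivity/specificity pair are reported alongside it.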
Dual random models for life insurance and social pension insurance have received considerable attention in recent articles on actuarial theory and applications. This paper discusses a general kind of increasing annuity in which the force of interest accumulation function is a general random process. A dual random model of the present value of the benefits of the increasing annuity is established, and its moments are calculated under certain conditions.
In this paper, a unified diagnostic method for nonlinear models with random effects, based upon the joint likelihood given by Robinson in 1991, is presented. It is shown that the case deletion model is equivalent to the mean shift outlier model. From this point of view, several diagnostic measures, such as the Cook distance and score statistics, are derived. The local influence measure of Cook is also presented. A numerical example illustrates that the method is applicable.
A Bayes decision rule for the variance components of the one-way random effects model is derived, and empirical Bayes (EB) decision rules are constructed by the kernel estimation method. Under suitable conditions, it is shown that the proposed EB decision rules are asymptotically optimal with convergence rates near O(n^{-1/2}). Finally, an example concerning the main result is given.
Based on the theoretical solutions for nonlinear three-dimensional gravity surface waves and their interactions with a vertical wall previously proposed by the lead author, this paper formulates an exact second-order random model of the unified wave motion process for nonlinear irregular waves and their interactions with a vertical wall in uniform current. The corresponding theoretical nonlinear spectrum is derived, and a digital simulation model suitable for use with the FFT (Fast Fourier Transform) algorithm is also given. Simulations of wave surface, wave pressure, total wave pressure, and its moment are performed. The probability properties and statistical characteristics of these realizations are tested, including verification of normality for the linear process and of non-normality for the nonlinear process; the consistency of the theoretical spectra with the simulated ones; and the probability properties of apparent characteristics, such as amplitudes, periods, and extremes (maximum and minimum, positive and negative extremes). The statistical analysis and comparisons demonstrate that the proposed theoretical and computing models are realistic and effective, that the estimated spectra are in good agreement with the theoretical ones, and that the probability properties of the simulated waves are similar to those of sea waves. At the same time, the simulation can be completed rapidly and easily.
Funding (MAM random error model study): supported by the National Natural Science Foundation of China (Nos. 42174011 and 41874001).
Funding: Supported by the National Natural Science Foundation of China (61703410, 61873175, 62073336, 61873273, 61773386, 61922089).
Abstract: Remaining useful life (RUL) prediction is one of the most crucial elements in prognostics and health management (PHM). To address imperfect prior information, this paper proposes an RUL prediction method based on a nonlinear random coefficient regression (RCR) model that fuses failure time data. Firstly, some properties of parameter estimation based on the nonlinear RCR model are given. Based on these properties, the failure time data can reasonably be fused as prior information. Specifically, the fixed parameters are calculated from the field degradation data of the evaluated equipment, and the prior information of the random coefficient is estimated by fusing the failure time data of congeneric equipment. Then, the prior information of the random coefficient is updated online under the Bayesian framework, and the probability density function (PDF) of the RUL is derived while accounting for the limitation of the failure threshold. Finally, two case studies are used for experimental verification. Compared with the traditional Bayesian method, the proposed method can effectively reduce the influence of imperfect prior information and improve the accuracy of RUL prediction.
Abstract: BACKGROUND Hypertension is a major risk factor for cardiovascular disease and stroke, and its prevalence is increasing worldwide. Health education interventions based on the health belief model (HBM) can improve the knowledge, attitudes, and behaviors of patients with hypertension and help them control their blood pressure. AIM To evaluate the effects of health education interventions based on the HBM in patients with hypertension in China. METHODS Between 2021 and 2023, 140 patients with hypertension were randomly assigned to either the intervention or control group. The intervention group received health education based on the HBM, including lectures, brochures, videos, and counseling sessions, whereas the control group received routine care. Outcomes were measured at baseline, three months, and six months after the intervention and included blood pressure, medication adherence, self-efficacy, and perceived benefits, barriers, susceptibility, and severity. RESULTS The intervention group had significantly lower systolic blood pressure [mean difference (MD): -8.2 mmHg, P<0.001] and diastolic blood pressure (MD: -5.1 mmHg, P=0.002) compared to the control group at six months. The intervention group also had higher medication adherence (MD: 1.8, P<0.001), self-efficacy (MD: 12.4, P<0.001), perceived benefits (MD: 3.2, P<0.001), lower perceived barriers (MD: -2.6, P=0.001), higher perceived susceptibility (MD: 2.8, P=0.002), and higher perceived severity (MD: 3.1, P<0.001) than the control group at six months. CONCLUSION Health education interventions based on the HBM effectively improve blood pressure control and health beliefs in patients with hypertension and should be implemented in clinical practice and community settings.
Funding: Supported by grants from the Natural Science Foundation of Hubei Province (No. 2020CFB780) and the Fundamental Research Funds for the Central Universities (No. 2017KFYXJJ020).
Abstract: Objective Body fluid mixtures are complex biological samples that frequently occur at crime scenes and can provide important clues for criminal case analysis. DNA methylation assays have been applied in the identification of human body fluids and have exhibited excellent performance in predicting single-source body fluids. The present study aims to develop a methylation SNaPshot multiplex system for body fluid identification and to accurately predict mixture samples. In addition, the value of DNA methylation in the prediction of body fluid mixtures was further explored. Methods In the present study, 420 samples of body fluid mixtures and 250 samples of single body fluids were tested using an optimized multiplex methylation system. Each kind of body fluid sample presented a specific methylation profile of the 10 markers. Results Significant differences in methylation levels were observed between the mixtures and single body fluids. For all kinds of mixtures, Spearman's correlation analysis revealed a significantly strong correlation between the methylation levels and component proportions (1:20, 1:10, 1:5, 1:1, 5:1, 10:1 and 20:1). Two random forest classification models were trained for the prediction of mixture types and the prediction of the mixture proportion of 2 components, based on the methylation levels of 10 markers. For the mixture prediction, Model-1 presented outstanding prediction accuracy, which reached up to 99.3% in 427 training samples, and had a remarkable accuracy of 100% in 243 independent test samples. For the mixture proportion prediction, Model-2 demonstrated an excellent accuracy of 98.8% in 252 training samples, and 98.2% in 168 independent test samples. The total prediction accuracy reached 99.3% for body fluid mixtures and 98.6% for the mixture proportions. Conclusion These results indicate the excellent capability and powerful value of the multiplex methylation system in the identification of forensic body fluid mixtures.
Funding: Supported by the National Science Foundation of China (42107183).
Abstract: Driven piles are used in many geological environments as a practical and convenient structural component. Hence, determining the drivability of piles is of great importance in complex geotechnical applications. Conventional methods of predicting pile drivability often rely on simplified physical models or empirical formulas, which may lack accuracy or applicability in complex geological conditions. Therefore, this study presents a practical machine learning approach, namely a Random Forest (RF) optimized by Bayesian Optimization (BO) and Particle Swarm Optimization (PSO), which not only enhances prediction accuracy but also better adapts to varying geological environments, to predict the drivability parameters of piles (i.e., maximum compressive stress, maximum tensile stress, and blows per foot). In addition, support vector regression, extreme gradient boosting, k nearest neighbor, and decision tree are also applied for comparison purposes. To train and test these models, among the 4072 datasets collected with 17 model inputs, 3258 datasets were randomly selected for training, and the remaining 814 datasets were used for model testing. Lastly, the results of these models were compared and evaluated using two performance indices, i.e., the root mean square error (RMSE) and the coefficient of determination (R^(2)). The results indicate that the optimized RF model achieved lower RMSE than other prediction models in predicting the three parameters, specifically 0.044, 0.438, and 0.146, and higher R^(2) values than other implemented techniques, specifically 0.966, 0.884, and 0.977. In addition, the sensitivity and uncertainty of the optimized RF model were analyzed using Sobol sensitivity analysis and Monte Carlo (MC) simulation. It can be concluded that the optimized RF model could be used to predict the performance of the pile, and it may provide a useful reference for solving some problems under similar engineering conditions.
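The two performance indices named in this abstract, RMSE and the coefficient of determination, can be computed as in the sketch below. It uses their standard textbook definitions; the function names and the sample values are hypothetical and are not taken from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(rmse(observed, predicted), r_squared(observed, predicted))
```

A lower RMSE and an R^(2) closer to 1 both indicate a closer fit, which is the sense in which the abstract compares the models.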
Funding: Cuiying Scientific and Technological Innovation Program of the Second Hospital, No. CY2021-BJ-A16 and No. CY2022-QN-A18; Clinical Medical School of Lanzhou University and Lanzhou Science and Technology Development Guidance Plan Project, No. 2023-ZD-85.
Abstract: BACKGROUND Liver cancer is one of the most prevalent malignant tumors worldwide, and its early detection and treatment are crucial for enhancing patient survival rates and quality of life. However, the early symptoms of liver cancer are often not obvious, resulting in a late-stage diagnosis in many patients, which significantly reduces the effectiveness of treatment. Developing a highly targeted, widely applicable, and practical risk prediction model for liver cancer is crucial for enhancing the early diagnosis and long-term survival rates among affected individuals. AIM To develop a liver cancer risk prediction model by employing machine learning techniques, and subsequently assess its performance. METHODS In this study, a total of 550 patients were enrolled, with 190 hepatocellular carcinoma (HCC) and 195 cirrhosis patients serving as the training cohort, and 83 HCC and 82 cirrhosis patients forming the validation cohort. Logistic regression (LR), support vector machine (SVM), random forest (RF), and least absolute shrinkage and selection operator (LASSO) regression models were developed in the training cohort. Model performance was assessed in the validation cohort. Additionally, this study conducted a comparative evaluation of the diagnostic efficacy between the ASAP model and the model developed in this study using receiver operating characteristic curve, calibration curve, and decision curve analysis (DCA) to determine the optimal predictive model for assessing liver cancer risk. RESULTS Six variables, including age, white blood cell, red blood cell, and platelet counts, and alpha-fetoprotein and protein induced by vitamin K absence or antagonist II levels, were used to develop LR, SVM, RF, and LASSO regression models. The RF model exhibited superior discrimination, and the area under curve of the training and validation sets was 0.969 and 0.858, respectively. These values significantly surpassed those of the LR (0.850 and 0.827), SVM (0.860 and 0.803), LASSO regression (0.845 and 0.831), and ASAP (0.866 and 0.813) models. Furthermore, calibration and DCA indicated that the RF model exhibited robust calibration and clinical validity. CONCLUSION The RF model demonstrated excellent prediction capabilities for HCC and can facilitate early diagnosis of HCC in clinical practice.
Funding: Supported by the Science and Technology Plan of Suzhou City, No. SKY2021038.
Abstract: BACKGROUND Lymph node ratio (LNR) has been demonstrated to play a crucial role in the prognosis of many tumors. However, research concerning the prognostic value of LNR in postoperative gastric neuroendocrine neoplasm (NEN) patients is limited. AIM To explore the prognostic value of LNR in postoperative gastric NEN patients and to combine LNR to develop prognostic models. METHODS A total of 286 patients from the Surveillance, Epidemiology, and End Results database were divided into the training set and validation set at a ratio of 8:2. Ninety-two patients from the First Affiliated Hospital of Soochow University in China were designated as a test set. Cox regression analysis was used to explore the relationship between LNR and disease-specific survival (DSS) of gastric NEN patients. Random survival forest (RSF) algorithm and Cox proportional hazards (CoxPH) analysis were applied to develop models to predict DSS respectively, and compared with the 8th edition American Joint Committee on Cancer (AJCC) tumor-node-metastasis (TNM) staging. RESULTS Multivariate analyses indicated that LNR was an independent prognostic factor for postoperative gastric NEN patients and a higher LNR was accompanied by a higher risk of death. The RSF model exhibited the best performance in predicting DSS, with the C-index in the test set being 0.769 [95% confidence interval (CI): 0.691-0.846], outperforming the CoxPH model (0.744, 95% CI: 0.665-0.822) and the 8th edition AJCC TNM staging (0.723, 95% CI: 0.613-0.833). The calibration curves and decision curve analysis (DCA) demonstrated that the RSF model had good calibration and clinical benefits. Furthermore, the RSF model could perform risk stratification and individual prognosis prediction effectively. CONCLUSION A higher LNR indicated a lower DSS in postoperative gastric NEN patients. The RSF model outperformed the CoxPH model and the 8th edition AJCC TNM staging in the test set, showing potential in clinical practice.
Funding: Supported by the projects of the China Geological Survey (DD20221729, DD20190291) and the Zhuhai Urban Geological Survey (including informatization) (MZCD–2201–008).
Abstract: Machine learning is currently one of the research hotspots in the field of landslide prediction. To clarify and evaluate the differences in characteristics and prediction effects of different machine learning models, Conghua District, which is the most prone to landslide disasters in Guangzhou, was selected for landslide susceptibility evaluation. The evaluation factors were selected by using correlation analysis and the variance inflation factor method. Landslide models were constructed by applying four machine learning methods, namely Logistic Regression (LR), Random Forest (RF), Support Vector Machines (SVM), and Extreme Gradient Boosting (XGB). Comparative analysis and evaluation of the models were conducted through statistical indices and receiver operating characteristic (ROC) curves. The results showed that the LR, RF, SVM, and XGB models have good predictive performance for landslide susceptibility, with area under curve (AUC) values of 0.752, 0.965, 0.996, and 0.998, respectively. XGB model had the highest predictive ability, followed by RF model, SVM model, and LR model. The frequency ratio (FR) accuracy of the LR, RF, SVM, and XGB models was 0.775, 0.842, 0.759, and 0.822, respectively. The RF and XGB models were superior to the LR and SVM models, indicating that ensemble algorithms have better predictive ability than a single classification algorithm in regional landslide classification problems.
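The AUC values reported in abstracts like this one summarize how well a model's scores separate positive from negative cases. A minimal sketch of one standard way to compute AUC is given below: it is the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as half. The function name and the scores are hypothetical, and the study itself may have used a different routine.

```python
def auc_from_scores(scores_pos, scores_neg):
    """AUC as the probability that a positive case outscores a negative one.

    Ties contribute 0.5, matching the rank-based (Mann-Whitney) form of AUC.
    """
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical susceptibility scores for landslide and non-landslide cells.
landslide_scores = [0.9, 0.3]
stable_scores = [0.5, 0.1]
print(auc_from_scores(landslide_scores, stable_scores))  # 0.75
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; sorting-based implementations are preferable for large rasters.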
Abstract: We estimate tree heights using polarimetric interferometric synthetic aperture radar (PolInSAR) data constructed from dual-polarization (dual-pol) SAR data and the random volume over the ground (RVoG) model. Considering the Sentinel-1 SAR dual-pol (S_(VV), vertically transmitted and vertically received, and S_(VH), vertically transmitted and horizontally received) configuration, one notes that S_(HH), the horizontally transmitted and horizontally received scattering element, is unavailable. The S_(HH) data were constructed using the S_(VH) data, and polarimetric SAR (PolSAR) data were obtained. The proposed approach was first verified in simulation with satisfactory results. It was next applied to construct PolInSAR data from a pair of dual-pol Sentinel-1A data at Duke Forest, North Carolina, USA. According to local observations and forest descriptions, the range of estimated tree heights was overall reasonable. Comparing the heights with the ICESat-2 tree heights at 23 sampling locations, relative errors of 5 points were within ±30%. Errors of 8 points ranged from 30% to 40%, but errors of the remaining 10 points were >40%. The results are encouraging, as error reduction is possible. For instance, the construction of PolSAR data should not be limited to using S_(VH), and a combination of S_(VH) and S_(VV) should be explored. Also, an ensemble of tree heights derived from multiple PolInSAR data can be considered, since tree heights do not vary much over a time frame of months or one season.
Abstract: Tree height (H) in a natural stand or forest plantation is a fundamental variable in management, and the use of mathematical expressions that estimate H as a function of diameter at breast height (d) or variables at the stand level is a valuable support tool in forest inventories. The objective was to fit and propose a generalized H-d model for Pinus montezumae and Pinus pseudostrobus established in forest plantations of Nuevo San Juan Parangaricutiro, Michoacan, Mexico. Using nonlinear least squares (NLS), 10 generalized H-d models were fitted to 883 and 1226 pairs of H-d data from Pinus montezumae and Pinus pseudostrobus, respectively. The best model was refitted with the maximum likelihood mixed effects model (MEM) approach by including the site as a classification variable and a known variance structure. The Wang and Tang equation was selected as the best model with NLS; the MEM with an additive effect on two of its parameters and an exponential variance function improved the fit statistics for Pinus montezumae and Pinus pseudostrobus, respectively. The model validation showed equality of means among the estimates for both species and an independent subsample. The calibration of the MEM at the plot level was efficient and might increase the applicability of these results. The inclusion of dominant height in the MEM approach helped to reduce bias in the estimates and also to better explain the variability among plots.
Abstract: The use of hidden conditional random fields (HCRFs) for tone modeling is explored. The tone recognition performance is improved using HCRFs by taking advantage of intra-syllable dynamic, inter-syllable dynamic, and duration features. When the tone model is integrated into continuous speech recognition, discriminative model weight training (DMWT) is proposed. Acoustic and tone scores are scaled by model weights discriminatively trained by the minimum phone error (MPE) criterion. Two schemes of weight training are evaluated, and a smoothing technique is used to make training robust to overtraining. Experiments show that the accuracies of tone recognition and large vocabulary continuous speech recognition (LVCSR) can be improved by the HCRF-based tone model. Compared with the global weight scheme, continuous speech recognition can be improved by the discriminatively trained weight combinations.
Funding: Supported by the NSFC (11171101), the Doctoral Fund of Education Ministry of China (20104306110001), and the Graduate Research and Innovation Fund of Hunan Province (CX2011B197).
Abstract: In this paper, a compound binomial model with a constant dividend barrier and random income is considered. Two types of individual claims, main claims and by-claims, are defined, where every by-claim is induced by the main claim and may be delayed for one time period with a certain probability. The premium income is assumed to follow another binomial process to capture the uncertainty of the customers' arrivals and payments. A system of difference equations with certain boundary conditions for the expected present value of total dividend payments prior to ruin is derived and solved. Explicit results are obtained when the claim sizes are K_(n) distributed or the claim size distributions have finite support. Numerical results are also provided to illustrate the impact of the delay of by-claims on the expected present value of dividends.
Funding: This work was supported in part by the National Natural Science Foundation of China (61601418, 41602362, 61871259), in part by the Opening Foundation of Hunan Engineering and Research Center of Natural Resource Investigation and Monitoring (2020-5), in part by the Qilian Mountain National Park Research Center (Qinghai) (grant number: GKQ2019-01), and in part by the Geomatics Technology and Application Key Laboratory of Qinghai Province, Grant No. QHDX-2019-01.
Abstract: This work aimed to generate landslide susceptibility maps for the Three Gorges Reservoir (TGR) area, China, by using different machine learning models. Three advanced machine learning methods, namely, gradient boosting decision tree (GBDT), random forest (RF), and information value (InV) models, were used, and their performances were assessed and compared. In total, 202 landslides were mapped by using a series of field surveys, aerial photographs, and reviews of historical and bibliographical data. Nine causative factors were then considered in landslide susceptibility map generation by using the GBDT, RF, and InV models. All of the maps of the causative factors were resampled to a resolution of 28.5 m. Of the 486289 pixels in the area, 28526 pixels were landslide pixels, and 457763 pixels were non-landslide pixels. Finally, landslide susceptibility maps were generated by using the three machine learning models, and their performances were assessed through receiver operating characteristic (ROC) curves, the sensitivity, specificity, overall accuracy (OA), and kappa coefficient (KAPPA). The results showed that the GBDT, RF, and InV models overall produced reasonably accurate landslide susceptibility maps. Among these three methods, the GBDT method outperforms the other two machine learning methods and can provide strong technical support for producing landslide susceptibility maps in the TGR.
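The sensitivity, specificity, overall accuracy, and kappa coefficient named in this abstract can all be derived from a two-class confusion matrix. The sketch below uses the standard definitions (kappa in Cohen's form); the function name and the counts are hypothetical and are not the study's results.

```python
def binary_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, overall accuracy, and Cohen's kappa.

    tp/fn: landslide pixels predicted as landslide / non-landslide.
    fp/tn: non-landslide pixels predicted as landslide / non-landslide.
    """
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    overall_accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (overall_accuracy - expected) / (1.0 - expected)
    return sensitivity, specificity, overall_accuracy, kappa

# Hypothetical counts, for illustration only.
sens, spec, oa, kappa = binary_metrics(tp=40, fn=10, fp=5, tn=45)
print(sens, spec, oa, kappa)
```

Kappa is worth reporting alongside overall accuracy when classes are imbalanced, as they are here with far fewer landslide than non-landslide pixels, because it discounts the agreement expected by chance.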
Abstract: Dual random models of life insurance and social pension insurance have received considerable attention in recent articles on actuarial theory and applications. This paper discusses a general kind of increasing annuity whose force of interest accumulation function is a general random process. The dual random model of the present value of the benefits of the increasing annuity is established, and its moments are calculated under certain conditions.
Funding: The research project was supported by NSFC (19631040) and NSFJ.
Abstract: In this paper, a unified diagnostic method for nonlinear models with random effects, based upon the joint likelihood given by Robinson in 1991, is presented. It is shown that the case deletion model is equivalent to the mean shift outlier model. From this point of view, several diagnostic measures, such as the Cook distance and score statistics, are derived. The local influence measure of Cook is also presented. A numerical example illustrates that the method is applicable.
Funding: The project is partly supported by NSFC (19971085), the Doctoral Program Foundation of the Institute of High Education, and the Special Foundation of Chinese Academy of Sciences.
Abstract: The Bayes decision rule for the variance components of the one-way random effects model is derived, and empirical Bayes (EB) decision rules are constructed by the kernel estimation method. Under suitable conditions, it is shown that the proposed EB decision rules are asymptotically optimal with convergence rates near O(n^(-1/2)). Finally, an example concerning the main result is given.
Abstract: Based on the theoretical solutions for nonlinear three-dimensional gravity surface waves and their interactions with a vertical wall previously proposed by the lead author, this paper formulates an exact second-order random model of the unified wave motion process for nonlinear irregular waves and their interactions with a vertical wall in a uniform current, derives the corresponding theoretical nonlinear spectrum, and gives a digital simulation model suited to the FFT (Fast Fourier Transform) algorithm. Simulations of wave surface, wave pressure, total wave pressure, and its moment are performed. The probability properties and statistical characteristics of these realizations are tested, including verification of normality for the linear process and of non-normality for the nonlinear process; the consistency of the theoretical spectra with the simulated ones; and the probability properties of apparent characteristics, such as amplitudes, periods, and extremes (maximum and minimum, positive and negative extremes). The statistical analysis and comparisons demonstrate that the proposed theoretical and computing models are realistic and effective, that the estimated spectra are in good agreement with the theoretical ones, and that the probability properties of the simulated waves are similar to those of sea waves. At the same time, the simulation can be completed rapidly and easily.