Malware is an ever-present and dynamic threat to networks and computer systems, and because of its complexity and evasiveness, it is challenging to identify using traditional signature-based detection approaches. The study discusses the growing danger to cybersecurity posed by malware hidden in PDF files, highlighting the shortcomings of conventional detection techniques and the difficulties presented by adversarial methodologies. The article presents a new method that improves PDF virus detection by using document analysis and a Logistic Model Tree. Using a dataset from the Canadian Institute for Cybersecurity, a comparative analysis is carried out with well-known machine learning models, such as Credal Decision Tree, Naïve Bayes, Average One Dependency Estimator, Locally Weighted Learning, and Stochastic Gradient Descent. Beyond traditional structural and JavaScript-centric PDF analysis, the research contributes to the area by boosting precision and resilience in malware detection. The use of Logistic Model Tree, a thorough feature selection approach, and increased focus on PDF file attributes all contribute to the efficiency of PDF virus detection. The paper emphasizes Logistic Model Tree's critical role in tackling increasing cybersecurity threats and proposes a viable answer to practical issues in the sector. The results reveal that the Logistic Model Tree is superior, with an improved accuracy of 97.46% when compared to benchmark models, demonstrating its usefulness in addressing the ever-changing threat landscape.
The North China Plain and the agricultural region are crossed by the Shanxi-Beijing natural gas pipeline. Residents in the area use rototillers for planting and harvesting; however, the depth to which the rototillers penetrate the ground is greater than the burial depth of the pipeline, posing a significant threat to the safe operation of the pipeline. Therefore, it is of great significance to study the dynamic response of pipelines impacted by rotary tillers to ensure their safe operation. This article focuses on the Shanxi-Beijing natural gas pipeline, utilizing finite element simulation software to establish a finite element model for the interaction among the machinery, pipeline, and soil, and analyzing the dynamic response of the pipeline. At the same time, a decision tree model is introduced to classify the damage of pipelines under different working conditions, and the boundary value and importance of each influencing factor on pipeline damage are derived. Considering the actual conditions in the hemp yam planting area, targeted management measures have been proposed to ensure the operational safety of the Shanxi-Beijing natural gas pipeline in this region.
BACKGROUND: Development of distant metastasis (DM) is a major concern during treatment of nasopharyngeal carcinoma (NPC). However, studies have demonstrated improved distant control and survival in patients with advanced NPC with the addition of chemotherapy to concomitant chemoradiotherapy. Therefore, precise prediction of metastasis in patients with NPC is crucial. AIM: To develop a predictive model for metastasis in NPC using detailed magnetic resonance imaging (MRI) reports. METHODS: This retrospective study included 792 patients with non-distant metastatic NPC. A total of 469 imaging variables were obtained from detailed MRI reports. Data were stratified and randomly split into training (50%) and testing sets. Gradient boosting tree (GBT) models were built and used to select variables for predicting DM. A full model comprising all variables and a reduced model with the top-five variables were built. Model performance was assessed by area under the curve (AUC). RESULTS: Among the 792 patients, 94 developed DM during follow-up. The number of metastatic cervical nodes (30.9%), tumor invasion in the posterior half of the nasal cavity (9.7%), two sides of the pharyngeal recess (6.2%), tubal torus (3.3%), and single side of the parapharyngeal space (2.7%) were the top-five contributors for predicting DM, based on their relative importance in GBT models. The testing AUC of the full model was 0.75 (95% confidence interval [CI]: 0.69-0.82). The testing AUC of the reduced model was 0.75 (95% CI: 0.68-0.82). For the whole dataset, the full (AUC = 0.76, 95% CI: 0.72-0.82) and reduced models (AUC = 0.76, 95% CI: 0.71-0.81) outperformed the tumor node-staging system (AUC = 0.67, 95% CI: 0.61-0.73). CONCLUSION: The GBT model outperformed the tumor node-staging system in predicting metastasis in NPC. The number of metastatic cervical nodes was identified as the principal contributing variable.
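The abstract above compares models by area under the ROC curve (AUC). As a minimal sketch of what that metric measures, the function below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auc` and the toy scores are illustrative assumptions and are not taken from the study.

```python
def auc(scores, labels):
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    scores: predicted risk scores; labels: 1 for positive, 0 for negative.
    Hypothetical helper for illustration, not the study's code.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data: three of the four positive/negative pairs are ordered correctly.
example = auc([0.8, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; a rank-based implementation would be preferable for large datasets.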
Objective: The progression of human cancer is characterized by the accumulation of genetic instability. An increasing number of experimental genetic molecular techniques have been used to detect chromosome aberrations. Previous studies on chromosome abnormalities often focused on identifying the frequent loci of chromosome alterations, but rarely addressed the issue of interrelationship of chromosomal abnormalities. In the last few years, several mathematical models have been employed to construct models of carcinogenesis, in an attempt to identify the time order and cause-and-effect relationship of chromosome aberrations. The principles and applications of these models are reviewed and compared in this paper. Mathematical modeling of carcinogenesis can contribute to our understanding of the molecular genetics of tumor development, and identification of cancer related genes, thus leading to improved clinical practice of cancer.
With the increasing availability of precipitation radar data from space, enhancement of the resolution of spaceborne precipitation observations is important, particularly for hazard prediction and climate modeling at local scales relevant to extreme precipitation intensities and gradients. In this paper, the statistical characteristics of radar precipitation reflectivity data are studied and modeled using a hidden Markov tree (HMT) in the wavelet domain. Then, a high-resolution interpolation algorithm is proposed for spaceborne radar reflectivity using the HMT model as prior information. Owing to the small and transient storm elements embedded in the larger and slowly varying elements, the radar precipitation data exhibit distinct multiscale statistical properties, including a non-Gaussian structure and scale-to-scale dependency. An HMT model can capture the statistical properties of radar precipitation well: the wavelet coefficients in each sub-band are characterized as a Gaussian mixture model (GMM), and the wavelet coefficients from the coarse scale to the fine scale are described using a multiscale Markov process. The state probabilities of the GMM are determined using the expectation maximization method, and other parameters, for instance the variance decay parameters in the HMT model, are learned and estimated from high-resolution ground radar reflectivity images. Using the prior model, the wavelet coefficients at finer scales are estimated using local Wiener filtering. The interpolation algorithm is validated using data from the precipitation radar onboard the Tropical Rainfall Measuring Mission satellite, and the reconstructed results are found to enhance the spatial resolution while optimally reproducing the local extremes and gradients.
Tree mortality plays a fundamental role in the dynamics of forest ecosystems, yet it is one of the most difficult phenomena to accurately predict. Various modeling strategies have been developed to improve individual tree mortality predictions. One less explored strategy is the use of a multistage modeling approach. Potential improvements from this approach have remained largely unknown. In this study, we developed a novel multistage approach and compared its performance in individual tree mortality predictions with a more conventional approach using an identical individual tree mortality model formulation. Extensive permanent plot data (n = 9442) covering the Acadian Region of North America and over multiple decades (1965–2014) were used in this study. Our results indicated that the model behavior with the multistage approach better depicted the observed mortality and showed a notable improvement over the conventional approach. The difference between the observed and predicted numbers of dead trees using the multistage approach was much smaller when compared with the conventional approach. In addition, tree survival probabilities predicted by the multistage approach generally were not significantly different from the observations, whereas the conventional approach consistently underestimated mortality across species and overestimated tree survival probabilities over the large range of DBH in the data. The new multistage approach also allowed predictions of zero mortality in individual plots, a result not possible in conventional models. Finally, the new approach was more tolerant of modeling errors because it based estimates on ranked tree mortality rather than error-prone predicted values. Overall, this new multistage approach deserves to be considered and tested in future studies.
Digital aerial photograph (DAP) data are processed with the Structure from Motion (SfM) algorithm and the regional net adjustment method to generate digital surface discrete point clouds similar to Light Detection and Ranging (LiDAR) and digital orthophoto mosaics (DOM) similar to optical remote sensing images. In this study, we obtained high-resolution images of mature Chinese fir forests with an unmanned aerial vehicle (UAV) flying cross-route flights, and then reconstructed the three-dimensional point clouds in the UAV aerial area by the SfM technique. The point cloud segmentation (PCS) algorithm was used for individual tree segmentation, and the F-scores of the three sample plots were 0.91, 0.94, and 0.94, respectively. Individual tree biomass modeling was conducted using 155 mature Chinese fir trees that were correctly segmented. The relative root mean squared error (rRMSE) values of random forest (RF), bagged tree (BT) and support vector regression (SVR) were 34.48%, 35.74% and 40.93%, respectively. Our study demonstrated that DAP point clouds had great potential to extract forest vertical parameters and could be applied successfully in individual tree segmentation and individual tree biomass modeling.
The ongoing effort to create methods for detecting and quantifying fatigue damage is motivated by the high levels of uncertainty in present fatigue-life prediction approaches and the frequently catastrophic nature of fatigue failure. The fatigue life of high strength aluminum alloy 2090-T83 is predicted in this study using a variety of artificial intelligence and machine learning techniques for constant amplitude and negative stress ratios (R?1). Artificial neural networks (ANN), adaptive neuro-fuzzy inference systems (ANFIS), support-vector machines (SVM), a random forest model (RF), and an extreme-gradient tree-boosting model (XGB) are trained using numerical and experimental input data obtained from fatigue tests based on a relatively low number of stress measurements. In particular, the coefficients of the traditional force law formula are found using relevant numerical methods. It is shown that, in comparison to traditional approaches, the neural network and neuro-fuzzy models produce better results, with the neural network models trained using the boosting iterations technique providing the best performances. Building strong models from weak models, XGB helps to predict fatigue life by reducing model partiality and variation in supervised learning. Fuzzy neural models can be used to predict the fatigue life of alloys more accurately than neural networks and traditional methods.
The dead fuel moisture content (DFMC) is the key driver leading to fire occurrence. Accurately estimating the DFMC could help identify locations facing fire risks, prioritise areas for fire monitoring, and facilitate timely deployment of fire-suppression resources. In this study, the DFMC and environmental variables, including air temperature, relative humidity, wind speed, solar radiation, rainfall, atmospheric pressure, soil temperature, and soil humidity, were simultaneously measured in a grassland of Ergun City, Inner Mongolia Autonomous Region of China in 2021. We chose three regression models, i.e., random forest (RF) model, extreme gradient boosting (XGB) model, and boosted regression tree (BRT) model, to model the seasonal DFMC according to the data collected. To ensure accuracy, we added time-lag variables of 3 d to the models. The results showed that the RF model had the best fitting effect with an R² value of 0.847 and a prediction accuracy with a mean absolute error score of 4.764% among the three models. The accuracies of the models in spring and autumn were higher than those in the other two seasons. In addition, different seasons had different key influencing factors, and the degree of influence of these factors on the DFMC changed with time lags. Moreover, time-lag variables within 44 h clearly improved the fitting effect and prediction accuracy, indicating that environmental conditions within approximately 48 h greatly influence the DFMC. This study highlights the importance of considering 48 h time-lagged variables when predicting the DFMC of grassland fuels and mapping grassland fire risks based on the DFMC to help locate high-priority areas for grassland fire monitoring and prevention.
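The abstract above feeds time-lagged environmental variables to its regression models. As a rough sketch of how such lagged predictors can be assembled from a single measured series, the helper below pairs each observation with the values recorded a given number of steps earlier. The function name `add_lag_features`, the lag values, and the toy humidity readings are assumptions for illustration; the study's actual sampling interval and lag set are not given here.

```python
def add_lag_features(series, lags):
    """Build rows of [value_at_t, value_at_t-lag1, value_at_t-lag2, ...].

    Rows start at t = max(lags) so that every lagged value exists.
    Hypothetical helper for illustration, not the study's code.
    """
    if not lags or min(lags) < 1:
        raise ValueError("lags must be positive integers")
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        rows.append([series[t]] + [series[t - k] for k in lags])
    return rows


# Toy hourly relative-humidity readings with 1-step and 2-step lags.
humidity = [55, 60, 62, 58, 50]
lagged = add_lag_features(humidity, [1, 2])
# Each row: [current, one step earlier, two steps earlier]
```

Rows built this way can be joined column-wise across variables before fitting a tree-based regressor; the first `max(lags)` observations are dropped because their history is incomplete.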
Gully erosion is a disruptive phenomenon which extensively affects the Iranian territory, especially in the Northern provinces. A number of studies have been recently undertaken to study this process and to predict it over space and ultimately, in a broader national effort, to limit its negative effects on local communities. We focused on the Bastam watershed, where 9.3% of the surface is currently affected by gullying. Machine learning algorithms are currently under the magnifying glass across the geomorphological community for their high predictive ability. However, unlike the bivariate statistical models, their structure does not provide intuitive and quantifiable measures of environmental preconditioning factors. To cope with this weakness, we interpret preconditioning causes on the basis of a bivariate approach, namely the Index of Entropy, and we performed the susceptibility mapping procedure by testing three extensions of a decision tree model, namely Alternating Decision Tree (ADTree), Naive-Bayes tree (NBTree), and Logistic Model Tree (LMT). We dichotomized the gully information over space into gully presence/absence conditions, which we further explored in their calibration and validation stages. Because the presence/absence information and associated factors are identical, the resulting differences are only due to the algorithmic structures of the three models we chose. Such differences are not significant in terms of performance; in fact, the three models produce outstanding predictive AUC measures (ADTree = 0.922; NBTree = 0.939; LMT = 0.944). However, the associated mapping results depict very different patterns, where only the LMT is associated with reasonable susceptibility patterns. This is a strong indication of which model best combines performance and mapping for any natural hazard-oriented application.
Creating realistic 3D tree models in a convenient way is a challenge in game design and movie making due to diversification and occlusion of tree structures. Current sketch-based and image-based approaches for fast tree modeling have limitations in effect and speed, and they generally include complex parameter adjustment, which brings difficulties to novices. In this paper, we present a simple method for quickly generating various 3D tree models from freehand sketches without parameter adjustment. On two input images, the user draws strokes representing the main branches and crown silhouettes of a tree. The system automatically produces a 3D tree at high speed. First, two 2D skeletons are built from strokes, and a 3D tree structure resembling the input sketches is built by branch retrieval from the 2D skeletons. Small branches are generated within the sketched 2D crown silhouettes based on self-similarity and angle restriction. This system is demonstrated on a variety of examples. It maintains the main features of a tree: the main branch structure and crown shape, and can be used as a convenient tool for tree simulation and design.
It is important for regional water resources management to know the agricultural water consumption information several months in advance. Forecasting reference evapotranspiration (ET_0) in the next few months is important for irrigation and reservoir management. Studies on forecasting of multiple-month ahead ET_0 using machine learning models have not been reported yet. Besides, machine learning models such as the XGBoost model have multiple parameters that need to be tuned, and traditional methods can get stuck in a local optimum and fail to obtain a global optimal solution. This study investigated the performance of the hybrid extreme gradient boosting (XGBoost) model coupled with the Grey Wolf Optimizer (GWO) algorithm for forecasting multi-step ahead ET_0 (1-3 months ahead), compared with three conventional machine learning models, i.e., standalone XGBoost, multi-layer perceptron (MLP) and M5 model tree (M5) models in the subtropical zone of China. The results showed that the GWO-XGB model generally performed better than the other three machine learning models in forecasting 1-3 months ahead ET_0, followed by the XGB, M5 and MLP models with very small differences among the three models. The GWO-XGB model performed best in autumn, while the MLP model performed slightly better than the other three models in summer. It is thus suggested to apply the MLP model for ET_0 forecasting in summer but use the GWO-XGB model in other seasons.
A tyre pressure monitoring system (TPMS) is compulsory in most countries, such as the United States and those of the European Union. The existing systems depend on pressure sensors strapped on the tyre or on wheel speed sensor data. A difference in wheel speed would trigger an alarm based on the algorithm implemented. In this paper, a machine learning approach is proposed as a new method to monitor tyre pressure by extracting the vertical vibrations from a wheel hub of a moving vehicle using an accelerometer. The obtained signals are used to compute statistical features and histogram features in the feature extraction process. The LMT (Logistic Model Tree) was used as the classifier and attained a classification accuracy of 92.5% with 10-fold cross validation for statistical features and 90.5% with 10-fold cross validation for histogram features. The proposed model can be used for monitoring the automobile tyre pressure successfully.
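The abstract above extracts statistical features and histogram features from vibration signals before classification. The sketch below shows one plausible form of each: a handful of descriptive statistics and fixed-width bin counts. The particular statistics, the bin layout, and the names `statistical_features` and `histogram_features` are assumptions for illustration; the paper's exact feature set is not stated in the abstract.

```python
import statistics


def statistical_features(signal):
    """Descriptive statistics of a vibration signal (population moments).

    Hypothetical feature set for illustration only.
    """
    n = len(signal)
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)
    if std == 0:
        raise ValueError("constant signal: skewness and kurtosis undefined")
    skewness = sum((x - mean) ** 3 for x in signal) / n / std ** 3
    kurtosis = sum((x - mean) ** 4 for x in signal) / n / std ** 4
    return {
        "mean": mean,
        "std": std,
        "min": min(signal),
        "max": max(signal),
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


def histogram_features(signal, bins, lo, hi):
    """Counts of samples in `bins` equal-width bins spanning [lo, hi].

    Samples outside the range are ignored; a sample equal to `hi`
    falls into the last bin.
    """
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in signal:
        if lo <= x <= hi:
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
    return counts


# Toy accelerometer samples (arbitrary units).
stats = statistical_features([1, 2, 3, 4, 5])
hist = histogram_features([0.0, 0.5, 1.0, 1.5, 2.0], bins=2, lo=0.0, hi=2.0)
```

Either feature vector could then be passed to a classifier; the bin range should be fixed across all signals so that histogram features remain comparable between samples.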
With the rapid development and widespread application of Wireless Body Area Networks (WBANs), the traditional centralized system architecture cannot handle the massive data generated by the edge devices. Meanwhile, in order to ensure the security of physiological privacy data and the identity privacy of patients, this paper presents a privacy protection strategy for Mobile Edge Computing (MEC) enhanced WBANs, which leverages the blockchain-based decentralized MEC paradigm to support efficient transmission of privacy information with low latency and high reliability within a high-demand data security scenario. On this basis, the Merkle tree optimization model is designed to authenticate nodes and to verify the source of physiological data. Furthermore, a hybrid signature algorithm is devised to guarantee node anonymity with unforgeability, data integrity and reduced delay. The security performance analysis and simulation results show that our proposed strategy not only reduces the delay, but also secures the privacy and transmission of sensitive WBANs data.
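The abstract above relies on a Merkle tree to verify the source of physiological data. The sketch below illustrates only the generic Merkle tree idea: hashing data blocks into a single root and checking one block against that root with a short proof. It is not the paper's optimization model; the choice of SHA-256, the duplication of the last node on odd-sized levels, and all function names are assumptions for illustration.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """SHA-256 digest (hash function chosen for illustration)."""
    return hashlib.sha256(data).digest()


def _next_level(level):
    """Hash adjacent pairs; an odd-sized level duplicates its last node."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Root hash over a non-empty list of byte-string data blocks."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [_h(x) for x in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes from leaf to root as (hash, sibling_is_left) pairs."""
    level = [_h(x) for x in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1                      # partner within the pair
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from one leaf and its proof."""
    node = _h(leaf)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root


# Toy sensor readings standing in for physiological data blocks.
readings = [b"hr=72", b"spo2=98", b"temp=36.6"]
root = merkle_root(readings)
ok = verify(readings[2], merkle_proof(readings, 2), root)
```

A verifier holding only the root can check any single reading with a proof whose length grows logarithmically with the number of blocks, which is the property that makes the structure attractive for resource-constrained edge devices.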
This study was conducted to evaluate the performance of six stem taper models on four tropical tree species, namely Celtis luzonica (Magabuyo), Diplodiscus paniculatus (Balobo), Parashorea malaanonan (Bagtikan), and Swietenia macrophylla (Mahogany) in Mount Makiling Forest Reserve (MMFR), Philippines, using fit statistics and lack-of-fit statistics. Four statistical criteria were used in this study, including the standard error of estimate (SEE), coefficient of determination (R^2), mean bias (E), and absolute mean difference (AMD). For the lack-of-fit statistics, SEE, E and AMD were determined in different relative height classes. The results indicated that the Kozak02 stem taper model offered the best fit for the four tropical species in most statistics. The Kozak02 model also consistently provided the best performance in the lack-of-fit statistics, with the best SEE, E and AMD in most of the relative height classes. These stem taper equations could help forest managers and researchers better estimate the outside-bark diameter at any given height, merchantable stem volumes and total stem volumes of standing trees belonging to the four species of the tropical forest in MMFR.
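The abstract above ranks models by SEE, R^2, mean bias (E) and AMD. As a minimal sketch, the function below computes these four criteria under commonly used definitions: SEE as the square root of the residual sum of squares divided by n minus the number of model parameters, R^2 as one minus the ratio of residual to total sum of squares, E as the mean of observed minus predicted, and AMD as the mean absolute residual. These definitions, the function name `fit_statistics`, and the toy diameters are assumptions; the study may define the criteria differently.

```python
import math


def fit_statistics(observed, predicted, n_params):
    """SEE, R^2, mean bias and absolute mean difference.

    Definitions are common conventions assumed for illustration,
    not confirmed by the abstract.
    """
    n = len(observed)
    if n != len(predicted) or n <= n_params:
        raise ValueError("need matching lengths and n > n_params")
    residuals = [o - p for o, p in zip(observed, predicted)]
    sse = sum(r * r for r in residuals)                 # residual sum of squares
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)    # total sum of squares
    return {
        "SEE": math.sqrt(sse / (n - n_params)),
        "R2": 1.0 - sse / sst,
        "E": sum(residuals) / n,
        "AMD": sum(abs(r) for r in residuals) / n,
    }


# Toy stem diameters (cm): observed vs. predicted by a 2-parameter model.
result = fit_statistics([10, 12, 14, 16], [11, 11, 15, 15], n_params=2)
```

For the lack-of-fit analysis described in the abstract, the same function could be applied separately to the observations falling in each relative height class.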
BACKGROUND: With the aging world population, the incidence of falls has intensified and fall-related hospitalization costs are increasing. Falls are one type of event studied in the health economics of patient safety, and many developed countries have conducted such research on fall-related hospitalization costs. However, China, a developing country, still lacks large-scale studies in this area. AIM: To investigate the factors related to the hospitalization costs of fall-related injuries in elderly inpatients and establish factor-based, cost-related groupings. METHODS: A retrospective study was conducted. Patient information and cost data for elderly inpatients (age ≥ 60 years, n = 3362) who were hospitalized between 2016 and 2019 due to falls were collected from the medical record systems of two grade-A tertiary hospitals in China. Quantile regression (QR) analysis was used to identify the factors related to fall-related hospitalization costs. A decision tree model based on the chi-squared automatic interaction detector algorithm for hospitalization cost grouping was built by setting the factors in the regression results as separation nodes. RESULTS: The total hospitalization cost of fall-related injuries in the included elderly patients was 180479203.03 RMB, and the reimbursement rate of medical benefit funds was 51.0% (92039709.52 RMB/180479203.03 RMB). The medical material costs were the highest component of the total hospitalization cost, followed (in order) by drug costs, test costs, treatment costs, integrated medical service costs and blood transfusion costs. The QR results showed that patient age, gender, length of hospital stay, payment method, wound position, wound type, operation times and operation type significantly influenced the inpatient cost (P < 0.05). The cost grouping model was established based on the QR results, and age, length of stay, operation type, wound position and wound type were the most important influencing factors in the model. Furthermore, the cost of each combination varied significantly. CONCLUSION: Our grouping model of hospitalization costs clearly reflected the key factors affecting hospitalization costs and can be used to strengthen the reasonable control of these costs.
In the design process of berm breakwaters, front slope recession plays an unavoidable role in a large number of model tests, and this parameter is studied here. This research draws its data from Moghim's and Shekari's experimental results. These experiments consist of two different 2D model tests in two wave flumes, in which the response of berm recession to different sea state and structural parameters was studied. Irregular waves with a JONSWAP spectrum were used in both test series. A total of 412 test results were used to cover the impact of sea state conditions such as wave height, wave period, storm duration and water depth at the toe of the structure, and structural parameters such as berm elevation from still water level, berm width and stone diameter on berm recession parameters. In this paper, a new set of equations for berm recession is derived using the M5' model tree as a machine learning approach. A comparison is made between the estimations by the new formula and the formulae recently given by other researchers to show the advantage of the new M5' approach.
AIM: To construct tree models for classification of diffuse large B-cell lymphomas (DLBCL) by chromosome copy numbers, to compare them with cDNA microarray classification, and to explore models of multi-gene, multi-step and multi-pathway processes of DLBCL tumorigenesis. METHODS: Maximum-weight branching and distance-based models were constructed based on the comparative genomic hybridization (CGH) data of 123 DLBCL samples using the established methods and software of Desper et al. A maximum likelihood tree model was also used to analyze the data. By comparing with the results reported in the literature, the value of tree models in the classification of DLBCL was elucidated. RESULTS: Both the branching and the distance-based trees classified DLBCL into three groups. We combined the classification methods of the two models and classified DLBCL into three categories according to their characteristics. The first group was marked by +Xq, +Xp, -17p and +13q; the second group by +3q, +18q and +18p; and the third group was marked by -6q and +6p. This chromosomal classification was consistent with cDNA classification. It indicated that -6q and +3q were two main events in the tumorigenesis of lymphoma. CONCLUSION: Tree models of lymphoma established from CGH data can be used in the classification of DLBCL. These models can suggest multi-gene, multi-step and multi-pathway processes of tumorigenesis. Two pathways, -6q preceding +6q and +3q preceding +18q, may be important in understanding tumorigenesis of DLBCL. The pathway, -6q preceding +6q, may have a close relationship with the tumorigenesis of non-GCB DLBCL.
Using Bayesian networks to model promising solutions from the current population of an evolutionary algorithm can ensure an efficient and intelligent search for the optimum. However, constructing a Bayesian network that fits a given dataset is an NP-hard problem, and it also consumes substantial computational resources. This paper develops a methodology for constructing a graphical model based on the Bayesian Dirichlet metric. Our approach is derived from a set of propositions and theorems obtained by studying the local metric relationship of networks matching the dataset. This paper presents an algorithm to construct a tree model from a set of potential solutions using the above approach. This method is important not only for evolutionary algorithms based on graphical models, but also for machine learning and data mining. The experimental results show that the exact theoretical results and the approximations match very well.
Funding: This research work was funded by Institutional Fund Projects under Grant No. (IFPIP:211-611-1443).
文摘Malware is an ever-present and dynamic threat to networks and computer systems in cybersecurity,and because of its complexity and evasiveness,it is challenging to identify using traditional signature-based detection approaches.The study article discusses the growing danger to cybersecurity that malware hidden in PDF files poses,highlighting the shortcomings of conventional detection techniques and the difficulties presented by adversarial methodologies.The article presents a new method that improves PDF virus detection by using document analysis and a Logistic Model Tree.Using a dataset from the Canadian Institute for Cybersecurity,a comparative analysis is carried out with well-known machine learning models,such as Credal Decision Tree,Naïve Bayes,Average One Dependency Estimator,Locally Weighted Learning,and Stochastic Gradient Descent.Beyond traditional structural and JavaScript-centric PDF analysis,the research makes a substantial contribution to the area by boosting precision and resilience in malware detection.The use of Logistic Model Tree,a thorough feature selection approach,and increased focus on PDF file attributes all contribute to the efficiency of PDF virus detection.The paper emphasizes Logistic Model Tree’s critical role in tackling increasing cybersecurity threats and proposes a viable answer to practical issues in the sector.The results reveal that the Logistic Model Tree is superior,with improved accuracy of 97.46%when compared to benchmark models,demonstrating its usefulness in addressing the ever-changing threat landscape.
Abstract: The Shanxi-Beijing natural gas pipeline crosses the North China Plain and the agricultural region. Residents in the area use rototillers for planting and harvesting; however, the tilling depth of the rototillers is greater than the burial depth of the pipeline, posing a significant threat to the safe operation of the pipeline. Therefore, it is of great significance to study the dynamic response of pipelines impacted by rotary tillers in order to ensure their safe operation. This article focuses on the Shanxi-Beijing natural gas pipeline, utilizing finite element simulation software to establish a finite element model for the interaction among the machinery, pipeline, and soil, and analyzing the dynamic response of the pipeline. At the same time, a decision tree model is introduced to classify the damage of pipelines under different working conditions, and the boundary value and importance of each factor influencing pipeline damage are derived. Considering the actual conditions in the hemp yam planting area, targeted management measures are proposed to ensure the operational safety of the Shanxi-Beijing natural gas pipeline in this region.
Abstract: BACKGROUND: Development of distant metastasis (DM) is a major concern during treatment of nasopharyngeal carcinoma (NPC). However, studies have demonstrated improved distant control and survival in patients with advanced NPC with the addition of chemotherapy to concomitant chemoradiotherapy. Therefore, precise prediction of metastasis in patients with NPC is crucial. AIM: To develop a predictive model for metastasis in NPC using detailed magnetic resonance imaging (MRI) reports. METHODS: This retrospective study included 792 patients with non-distant metastatic NPC. A total of 469 imaging variables were obtained from detailed MRI reports. Data were stratified and randomly split into training (50%) and testing sets. Gradient boosting tree (GBT) models were built and used to select variables for predicting DM. A full model comprising all variables and a reduced model with the top-five variables were built. Model performance was assessed by area under the curve (AUC). RESULTS: Among the 792 patients, 94 developed DM during follow-up. The number of metastatic cervical nodes (30.9%), tumor invasion in the posterior half of the nasal cavity (9.7%), two sides of the pharyngeal recess (6.2%), tubal torus (3.3%), and single side of the parapharyngeal space (2.7%) were the top-five contributors for predicting DM, based on their relative importance in GBT models. The testing AUC of the full model was 0.75 (95% confidence interval [CI]: 0.69-0.82). The testing AUC of the reduced model was 0.75 (95% CI: 0.68-0.82). For the whole dataset, the full (AUC = 0.76, 95% CI: 0.72-0.82) and reduced models (AUC = 0.76, 95% CI: 0.71-0.81) outperformed the tumor node-staging system (AUC = 0.67, 95% CI: 0.61-0.73). CONCLUSION: The GBT model outperformed the tumor node-staging system in predicting metastasis in NPC. The number of metastatic cervical nodes was identified as the principal contributing variable.
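The abstract reports model performance as AUC but does not describe how it was computed. As a rough illustration only, the sketch below computes AUC from its rank interpretation: the probability that a randomly chosen patient who developed DM receives a higher predicted score than one who did not. The function name `auc` and the toy data are assumptions made for this example and are not taken from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a positive case (e.g. developed DM), 0 for a negative case.
    scores: predicted risk scores, higher meaning more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three of the four positive/negative pairs are ordered correctly.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.3, 0.6, 0.1]
print(auc(toy_labels, toy_scores))
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a sort-based version would be preferable for much larger datasets.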
Funding: Supported by a grant from the Education Department of Zhejiang Province (No. Y200803235).
Abstract: Objective: The progression of human cancer is characterized by the accumulation of genetic instability. An increasing number of experimental molecular genetic techniques have been used to detect chromosome aberrations. Previous studies on chromosome abnormalities often focused on identifying the frequent loci of chromosome alterations, but rarely addressed the interrelationship of chromosomal abnormalities. In the last few years, several mathematical models have been employed to construct models of carcinogenesis, in an attempt to identify the time order and cause-and-effect relationship of chromosome aberrations. The principles and applications of these models are reviewed and compared in this paper. Mathematical modeling of carcinogenesis can contribute to our understanding of the molecular genetics of tumor development and to the identification of cancer-related genes, thus leading to improved clinical practice in cancer care.
Funding: This study was funded by the National Natural Science Foundation of China (Grant No. 41975027), the Natural Science Foundation of Jiangsu Province (Grant No. BK20171457), and the National Key R&D Program on Monitoring, Early Warning and Prevention of Major Natural Disasters (Grant No. 2017YFC1501401).
Abstract: With the increasing availability of precipitation radar data from space, enhancement of the resolution of spaceborne precipitation observations is important, particularly for hazard prediction and climate modeling at local scales relevant to extreme precipitation intensities and gradients. In this paper, the statistical characteristics of radar precipitation reflectivity data are studied and modeled using a hidden Markov tree (HMT) in the wavelet domain. Then, a high-resolution interpolation algorithm is proposed for spaceborne radar reflectivity using the HMT model as prior information. Owing to the small and transient storm elements embedded in the larger and slowly varying elements, the radar precipitation data exhibit distinct multiscale statistical properties, including a non-Gaussian structure and scale-to-scale dependency. An HMT model can capture the statistical properties of radar precipitation well: the wavelet coefficients in each sub-band are characterized as a Gaussian mixture model (GMM), and the wavelet coefficients from the coarse scale to the fine scale are described using a multiscale Markov process. The state probabilities of the GMM are determined using the expectation maximization method, and other parameters, for instance the variance decay parameters in the HMT model, are learned and estimated from high-resolution ground radar reflectivity images. Using the prior model, the wavelet coefficients at finer scales are estimated using local Wiener filtering. The interpolation algorithm is validated using data from the precipitation radar onboard the Tropical Rainfall Measuring Mission satellite, and the reconstructed results are found to enhance the spatial resolution while optimally reproducing the local extremes and gradients.
Funding: Provided by the National Science Foundation Center for Advanced Forestry Systems (CAFS, Award #1915078) and RII Track-2 FEC (Award #1920908).
Abstract: Tree mortality plays a fundamental role in the dynamics of forest ecosystems, yet it is one of the most difficult phenomena to predict accurately. Various modeling strategies have been developed to improve individual tree mortality predictions. One less explored strategy is the use of a multistage modeling approach. Potential improvements from this approach have remained largely unknown. In this study, we developed a novel multistage approach and compared its performance in individual tree mortality predictions with a more conventional approach using an identical individual tree mortality model formulation. Extensive permanent plot data (n = 9442) covering the Acadian Region of North America and spanning multiple decades (1965–2014) were used in this study. Our results indicated that the model behavior with the multistage approach better depicted the observed mortality and showed a notable improvement over the conventional approach. The difference between the observed and predicted numbers of dead trees using the multistage approach was much smaller than with the conventional approach. In addition, tree survival probabilities predicted by the multistage approach generally were not significantly different from the observations, whereas the conventional approach consistently underestimated mortality across species and overestimated tree survival probabilities over the large range of DBH in the data. The new multistage approach also allowed predictions of zero mortality in individual plots, a result not possible in conventional models. Finally, the new approach was more tolerant of modeling errors because it based estimates on ranked tree mortality rather than error-prone predicted values. Overall, this new multistage approach deserves to be considered and tested in future studies.
Funding: Grants from the National Natural Science Foundation of China (No. 31870620), the Fundamental Research Funds for the Central Universities (No. PTYX202107), and the National Technology Extension Fund of Forestry ([2019]06).
Abstract: Digital aerial photograph (DAP) data can be processed with the Structure from Motion (SfM) algorithm and a regional net adjustment method to generate digital surface discrete point clouds similar to those from Light Detection and Ranging (LiDAR) and a digital orthophoto mosaic (DOM) similar to an optical remote sensing image. In this study, we obtained high-resolution images of mature Chinese fir forests with an unmanned aerial vehicle (UAV) flying a cross-route flight, and then reconstructed the three-dimensional point clouds of the UAV survey area using the SfM technique. The point cloud segmentation (PCS) algorithm was used for individual tree segmentation, and the F-scores of the three sample plots were 0.91, 0.94, and 0.94, respectively. Individual tree biomass modeling was conducted using 155 correctly segmented mature Chinese fir trees. The relative root mean squared error (rRMSE) values of random forest (RF), bagged tree (BT) and support vector regression (SVR) were 34.48%, 35.74% and 40.93%, respectively. Our study demonstrated that DAP point clouds have great potential for extracting forest vertical parameters and can be applied successfully in individual tree segmentation and individual tree biomass modeling.
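The abstract quotes F-scores for segmentation and rRMSE values for biomass models without giving formulas. The sketch below uses the commonly used definitions, which are assumed here rather than confirmed by the source: F-score as the harmonic mean of precision and recall over detected trees, and rRMSE as RMSE divided by the mean observed value, in percent. The function names and inputs are hypothetical.

```python
import math


def f_score(true_pos, false_pos, false_neg):
    """F-score from counts of correctly segmented, over-segmented
    (false positive) and missed (false negative) trees."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


def rrmse(observed, predicted):
    """Relative RMSE in percent: RMSE divided by the mean observed value."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)


# Hypothetical plot: 90 trees matched, 10 spurious segments, 10 missed trees.
print(round(f_score(90, 10, 10), 2))
# Hypothetical biomass values (kg) for two trees.
print(round(rrmse([10.0, 10.0], [8.0, 12.0]), 1))
```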
Abstract: The ongoing effort to create methods for detecting and quantifying fatigue damage is motivated by the high levels of uncertainty in present fatigue-life prediction approaches and the frequently catastrophic nature of fatigue failure. The fatigue life of high-strength aluminum alloy 2090-T83 is predicted in this study using a variety of artificial intelligence and machine learning techniques for constant amplitude and negative stress ratios (R?1). Artificial neural networks (ANN), adaptive neuro-fuzzy inference systems (ANFIS), support-vector machines (SVM), a random forest model (RF), and an extreme-gradient tree-boosting model (XGB) are trained using numerical and experimental input data obtained from fatigue tests based on a relatively low number of stress measurements. In particular, the coefficients of the traditional force law formula are found using relevant numerical methods. It is shown that, in comparison to traditional approaches, the neural network and neuro-fuzzy models produce better results, with the neural network models trained using the boosting iterations technique providing the best performance. By building strong models from weak models, XGB helps to predict fatigue life by reducing model bias and variance in supervised learning. Fuzzy neural models can be used to predict the fatigue life of alloys more accurately than neural networks and traditional methods.
Funding: Funded by the National Key Research and Development Program of China Strategic International Cooperation in Science and Technology Innovation Program (2018YFE0207800) and the National Natural Science Foundation of China (31971483).
Abstract: The dead fuel moisture content (DFMC) is the key driver leading to fire occurrence. Accurately estimating the DFMC could help identify locations facing fire risks, prioritise areas for fire monitoring, and facilitate timely deployment of fire-suppression resources. In this study, the DFMC and environmental variables, including air temperature, relative humidity, wind speed, solar radiation, rainfall, atmospheric pressure, soil temperature, and soil humidity, were simultaneously measured in a grassland of Ergun City, Inner Mongolia Autonomous Region of China, in 2021. We chose three regression models, i.e., the random forest (RF) model, the extreme gradient boosting (XGB) model, and the boosted regression tree (BRT) model, to model the seasonal DFMC according to the data collected. To ensure accuracy, we added time-lag variables of 3 d to the models. The results showed that, among the three models, the RF model had the best fitting effect, with an R² value of 0.847, and the best prediction accuracy, with a mean absolute error of 4.764%. The accuracies of the models in spring and autumn were higher than those in the other two seasons. In addition, different seasons had different key influencing factors, and the degree of influence of these factors on the DFMC changed with time lags. Moreover, time-lag variables within 44 h clearly improved the fitting effect and prediction accuracy, indicating that environmental conditions within approximately 48 h greatly influence the DFMC. This study highlights the importance of considering 48 h time-lagged variables when predicting the DFMC of grassland fuels and mapping grassland fire risks based on the DFMC to help locate high-priority areas for grassland fire monitoring and prevention.
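The abstract states that time-lag variables were added to the regression models and that accuracy was scored by mean absolute error, but it gives no implementation details. The sketch below shows one generic way to build lagged predictors from an evenly spaced series and to compute MAE. The function names, the hourly spacing, and the sample readings are assumptions for illustration, not the authors' procedure.

```python
def lagged_rows(values, lags):
    """Build feature rows [x_t, x_(t-lag1), x_(t-lag2), ...] from an evenly
    spaced series. Rows start at the first index where every lag is available."""
    start = max(lags)
    return [[values[t]] + [values[t - k] for k in lags]
            for t in range(start, len(values))]


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical hourly relative-humidity readings with 1 h and 2 h lags.
humidity = [61.0, 64.0, 70.0, 68.0, 66.0]
for row in lagged_rows(humidity, lags=[1, 2]):
    print(row)
```

Each row can then be paired with the DFMC measured at the same time step and passed to any regression model; the lag list would be extended toward 48 steps to mirror the lag window discussed in the abstract.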
Abstract: Gully erosion is a disruptive phenomenon which extensively affects the Iranian territory, especially in the northern provinces. A number of studies have recently been undertaken to study this process, to predict it over space and ultimately, in a broader national effort, to limit its negative effects on local communities. We focused on the Bastam watershed, where 9.3% of the surface is currently affected by gullying. Machine learning algorithms are currently under the magnifying glass across the geomorphological community for their high predictive ability. However, unlike bivariate statistical models, their structure does not provide intuitive and quantifiable measures of environmental preconditioning factors. To cope with this weakness, we interpret preconditioning causes on the basis of a bivariate approach, namely the Index of Entropy. We then performed the susceptibility mapping procedure by testing three extensions of a decision tree model, namely Alternating Decision Tree (ADTree), Naive-Bayes tree (NBTree), and Logistic Model Tree (LMT). We dichotomized the gully information over space into gully presence/absence conditions, which we further explored in the calibration and validation stages. Because the presence/absence information and associated factors are identical, the resulting differences are due only to the algorithmic structures of the three models we chose. Such differences are not significant in terms of performance; in fact, the three models produce outstanding predictive AUC measures (ADTree = 0.922; NBTree = 0.939; LMT = 0.944). However, the associated mapping results depict very different patterns, where only the LMT is associated with reasonable susceptibility patterns. This is a strong indication of which model best combines performance and mapping for any natural hazard-oriented application.
Funding: This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 60970093, 60902078, 6117210, and 61072151), by the Natural Science Foundation of Beijing (4112061), and by the Scientific Research Foundation for the Returned Overseas Chinese Scholars of the State Education Ministry of China.
Abstract: Creating realistic 3D tree models in a convenient way is a challenge in game design and movie making due to the diversity and occlusion of tree structures. Current sketch-based and image-based approaches for fast tree modeling have limitations in effect and speed, and they generally involve complex parameter adjustment, which creates difficulties for novices. In this paper, we present a simple method for quickly generating various 3D tree models from freehand sketches without parameter adjustment. On two input images, the user draws strokes representing the main branches and crown silhouettes of a tree. The system automatically produces a 3D tree at high speed. First, two 2D skeletons are built from the strokes, and a 3D tree structure resembling the input sketches is built by branch retrieval from the 2D skeletons. Small branches are generated within the sketched 2D crown silhouettes based on self-similarity and angle restriction. The system is demonstrated on a variety of examples. It maintains the main features of a tree, namely the main branch structure and crown shape, and can be used as a convenient tool for tree simulation and design.
Funding: This study was jointly supported by the National Natural Science Foundation of China (Nos. 51879196, 51790533, 51709143) and the Jiangxi Natural Science Foundation of China (No. 20181BAB206045).
Abstract: It is important for regional water resources management to know agricultural water consumption several months in advance. Forecasting reference evapotranspiration (ET0) for the next few months is important for irrigation and reservoir management. Studies on forecasting ET0 multiple months ahead using machine learning models have not yet been reported. In addition, machine learning models such as the XGBoost model have multiple parameters that need to be tuned, and traditional methods can get stuck in a local optimum and fail to reach a global optimum. This study investigated the performance of the hybrid extreme gradient boosting (XGBoost) model coupled with the Grey Wolf Optimizer (GWO) algorithm for forecasting multi-step ahead ET0 (1-3 months ahead), compared with three conventional machine learning models, i.e., the standalone XGBoost, multi-layer perceptron (MLP) and M5 model tree (M5) models, in the subtropical zone of China. The results showed that the GWO-XGB model generally performed better than the other three machine learning models in forecasting ET0 1-3 months ahead, followed by the XGB, M5 and MLP models, with very small differences among these three. The GWO-XGB model performed best in autumn, while the MLP model performed slightly better than the other three models in summer. It is thus suggested to apply the MLP model for ET0 forecasting in summer but to use the GWO-XGB model in other seasons.
Abstract: A tyre pressure monitoring system (TPMS) is compulsory in many jurisdictions, such as the United States and the European Union. Existing systems depend on pressure sensors strapped on the tyre or on wheel speed sensor data. A difference in wheel speed triggers an alarm based on the algorithm implemented. In this paper, a machine learning approach is proposed as a new method to monitor tyre pressure by extracting the vertical vibrations from the wheel hub of a moving vehicle using an accelerometer. The acquired signals are used to compute statistical features and histogram features in the feature extraction process. The Logistic Model Tree (LMT) was used as the classifier and attained a classification accuracy of 92.5% with 10-fold cross validation for statistical features and 90.5% with 10-fold cross validation for histogram features. The proposed model can be used successfully for monitoring automobile tyre pressure.
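The abstract mentions statistical features and histogram features computed from the vibration signal but does not list them. As a hedged sketch, the code below computes a handful of generic descriptive statistics and fixed-width histogram bin counts from one signal window. The choice of statistics, the bin range, the function names, and the sample values are all assumptions for illustration, not the feature set used in the paper.

```python
import statistics


def statistical_features(signal):
    """A few generic descriptive statistics of one signal window."""
    return {
        "mean": statistics.fmean(signal),
        "std": statistics.pstdev(signal),   # population standard deviation
        "min": min(signal),
        "max": max(signal),
        "range": max(signal) - min(signal),
    }


def histogram_features(signal, n_bins, lo, hi):
    """Counts of samples in n_bins equal-width bins spanning [lo, hi].
    Samples outside the range are ignored; hi falls in the last bin."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in signal:
        if lo <= x <= hi:
            idx = min(int((x - lo) / width), n_bins - 1)
            counts[idx] += 1
    return counts


# Hypothetical accelerometer window (arbitrary units).
window = [0.0, 0.1, 0.5, 0.9, 1.0]
print(statistical_features(window))
print(histogram_features(window, n_bins=2, lo=0.0, hi=1.0))
```

Either feature vector could then be fed to a classifier; using the same bin edges for every window keeps the histogram features comparable across recordings.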
Funding: This work was supported in part by the National Natural Science Foundation of China (61871062, 61771082 and 61901071), in part by the Program for Innovation Team Building at Institutions of Higher Education in Chongqing (CXTDX201601020), the Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN201800615), and the General Project of Natural Science Foundation of Chongqing (cstc2019jcyj-msxm1238).
Abstract: With the rapid development and widespread application of Wireless Body Area Networks (WBANs), the traditional centralized system architecture cannot handle the massive data generated by edge devices. Meanwhile, in order to ensure the security of private physiological data and the identity privacy of patients, this paper presents a privacy protection strategy for Mobile Edge Computing (MEC) enhanced WBANs, which leverages the blockchain-based decentralized MEC paradigm to support efficient transmission of private information with low latency and high reliability in a scenario with high data security demands. On this basis, a Merkle tree optimization model is designed to authenticate nodes and to verify the source of physiological data. Furthermore, a hybrid signature algorithm is devised to guarantee node anonymity with unforgeability, data integrity and reduced delay. The security performance analysis and simulation results show that the proposed strategy not only reduces the delay, but also secures the privacy and transmission of sensitive WBAN data.
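The abstract does not describe the Merkle tree optimization model itself, so the sketch below shows only a plain, generic Merkle tree: computing a root hash over data records and verifying that one record belongs to the tree from a short proof. SHA-256, the duplicate-last-node rule for odd levels, and all function names are assumptions made for this illustration and should not be read as the authors' design.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """SHA-256 digest (hash choice is an assumption)."""
    return hashlib.sha256(data).digest()


def _next_level(level):
    """Hash adjacent pairs; an odd-length level repeats its last node."""
    if len(level) % 2 == 1:
        level = level + [level[-1]]
    return [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Root hash over a non-empty list of byte-string records."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [_h(x) for x in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes from leaf to root, each tagged with whether
    the sibling sits on the left of the current node."""
    level = [_h(x) for x in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from one record and its proof."""
    node = _h(leaf)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root


# Hypothetical physiological readings from three sensor nodes.
records = [b"hr=72", b"spo2=98", b"temp=36.6"]
root = merkle_root(records)
print(verify(records[1], merkle_proof(records, 1), root))
```

A verifier that holds only the root can check any single record with a proof whose length grows logarithmically with the number of records, which is the property that makes Merkle trees attractive for source verification on constrained devices.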
Funding: Support from a Kongju National University Research Grant (2014).
Abstract: This study was conducted to evaluate the performance of six stem taper models on four tropical tree species, namely Celtis luzonica (Magabuyo), Diplodiscus paniculatus (Balobo), Parashorea malaanonan (Bagtikan), and Swietenia macrophylla (Mahogany), in Mount Makiling Forest Reserve (MMFR), Philippines, using fit statistics and lack-of-fit statistics. Four statistical criteria were used in this study: the standard error of estimate (SEE), coefficient of determination (R²), mean bias (E), and absolute mean difference (AMD). For the lack-of-fit statistics, SEE, E and AMD were determined in different relative height classes. The results indicated that the Kozak02 stem taper model offered the best fit for the four tropical species in most statistics. The Kozak02 model also consistently provided the best performance in the lack-of-fit statistics, with the best SEE, E and AMD in most of the relative height classes. These stem taper equations could help forest managers and researchers better estimate the diameter outside bark at any given height, as well as merchantable stem volumes and total stem volumes, of standing trees belonging to the four species of the tropical forest in MMFR.
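The four criteria are named in the abstract but not defined there. The sketch below uses the forms commonly seen in taper-model evaluations, which are assumed here: SEE as the square root of the residual sum of squares over n minus the number of parameters, R² as one minus the ratio of residual to total sum of squares, mean bias as the average of observed minus predicted, and AMD as the average absolute residual. The function name and sample diameters are hypothetical.

```python
import math


def fit_statistics(observed, predicted, n_params):
    """SEE, R2, mean bias (E) and absolute mean difference (AMD)
    for paired observed and predicted diameters."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    sse = sum(r * r for r in residuals)
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "SEE": math.sqrt(sse / (n - n_params)),
        "R2": 1.0 - sse / sst,
        "E": sum(residuals) / n,
        "AMD": sum(abs(r) for r in residuals) / n,
    }


# Hypothetical diameters outside bark (cm) at four heights on one stem.
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 37.0]
print(fit_statistics(obs, pred, n_params=2))
```

For the lack-of-fit analysis described in the abstract, the same function could be applied separately to the observations falling in each relative height class.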
Funding: Supported by the National Key Research and Development Project, No. 2020YFC2005900.
Abstract: BACKGROUND: With the aging world population, the incidence of falls has intensified and fall-related hospitalization costs are increasing. Falls are one type of event studied in the health economics of patient safety, and many developed countries have conducted such research on fall-related hospitalization costs. However, China, a developing country, still lacks large-scale studies in this area. AIM: To investigate the factors related to the hospitalization costs of fall-related injuries in elderly inpatients and establish factor-based, cost-related groupings. METHODS: A retrospective study was conducted. Patient information and cost data for elderly inpatients (age ≥ 60 years, n = 3362) who were hospitalized between 2016 and 2019 due to falls were collected from the medical record systems of two grade-A tertiary hospitals in China. Quantile regression (QR) analysis was used to identify the factors related to fall-related hospitalization costs. A decision tree model based on the chi-squared automatic interaction detector algorithm for hospitalization cost grouping was built by setting the factors in the regression results as separation nodes. RESULTS: The total hospitalization cost of fall-related injuries in the included elderly patients was 180479203.03 RMB, and the reimbursement rate of medical benefit funds was 51.0% (92039709.52 RMB/180479203.03 RMB). The medical material costs were the highest component of the total hospitalization cost, followed (in order) by drug costs, test costs, treatment costs, integrated medical service costs and blood transfusion costs. The QR results showed that patient age, gender, length of hospital stay, payment method, wound position, wound type, operation times and operation type significantly influenced the inpatient cost (P < 0.05). The cost grouping model was established based on the QR results, and age, length of stay, operation type, wound position and wound type were the most important influencing factors in the model. Furthermore, the cost of each combination varied significantly. CONCLUSION: Our grouping model of hospitalization costs clearly reflected the key factors affecting hospitalization costs and can be used to strengthen the reasonable control of these costs.
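Two pieces of arithmetic in this abstract can be made concrete. The first part of the sketch below reproduces the reported reimbursement rate from the two RMB totals given in the abstract. The second part shows the standard pinball (quantile) loss that quantile regression minimizes; it is included only to illustrate the general technique and is not taken from the study, and the function names are assumptions.

```python
def reimbursement_rate(reimbursed, total):
    """Share of the total cost paid by medical benefit funds, in percent."""
    return 100.0 * reimbursed / total


def pinball_loss(observed, predicted, q):
    """Quantile loss for quantile q in (0, 1): under-prediction is
    weighted by q and over-prediction by (1 - q)."""
    diff = observed - predicted
    return q * diff if diff >= 0 else (q - 1.0) * diff


# Totals quoted in the abstract (RMB).
rate = reimbursement_rate(92039709.52, 180479203.03)
print(round(rate, 1))  # rounds to the 51.0% reported in the abstract

# Hypothetical costs: at the 0.9 quantile, under-predicting by 2 is
# penalised more heavily than over-predicting by 2.
print(pinball_loss(10.0, 8.0, 0.9), pinball_loss(8.0, 10.0, 0.9))
```

Fitting several quantiles rather than the mean is one reason quantile regression suits skewed cost data, since a few very expensive admissions then have less influence on the estimates for typical patients.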
Abstract: In the design of berm breakwaters, the recession of the front slope plays an essential role, and this parameter has been studied in a large number of model tests. This research draws its data from the experimental results of Moghim and Shekari. These experiments consist of two different 2D model tests in two wave flumes, in which the response of berm recession to different sea states and structural parameters was studied. Irregular waves with a JONSWAP spectrum were used in both test series. A total of 412 test results were used to cover the impact on berm recession of sea state conditions, such as wave height, wave period, storm duration and water depth at the toe of the structure, and of structural parameters, such as berm elevation above still water level, berm width and stone diameter. In this paper, a new set of equations for berm recession is derived using the M5' model tree as a machine learning approach. A comparison is made between the estimates from the new formula and those from formulae recently given by other researchers to show the advantage of the new M5' approach.
Funding: Science and Technology Project of Guangzhou, No. 2002Z3-E4016 No. B30101, China
Abstract: AIM: To construct tree models for classification of diffuse large B-cell lymphomas (DLBCL) by chromosome copy numbers, to compare them with cDNA microarray classification, and to explore models of multi-gene, multi-step and multi-pathway processes of DLBCL tumorigenesis. METHODS: Maximum-weight branching and distance-based models were constructed based on the comparative genomic hybridization (CGH) data of 123 DLBCL samples using the established methods and software of Desper et al. A maximum likelihood tree model was also used to analyze the data. By comparison with the results reported in the literature, the value of tree models in the classification of DLBCL was elucidated. RESULTS: Both the branching and the distance-based trees classified DLBCL into three groups. We combined the classification methods of the two models and classified DLBCL into three categories according to their characteristics. The first group was marked by +Xq, +Xp, -17p and +13q; the second group by +3q, +18q and +18p; and the third group by -6q and +6p. This chromosomal classification was consistent with cDNA classification. It indicated that -6q and +3q were two main events in the tumorigenesis of lymphoma. CONCLUSION: Tree models of lymphoma established from CGH data can be used in the classification of DLBCL. These models can suggest multi-gene, multi-step and multi-pathway processes of tumorigenesis. Two pathways, -6q preceding +6q and +3q preceding +18q, may be important in understanding tumorigenesis of DLBCL. The pathway -6q preceding +6q may have a close relationship with the tumorigenesis of non-GCB DLBCL.
Funding: This work was supported by the National Natural Science Foundation of China (No. 60574075) and by the Natural Science Foundation of Shaanxi Province (No. 2005A07).
Abstract: Using Bayesian networks to model promising solutions from the current population of an evolutionary algorithm can ensure an efficient and intelligent search for the optimum. However, constructing a Bayesian network that fits a given dataset is an NP-hard problem, and it also consumes massive computational resources. This paper develops a methodology for constructing a graphical model based on the Bayesian Dirichlet metric. Our approach is derived from a set of propositions and theorems obtained by studying the local metric relationship of networks matching the dataset. This paper presents an algorithm that uses this approach to construct a tree model from a set of potential solutions. The method is important not only for evolutionary algorithms based on graphical models, but also for machine learning and data mining. The experimental results show that the exact theoretical results and the approximations match very well.