Traditional 3Ni weathering steel cannot fully meet the requirements of offshore engineering development, so the design of novel 3Ni steels strengthened with microalloying elements such as Mn or Nb has become a trend. The stress-assisted corrosion behavior of a newly designed high-strength 3Ni steel was investigated in the current study using the corrosion big data method. Information on the corrosion process was recorded using the galvanic corrosion current monitoring method. The gradient boosting decision tree (GBDT) machine learning method was used to mine the corrosion mechanism, and the importance of the structure factor was investigated. Field exposure tests were conducted to verify the results calculated with the GBDT method. The results indicated that the GBDT method can be used effectively to study the influence of structural factors on the corrosion process of 3Ni steel. The different mechanisms by which Mn and Cu additions affect the stress-assisted corrosion of 3Ni steel suggested that Mn and Cu have no obvious effect on the corrosion rate of non-stressed 3Ni steel during the early stage of corrosion. When the corrosion reached a stable state, an increase in Mn content increased the corrosion rate of 3Ni steel, while Cu reduced it. In the presence of stress, both an increase in Mn content and the addition of Cu can inhibit the corrosion process. The corrosion law of outdoor-exposed 3Ni steel is consistent with that obtained from corrosion big data technology, verifying the reliability of the big data evaluation method and of the data prediction model selection.
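The GBDT importance analysis described above can be sketched as follows. This is a minimal illustration using scikit-learn, not the paper's actual dataset or model: the factor names (Mn, Cu, stress, exposure time) and the toy response mimicking the reported trends are assumptions for demonstration only.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 500
# Hypothetical inputs: alloy contents, applied stress, exposure time (illustrative, not the paper's data)
mn = rng.uniform(0, 2, n)
cu = rng.uniform(0, 1, n)
stress = rng.uniform(0, 300, n)
time_h = rng.uniform(1, 1000, n)
# Toy corrosion-current response loosely mimicking the reported trends
current = (0.5 * mn * np.log(time_h) - 0.3 * cu * np.log(time_h)
           - 0.002 * stress + rng.normal(0, 0.1, n))

X = np.column_stack([mn, cu, stress, time_h])
model = GradientBoostingRegressor(n_estimators=200, max_depth=3, random_state=0)
model.fit(X, current)

# Impurity-based importance scores: how strongly each factor drives the fitted response
for name, imp in zip(["Mn", "Cu", "stress", "time"], model.feature_importances_):
    print(f"{name}: {imp:.3f}")
```

Field-exposure data would replace the synthetic arrays in a real study; the importance scores are what the abstract refers to as the "importance of the structure factor."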
The existing algorithms for solving multi-objective optimization problems fall into three main categories: decomposition-based, dominance-based, and indicator-based. Traditional multi-objective optimization approaches focus mainly on the objectives, treating the decision variables as a single aggregate variable without considering their critical role in objective optimization. Accordingly, a variety of decision variable grouping algorithms have been proposed. However, these algorithms handle the changes of most decision variables during evolution only coarsely and are time-consuming when searching for the Pareto front. To solve these problems, a multi-objective optimization algorithm with decision variable grouping based on the extreme-point Pareto front (MOEA-DV/EPF) is proposed. The algorithm adopts a preprocessing rule to obtain the Pareto-optimal solution set of extreme points generated by simultaneous evolution in the various objective directions, derives a basic Pareto front surface to assess convergence, and analyzes the convergence and distribution effects of the decision variables. In the later stages of optimization, different mutation strategies are adopted according to the nature of the decision variables to accelerate evolution toward excellent individuals, thus enhancing the performance of the algorithm. Validation on the test functions shows that the algorithm solves multi-objective optimization problems more efficiently.
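The Pareto-dominance test underlying front construction can be sketched as a small self-contained function (a generic minimisation filter, not the MOEA-DV/EPF algorithm itself):

```python
import numpy as np

def non_dominated(F):
    """Return a boolean mask of Pareto-optimal rows of objective matrix F (minimisation)."""
    F = np.asarray(F, dtype=float)
    mask = np.ones(len(F), dtype=bool)
    for i in range(len(F)):
        if not mask[i]:
            continue
        # Some j dominates i if j is <= in all objectives and < in at least one
        dominated = np.all(F <= F[i], axis=1) & np.any(F < F[i], axis=1)
        if dominated.any():
            mask[i] = False
    return mask

F = [[1.0, 4.0], [2.0, 2.0], [3.0, 3.0], [4.0, 1.0]]
print(non_dominated(F))  # [ True  True False  True]: (3,3) is dominated by (2,2)
```

In an evolutionary loop, such a filter is applied each generation to keep the current approximation of the Pareto front; the extreme points of this set are what MOEA-DV/EPF uses to guide its grouping.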
The large-scale multi-objective optimization algorithm (LSMOA), based on the grouping of decision variables, is an advanced method for handling high-dimensional decision variables. However, in practical problems the interactions among decision variables are intricate, leading to large group sizes and suboptimal optimization; hence, a large-scale multi-objective optimization algorithm based on weighted overlapping grouping of decision variables (MOEAWOD) is proposed in this paper. Initially, the decision variables are perturbed and categorized into convergence and diversity variables; subsequently, the convergence variables are subdivided into groups based on the interactions among the decision variables. If the size of a group surpasses a set threshold, that group undergoes weighted overlapping grouping. Specifically, the interaction strength is evaluated from the interaction frequency and the number of objectives shared among the decision variables. The decision variable with the highest interaction in the group is identified and set aside, the remaining variables are reclassified into subgroups, and that most strongly interacting variable is then added to each subgroup. MOEAWOD minimizes the interaction between different groups and maximizes the interaction of decision variables within groups, which contributes to optimizing the directions of convergence and diversity exploration across the groups. MOEAWOD was tested on 18 benchmark large-scale optimization problems, and the experimental results demonstrate the effectiveness of our methods; compared with the other algorithms, our method retains an advantage.
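Interaction-based grouping rests on a separability test between pairs of variables. A minimal finite-difference version of such a test is sketched below; the tolerance, perturbation size, and example function are illustrative assumptions, not MOEAWOD's exact procedure.

```python
import numpy as np

def interact(f, x, i, j, delta=1.0, tol=1e-9):
    """Finite-difference separability check: variables i and j interact if the
    change in f caused by perturbing i depends on the value of j (non-additivity)."""
    xi = x.copy(); xi[i] += delta
    xj = x.copy(); xj[j] += delta
    xij = x.copy(); xij[i] += delta; xij[j] += delta
    d1 = f(xi) - f(x)    # effect of perturbing i alone
    d2 = f(xij) - f(xj)  # effect of perturbing i after j has moved
    return abs(d1 - d2) > tol

f = lambda x: x[0] * x[1] + x[2] ** 2  # x0 and x1 interact; x2 is separable
x = np.ones(3)
print(interact(f, x, 0, 1), interact(f, x, 0, 2))  # True False
```

Running this test over all variable pairs yields the interaction structure from which groups (and, when a group grows too large, weighted overlapping subgroups) are formed.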
The trend toward designing an intelligent allocation system based on students' individual differences and needs has taken precedence over the traditional dormitory allocation system, which neglects students' personality traits, causes dormitory disputes, and affects students' quality of life and academic performance. This paper collects freshmen's data on personal preferences and conducts a classification comparison, using a decision tree classification algorithm based on the information gain principle as the core algorithm of dormitory allocation. It determines the description rules of students' personal preferences and the decision tree classification preferences, completes the conceptual design of the database (entity relations and data dictionaries), meets students' personality-based classification requirements for dormitories, and lays the foundation for an intelligent dormitory allocation system.
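The information gain principle named above can be made concrete in a few lines. The preference attributes and dorm labels below are hypothetical stand-ins, not the paper's survey data:

```python
import math
from collections import Counter

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, labels, attr_index):
    """Entropy reduction obtained by splitting on the attribute at attr_index."""
    base = entropy(labels)
    splits = {}
    for row, y in zip(rows, labels):
        splits.setdefault(row[attr_index], []).append(y)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in splits.values())
    return base - remainder

# Hypothetical preference survey: (sleep schedule, noise tolerance) -> dorm cluster
rows = [("early", "low"), ("early", "high"), ("late", "low"), ("late", "high")]
labels = ["A", "A", "B", "B"]
print(information_gain(rows, labels, 0))  # 1.0: sleep schedule fully separates clusters
print(information_gain(rows, labels, 1))  # 0.0: noise tolerance carries no signal here
```

A decision tree built on this principle repeatedly splits on the attribute with the highest gain, which is how the allocation rules in the paper are derived.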
Big data is usually unstructured, and many applications require analysis in real time. The decision tree (DT) algorithm is widely used to analyze big data, but selecting the optimal depth of a DT is a time-consuming process that requires many iterations. In this paper, we design a modified version of the DT that aims to reach the optimal depth by self-tuning its running parameters, improving accuracy. The efficiency of the modified DT was verified on two datasets: an airport dataset with 500,000 instances and a fire dataset with 600,000 instances. A comparison between the modified DT and the standard DT, conducted on a multi-node Apache Spark cluster on Amazon Web Services, shows that the modified version performs better, with accuracy increases of 6.85% on the first dataset and 8.85% on the second. In conclusion, the modified DT handled datasets of different sizes more accurately than the standard DT algorithm.
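The baseline the paper improves on, an iterative search for the optimal tree depth, can be sketched with scikit-learn on synthetic data. This illustrates the "many iterations" cost the self-tuning variant avoids; the dataset, depth range, and split ratio are assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a large tabular dataset
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Naive depth selection: one full fit per candidate depth
best_depth, best_acc = None, 0.0
for depth in range(1, 21):
    acc = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr).score(X_val, y_val)
    if acc > best_acc:
        best_depth, best_acc = depth, acc

print(best_depth, round(best_acc, 3))
```

On 500,000+ instances each of these fits is expensive, which motivates a tree that tunes its own depth during a single training pass, as the abstract proposes.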
This study introduces and evaluates a novel artificial hummingbird algorithm-optimised boosted tree (AHA-boosted) model for predicting the dynamic modulus (E*) of hot-mix asphalt concrete. The model was trained and rigorously tested on a substantial dataset from NCHRP Report 547. Performance metrics, specifically RMSE, MAE, and R², were employed to assess the model's predictive accuracy, robustness, and generalisability. When benchmarked against well-established models such as support vector machines (SVM) and Gaussian process regression (GPR), the AHA-boosted model demonstrated enhanced performance, achieving R² values of 0.997 in training and 0.974 in testing with the traditional Witczak NCHRP 1-40D model inputs. Incorporating features such as test temperature, frequency, and asphalt content led to a 1.23% increase in the test R², signifying an improvement in the model's accuracy. The study also explored feature importance and sensitivity through SHAP and permutation importance plots, highlighting the binder complex modulus |G*| as a key predictor. Although the AHA-boosted model shows promise, a slight decrease in R² from training to testing indicates a need for further validation. Overall, this study confirms the AHA-boosted model as a highly accurate and robust tool for predicting the dynamic modulus of hot-mix asphalt concrete, making it a valuable asset for pavement engineering.
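Permutation importance, one of the two sensitivity tools mentioned above, can be demonstrated with scikit-learn. The synthetic regression data is an assumption; a real analysis would use the NCHRP inputs and the AHA-tuned model:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data: 5 features, only 2 carry signal (stand-ins for E* predictors)
X, y = make_regression(n_samples=800, n_features=5, n_informative=2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)

# Shuffle each feature on held-out data and measure the drop in score
result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
ranked = sorted(enumerate(result.importances_mean), key=lambda t: -t[1])
print(ranked[:2])  # the informative features should rank first
```

Unlike impurity-based importance, this is computed on test data, so it reflects what the model actually relies on for generalisation, which is why the study pairs it with SHAP plots.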
Accurate prediction of monthly oil and gas production is essential for oil enterprises to make reasonable production plans, avoid blind investment, and realize sustainable development. Traditional methods of predicting oil well production trends rely on years of oilfield production experience and expertise, and their application conditions are very demanding. With the rapid development of artificial intelligence technology, big data analysis methods are gradually being applied in the various sub-fields of oil and gas reservoir development. Based on the data-driven gradient boosting decision tree (GBDT) algorithm, this paper predicts the initial single-layer production from geological data, fluid PVT data, and well data. The results show that the GBDT prediction model achieves high accuracy, significantly improves efficiency, and is widely applicable. The trained GBDT model can predict production, which is helpful for well site optimization, perforation layer optimization, and engineering parameter optimization, and has guiding significance for oilfield development.
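A production-prediction workflow of this kind can be sketched end to end. The well descriptors and the toy production relation below are invented stand-ins for the geological, PVT, and well data; only the GBDT fit/predict pattern mirrors the paper:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 1000
# Hypothetical well descriptors (stand-ins for geological, PVT, and well data)
porosity = rng.uniform(0.05, 0.3, n)
thickness = rng.uniform(1, 50, n)
viscosity = rng.uniform(0.5, 20, n)
production = 400 * porosity * thickness / viscosity + rng.normal(0, 5, n)

X = np.column_stack([porosity, thickness, viscosity])
X_tr, X_te, y_tr, y_te = train_test_split(X, production, random_state=0)
# Early stopping (n_iter_no_change) keeps the ensemble from overfitting
model = GradientBoostingRegressor(n_estimators=500, n_iter_no_change=10,
                                  random_state=0).fit(X_tr, y_tr)
r2 = r2_score(y_te, model.predict(X_te))
print(round(r2, 3))
```

With real field data, the same pattern extends naturally to perforation-layer and engineering-parameter screening by scoring candidate configurations through the trained model.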
To address the problems of numerous types of composite power quality disturbances, strong feature correlation, and high recognition error rates, a method for identifying composite power quality disturbances based on the multiresolution S-transform and decision trees is proposed. First, according to the IEEE standard, signal models of seven single and 17 combined power quality disturbances are given, and disturbance waveform samples are generated in batches. Then, to improve recognition accuracy, an adjustment factor is introduced to obtain a controllable time-frequency resolution through multiresolution S-transform time-frequency analysis. On this basis, five time-frequency-domain disturbance features are extracted that quantitatively reflect the characteristics of the analyzed disturbance signal, fewer than required by the traditional S-transform-based method. Three classifiers (K-nearest neighbors, support vector machine, and a decision tree) are then used to identify the composite disturbances. Simulation results show that the classification accuracy of the decision tree algorithm is higher than that of K-nearest neighbors and the support vector machine. Finally, the proposed method is compared with other commonly used recognition algorithms; the experimental results show that it is effective in terms of detection accuracy, especially for combined power quality disturbances.
To solve multi-class fault diagnosis tasks, the decision tree support vector machine (DTSVM), which combines SVM and a decision tree using the concept of dichotomy, is proposed. Since the classification performance of a DTSVM depends strongly on its structure, a genetic algorithm is introduced into the formation of the decision tree to cluster the multiple classes with maximum distance between the clustering centers of the two sub-classes, so that the most separable classes are split at each node of the decision tree. Numerical simulations on three datasets demonstrate that the proposed method has better performance and higher generalization ability than the two conventional schemes, "one-against-all" and "one-against-one".
Machine learning algorithms are an important means of performing landslide susceptibility assessments, but most studies use GIS-based classification methods to conduct susceptibility zonation. This study presents a machine learning approach based on the C5.0 decision tree (DT) model and the K-means clustering algorithm to produce a regional landslide susceptibility map. Yanchang County, a typical landslide-prone area in northwestern China, was taken as the area of interest to introduce the proposed procedure. A landslide inventory containing 82 landslides was prepared and randomly partitioned into two subsets: training data (70% of the landslide pixels) and validation data (30% of the landslide pixels). Fourteen landslide-influencing factors were included in the input dataset and used to calculate the landslide occurrence probability with the C5.0 decision tree model. Susceptibility zonation was implemented according to cut-off values calculated by the K-means clustering algorithm. The validation showed that the AUC (area under the receiver operating characteristic (ROC) curve) of the proposed model was the highest, reaching 0.88, compared with traditional models (support vector machine (SVM) = 0.85, Bayesian network (BN) = 0.81, frequency ratio (FR) = 0.75, weight of evidence (WOE) = 0.76). The landslide frequency ratio and frequency density of the high-susceptibility zones were 6.76/km² and 0.88/km², respectively, much higher than those of the low-susceptibility zones.
The top 20% interval of landslide occurrence probability contained 89% of the historical landslides while accounting for only 10.3% of the total area. Our results indicate that the distribution of high-susceptibility zones was more focused without containing more "stable" pixels. The obtained susceptibility map is therefore suitable for application in landslide risk management practice.
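The K-means zonation step, deriving cut-off values from the map of predicted probabilities, can be sketched as follows. The beta-distributed probabilities stand in for the per-pixel output of a fitted C5.0 model, and using midpoints between cluster centres as cut-offs is one reasonable reading of the procedure, not the paper's exact recipe:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Hypothetical per-pixel landslide probabilities from a fitted classifier
probs = rng.beta(2, 5, size=10000).reshape(-1, 1)

# Cluster the probability values into 4 susceptibility levels
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(probs)
centres = np.sort(km.cluster_centers_.ravel())
cutoffs = (centres[:-1] + centres[1:]) / 2  # midpoints between adjacent centres
print(np.round(cutoffs, 3))
```

Each pixel is then assigned to the zone whose cut-off interval contains its probability, giving data-driven class boundaries instead of arbitrary equal-interval breaks.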
To improve the generalization ability of binary decision trees, a new learning algorithm, the MMDT algorithm, is presented. Based on statistical learning theory, the generalization performance of binary decision trees is analyzed and an assessment rule is proposed; the MMDT algorithm is implemented under the direction of this rule. The algorithm maps training examples from the original space to a high-dimensional feature space and constructs a decision tree in it. In the feature space, a new decision-node splitting criterion, the max-min rule, is used, and the margin of each decision node is maximized using a support vector machine to improve generalization performance. Experimental results show that the new learning algorithm is much superior to others such as C4.5 and OC1.
Under the modern education system of China, the annual scholarship evaluation is a vital matter for many college students. This paper adopts the C4.5 decision tree classification algorithm, an improvement on the ID3 algorithm, and constructs a dataset for the scholarship evaluation system through analysis of the relevant attributes in scholarship evaluation information. Through analysis and research of moral education, intellectual education, and culture and physical education, it also identifies factors that play a significant role in the development of college students.
Various process parameters exert different effects in the stamping process. To study the relationships among the process parameters of the box stamping process, including the blank holder force, friction coefficient, and the depth, offset, and length of the drawbead, the C4.5 decision tree algorithm was applied to generate a decision tree from the result data of box stamping simulations. The design and improvement methods of the decision tree are presented. Potentially valuable rules were generated by traversing the decision tree, which plays an instructive role in practical design. The rules show that the correct combination of blank holder force and drawbead setting is the dominant contribution to controlling cracking and wrinkling in the box stamping process. To validate the rules, the box stamping process was also performed experimentally; the results show good agreement with the generated rules.
As a distributed computing platform, Hadoop provides an effective way to handle big data. In Hadoop, the completion time of a job can be delayed by a straggler. Although the definitive cause of a straggler is hard to detect, speculative execution is usually used to deal with the problem by simply backing up stragglers on alternative nodes. In this paper, we design a new speculative execution algorithm for Hadoop based on the C4.5 decision tree, SECDT. In SECDT, we predict the completion times of stragglers and of their backup tasks using the C4.5 decision tree, compute the difference between the two predictions, and select the straggler with the maximum difference to start a backup task. Experimental results show that SECDT predicts execution time more accurately than other speculative execution methods and hence reduces job completion time.
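The selection step, back up the straggler with the largest predicted time saving, can be sketched with a tree-based runtime model. This uses a regression tree as a stand-in for SECDT's C4.5-based predictor, and the task features (input size, node load) and history values are invented for illustration:

```python
from sklearn.tree import DecisionTreeRegressor

# Hypothetical training history: (input size MB, node load) -> task runtime (s)
history_X = [[100, 0.2], [200, 0.4], [400, 0.9], [150, 0.3], [300, 0.8], [250, 0.5]]
history_y = [50, 110, 400, 80, 320, 140]
model = DecisionTreeRegressor(max_depth=3, random_state=0).fit(history_X, history_y)

# Current stragglers vs. their candidate backup placements (lighter-loaded nodes)
stragglers = {"task-a": [400, 0.9], "task-b": [200, 0.4]}
backups = {"task-a": [400, 0.2], "task-b": [200, 0.3]}

# Predicted saving = straggler runtime minus backup runtime; back up the biggest saver
gains = {t: model.predict([stragglers[t]])[0] - model.predict([backups[t]])[0]
         for t in stragglers}
chosen = max(gains, key=gains.get)
print(chosen, gains[chosen])
```

The key idea from the abstract survives the simplification: speculation is only worthwhile when the backup is predicted to finish meaningfully earlier than the straggler.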
Recently, research on distributed data mining using the grid has been a trend. This paper introduces a data mining algorithm based on a distributed decision tree, which takes advantage of the conveniences and services supplied by the grid computing platform and can perform distributed classification data mining on the grid.
To address the problem of manoeuvring decision-making in UAV air combat, this study establishes a one-to-one air combat model, defines missile attack areas, and uses the non-deterministic-policy Soft Actor-Critic (SAC) algorithm from deep reinforcement learning to construct a decision model that realizes the manoeuvring process. The complexity of the proposed algorithm is calculated, and the stability of the closed-loop air combat decision-making system controlled by the neural network is analysed with a Lyapunov function. The study formulates the UAV air combat process as a game and proposes a parallel self-play training SAC algorithm (PSP-SAC) to improve the generalisation of UAV control decisions. Simulation results show that the proposed algorithm realizes sample sharing and policy sharing across multiple combat environments and significantly improves the generalisation ability of the model compared with independent training.
Chronic hepatitis B and C, together with alcoholic and non-alcoholic fatty liver disease, represent the major causes of progressive liver disease that can eventually evolve into cirrhosis and its end-stage complications, including decompensation, bleeding, and liver cancer. The formation and accumulation of fibrosis in the liver is the common pathway leading to evolutive liver disease. Precise definition of the liver fibrosis stage is essential for patient management in clinical practice, since the presence of bridging fibrosis is a strong indication for antiviral therapy in chronic viral hepatitis, while cirrhosis requires a specific follow-up that includes screening for esophageal varices and hepatocellular carcinoma. Liver biopsy has always been the reference standard for assessing hepatic fibrosis, but it has limitations: it is invasive, costly, and prone to sampling error. Recently, blood markers and instrumental methods have been proposed for the non-invasive assessment of liver fibrosis. However, doubts remain about their implementation in clinical practice, and a real consensus on how and when to use them is not yet available, owing to unsatisfactory accuracy for some of them and incomplete validation for others. Some studies suggest that the performance of non-invasive methods for liver fibrosis assessment may increase when they are combined. Combination algorithms of non-invasive methods may represent a rational and reliable approach to implementing non-invasive assessment of liver fibrosis in clinical practice and to reducing, rather than abolishing, liver biopsies.
This work generated landslide susceptibility maps for the Three Gorges Reservoir (TGR) area, China, using different machine learning models. Three advanced methods, namely gradient boosting decision tree (GBDT), random forest (RF), and information value (InV) models, were used, and their performances were assessed and compared. In total, 202 landslides were mapped through a series of field surveys, aerial photographs, and reviews of historical and bibliographical data. Nine causative factors were then considered in generating the landslide susceptibility maps with the GBDT, RF, and InV models. All of the causative-factor maps were resampled to a resolution of 28.5 m; of the 486,289 pixels in the area, 28,526 were landslide pixels and 457,763 were non-landslide pixels. Finally, landslide susceptibility maps were generated with the three models, and their performances were assessed through receiver operating characteristic (ROC) curves, sensitivity, specificity, overall accuracy (OA), and the kappa coefficient (KAPPA). The results showed that the GBDT, RF, and InV models all produced reasonably accurate landslide susceptibility maps. Among the three, the GBDT method outperformed the other two, and can provide strong technical support for producing landslide susceptibility maps in the TGR area.
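The ROC-based model comparison described above can be sketched with scikit-learn. The synthetic data (9 features, a roughly 6% landslide class mimicking the pixel imbalance) is an assumption standing in for the TGR raster stack:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Stand-in for the causative-factor stack: 9 features, imbalanced binary label
X, y = make_classification(n_samples=3000, n_features=9, n_informative=5,
                           weights=[0.94, 0.06], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Compare two of the paper's model families by held-out AUC
for name, clf in [("GBDT", GradientBoostingClassifier(random_state=0)),
                  ("RF", RandomForestClassifier(random_state=0))]:
    auc = roc_auc_score(y_te, clf.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```

Stratified splitting matters here: with only ~6% positive pixels, an unstratified split can leave too few landslide samples in the test set for a stable AUC estimate.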
Funding (3Ni steel corrosion study): supported by the National Natural Science Foundation of China (No. 52203376) and the National Key Research and Development Program of China (No. 2023YFB3813200).
Funding (MOEA-DV/EPF study): the Liaoning Province Natural Science Foundation Project (2022-MS-291); the National Programme for Foreign Expert Projects (G2022006008L); the Basic Research Projects of the Liaoning Provincial Department of Education (LJKMZ20220781, LJKMZ20220783, LJKQZ20222457); and the Researcher Support Program (RSPD2023R704), King Saud University, Riyadh, Saudi Arabia.
Funding (MOEAWOD study): supported in part by the Central Government Guides Local Science and Technology Development Funds (Grant No. YDZJSX2021A038); in part by the National Natural Science Foundation of China (Grant No. 61806138); and in part by the China University Industry-University-Research Collaborative Innovation Fund (Future Network Innovation Research and Application Project, Grant 2021FNA04014).
文摘The large-scale multi-objective optimization algorithm(LSMOA),based on the grouping of decision variables,is an advanced method for handling high-dimensional decision variables.However,in practical problems,the interaction among decision variables is intricate,leading to large group sizes and suboptimal optimization effects;hence a large-scale multi-objective optimization algorithm based on weighted overlapping grouping of decision variables(MOEAWOD)is proposed in this paper.Initially,the decision variables are perturbed and categorized into convergence and diversity variables;subsequently,the convergence variables are subdivided into groups based on the interactions among different decision variables.If the size of a group surpasses the set threshold,that group undergoes a process of weighting and overlapping grouping.Specifically,the interaction strength is evaluated based on the interaction frequency and number of objectives among various decision variables.The decision variable with the highest interaction in the group is identified and disregarded,and the remaining variables are then reclassified into subgroups.Finally,the decision variable with the strongest interaction is added to each subgroup.MOEAWOD minimizes the interactivity between different groups and maximizes the interactivity of decision variables within groups,which contributed to the optimized direction of convergence and diversity exploration with different groups.MOEAWOD was subjected to testing on 18 benchmark large-scale optimization problems,and the experimental results demonstrate the effectiveness of our methods.Compared with the other algorithms,our method is still at an advantage.
文摘The trend toward designing an intelligent distribution system based on students’individual differences and individual needs has taken precedence in view of the traditional dormitory distribution system,which neglects the students’personality traits,causes dormitory disputes,and affects the students’quality of life and academic quality.This paper collects freshmen's data according to college students’personal preferences,conducts a classification comparison,uses the decision tree classification algorithm based on the information gain principle as the core algorithm of dormitory allocation,determines the description rules of students’personal preferences and decision tree classification preferences,completes the conceptual design of the database of entity relations and data dictionaries,meets students’personality classification requirements for the dormitory,and lays the foundation for the intelligent dormitory allocation system.
文摘Big data is usually unstructured, and many applications require theanalysis in real-time. Decision tree (DT) algorithm is widely used to analyzebig data. Selecting the optimal depth of DT is time-consuming process as itrequires many iterations. In this paper, we have designed a modified versionof a (DT). The tree aims to achieve optimal depth by self-tuning runningparameters and improving the accuracy. The efficiency of the modified (DT)was verified using two datasets (airport and fire datasets). The airport datasethas 500000 instances and the fire dataset has 600000 instances. A comparisonhas been made between the modified (DT) and standard (DT) with resultsshowing that the modified performs better. This comparison was conductedon multi-node on Apache Spark tool using Amazon web services. Resultingin accuracy with an increase of 6.85% for the first dataset and 8.85% for theairport dataset. In conclusion, the modified DT showed better accuracy inhandling different-sized datasets compared to standard DT algorithm.
Abstract: This study introduces and evaluates a novel artificial hummingbird algorithm-optimised boosted tree (AHA-boosted) model for predicting the dynamic modulus (E*) of hot mix asphalt concrete. Using a substantial dataset from NCHRP Report-547, the model was trained and rigorously tested. Performance metrics, specifically RMSE, MAE, and R2, were employed to assess the model's predictive accuracy, robustness, and generalisability. When benchmarked against well-established models like support vector machines (SVM) and Gaussian process regression (GPR), the AHA-boosted model demonstrated enhanced performance. It achieved R2 values of 0.997 in training and 0.974 in testing, using the traditional Witczak NCHRP 1-40D model inputs. Incorporating features such as test temperature, frequency, and asphalt content led to a 1.23% increase in the test R2, signifying an improvement in the model's accuracy. The study also explored feature importance and sensitivity through SHAP and permutation importance plots, highlighting the binder complex modulus |G*| as a key predictor. Although the AHA-boosted model shows promise, a slight decrease in R2 from training to testing indicates a need for further validation. Overall, this study confirms the AHA-boosted model as a highly accurate and robust tool for predicting the dynamic modulus of hot mix asphalt concrete, making it a valuable asset for pavement engineering.
Abstract: Accurate prediction of monthly oil and gas production is essential for oil enterprises to make reasonable production plans, avoid blind investment, and realize sustainable development. Traditional oil well production trend prediction methods are based on years of oil field production experience and expertise, and their application conditions are very demanding. With the rapid development of artificial intelligence technology, big data analysis methods are gradually being applied in various sub-fields of oil and gas reservoir development. Based on the data-driven artificial intelligence algorithm Gradient Boosting Decision Tree (GBDT), this paper predicts the initial single-layer production by considering geological data, fluid PVT data, and well data. The results show that the GBDT prediction model has high accuracy, significantly improved efficiency, and strong universal applicability. The GBDT method trained in this paper can predict production, which is helpful for well site optimization, perforation layer optimization, and engineering parameter optimization, and has guiding significance for oilfield development.
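The GBDT idea used here and in several of the abstracts below is that each new tree fits the residuals of the ensemble so far. A toy, stdlib-only sketch with depth-1 trees (stumps) on 1-D data shows the mechanism; real applications would use a library implementation with multi-feature trees, and all data here is invented.

```python
def fit_stump(x, y):
    """Best single-split regression stump on 1-D inputs (squared error)."""
    best = None
    for t in sorted(set(x)):
        left  = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((yi - (lm if xi <= t else rm)) ** 2 for xi, yi in zip(x, y))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi, t=t, lm=lm, rm=rm: lm if xi <= t else rm

def gbdt_fit(x, y, rounds=20, lr=0.3):
    """Gradient boosting for squared loss: each stump fits current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(x)
    stumps = []
    for _ in range(rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        s = fit_stump(x, resid)
        stumps.append(s)
        pred = [pi + lr * s(xi) for pi, xi in zip(pred, x)]
    return lambda xi: base + lr * sum(s(xi) for s in stumps)

x = [1, 2, 3, 4, 5, 6]
y = [2, 2, 2, 8, 8, 8]          # step function: easy for stumps
model = gbdt_fit(x, y)
```

After 20 rounds the residuals have shrunk geometrically, and the ensemble reproduces the step function closely on the training inputs.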
Funding: Foundation of China (No. 52067013), the Key Natural Science Fund Project of Gansu Provincial Department of Science and Technology (No. 21JR7RA280), the Tianyou Innovation Team Science Foundation of Intelligent Power Supply and State Perception for Rail Transit (No. TY202010), and the Natural Science Foundation of Gansu Province (No. 20JR5RA395).
Abstract: Aiming at the problems of numerous types of composite power quality disturbances, strong feature correlation, and high recognition error rates, a method for identifying composite power quality disturbances based on the multi-resolution S-transform and a decision tree was proposed. First, according to the IEEE standard, the signal models of seven single and 17 combined power quality disturbances are given, and disturbance waveform samples are generated in batches. Then, to improve recognition accuracy, an adjustment factor is introduced to obtain controllable time-frequency resolution through multi-resolution S-transform time-frequency domain analysis. On this basis, five time-frequency domain features are extracted that quantitatively reflect the characteristics of the analyzed disturbance signal, using fewer features than the traditional S-transform-based method. Finally, three classifiers, K-nearest neighbor, support vector machine, and the decision tree algorithm, are used to identify the composite disturbances. Simulation results show that the classification accuracy of the decision tree algorithm is higher than that of the K-nearest neighbor and support vector machine classifiers. The proposed method was also compared with other commonly used recognition algorithms; experimental results show that it is effective in terms of detection accuracy, especially for combined power quality disturbances.
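Of the three classifiers compared above, K-nearest neighbor is the simplest to sketch on extracted feature vectors. The two-dimensional "features" and disturbance labels below are invented stand-ins for the paper's five S-transform features, purely to show the classification step.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """k-nearest-neighbour majority vote on squared Euclidean distance."""
    ranked = sorted(train,
                    key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], query)))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# hypothetical time-frequency features: (amplitude deviation, dominant frequency)
train = [((0.9, 50), "sag"),   ((0.8, 50), "sag"),
         ((1.1, 50), "swell"), ((1.2, 50), "swell"),
         ((1.0, 180), "harmonic"), ((1.0, 240), "harmonic")]
label = knn_predict(train, (0.85, 50))  # nearest neighbours are the sag samples
```

The same feature vectors would be fed to the SVM and decision tree classifiers for the accuracy comparison the abstract reports.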
Funding: Supported by the National Natural Science Foundation of China (60604021, 60874054).
Abstract: To solve multi-class fault diagnosis tasks, the decision tree support vector machine (DTSVM), which combines SVM and a decision tree using the concept of dichotomy, is proposed. Since the classification performance of DTSVM highly depends on its structure, a genetic algorithm is introduced into the formation of the decision tree to cluster the multiple classes with maximum distance between the clustering centers of the two sub-classes, so that the most separable classes are separated at each node of the decision tree. Numerical simulations conducted on three datasets, compared with "one-against-all" and "one-against-one", demonstrate that the proposed method has better performance and higher generalization ability than the two conventional methods.
Funding: This research is funded by the National Natural Science Foundation of China (Grant Nos. 41807285 and 51679117), the Key Project of the State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (SKLGP2019Z002), the National Science Foundation of Jiangxi Province, China (20192BAB216034), the China Postdoctoral Science Foundation (2019M652287 and 2020T130274), the Jiangxi Provincial Postdoctoral Science Foundation (2019KY08), and the Fundamental Research Funds for National Universities, China University of Geosciences (Wuhan).
Abstract: Machine learning algorithms are an important means of performing landslide susceptibility assessments, but most studies use GIS-based classification methods to conduct susceptibility zonation. This study presents a machine learning approach based on the C5.0 decision tree (DT) model and the K-means cluster algorithm to produce a regional landslide susceptibility map. Yanchang County, a typical landslide-prone area located in northwestern China, was taken as the area of interest to introduce the proposed application procedure. A landslide inventory containing 82 landslides was prepared and randomly partitioned into two subsets: training data (70% of landslide pixels) and validation data (30% of landslide pixels). Fourteen landslide influencing factors were considered in the input dataset and used to calculate the landslide occurrence probability based on the C5.0 decision tree model. Susceptibility zonation was implemented according to the cut-off values calculated by the K-means cluster algorithm. The validation results of the model performance analysis showed that the AUC (area under the receiver operating characteristic (ROC) curve) of the proposed model was the highest, reaching 0.88, compared with traditional models (support vector machine (SVM) = 0.85, Bayesian network (BN) = 0.81, frequency ratio (FR) = 0.75, weight of evidence (WOE) = 0.76). The landslide frequency ratio and frequency density of the high susceptibility zones were 6.76/km^(2) and 0.88/km^(2), respectively, much higher than those of the low susceptibility zones. The top 20% interval of landslide occurrence probability contained 89% of the historical landslides but accounted for only 10.3% of the total area. Our results indicate that the distribution of high susceptibility zones was more focused and contained fewer "stable" pixels. Therefore, the obtained susceptibility map is suitable for application to landslide risk management practices.
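The zonation step above clusters the per-pixel occurrence probabilities and takes class boundaries from the clusters. A minimal 1-D k-means sketch (assumed quantile seeding and midpoint cut-offs, not the paper's exact procedure; the probability values are invented) illustrates how such cut-offs can be derived:

```python
def kmeans_1d(values, k, iters=50):
    """Plain 1-D k-means; centroids seeded from evenly spaced quantiles (k >= 2)."""
    vals = sorted(values)
    cents = [vals[(len(vals) - 1) * i // (k - 1)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vals:
            clusters[min(range(k), key=lambda i: abs(v - cents[i]))].append(v)
        cents = [sum(c) / len(c) if c else cents[i] for i, c in enumerate(clusters)]
    return sorted(cents)

def cutoffs(values, k):
    """Zone boundaries = midpoints between adjacent cluster centroids."""
    c = kmeans_1d(values, k)
    return [(a + b) / 2 for a, b in zip(c, c[1:])]

# hypothetical landslide-occurrence probabilities for map pixels
probs = [0.05, 0.08, 0.1, 0.45, 0.5, 0.55, 0.85, 0.9, 0.95]
cuts = cutoffs(probs, 3)   # two cut-offs -> low / moderate / high zones
```

Pixels are then assigned to low, moderate, or high susceptibility depending on which side of the cut-off values their predicted probability falls.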
Abstract: To improve the generalization ability of binary decision trees, a new learning algorithm, the MMDT algorithm, is presented. Based on statistical learning theory, the generalization performance of binary decision trees is analyzed and an assessment rule is proposed. Under the direction of the assessment rule, the MMDT algorithm is implemented. The algorithm maps training examples from the original space to a high-dimensional feature space and constructs a decision tree in it. In the feature space, a new decision node splitting criterion, the max-min rule, is used, and the margin of each decision node is maximized using a support vector machine to improve generalization performance. Experimental results show that the new learning algorithm is clearly superior to others such as C4.5 and OC1.
Abstract: Under the modern education system of China, the annual scholarship evaluation is a vital matter for many college students. This paper adopts the C4.5 decision tree classification algorithm, an improvement on the ID3 algorithm, and constructs a dataset for the scholarship evaluation system through analysis of the related attributes in scholarship evaluation information. Through analysis and research of moral education, intellectual education, and culture & PE, it also identifies factors that play a significant role in the development of college students.
Abstract: Various process parameters exert different effects in the stamping process. To study the relationships among the process parameters of the box stamping process, including the blank holder force, friction coefficient, depth of drawbead, offset, and length of drawbead, the decision tree algorithm C4.5 was applied to generate a decision tree from the result data of the box stamping simulation. The design and improvement methods of the decision tree are presented. Potential and valuable rules were generated by traversing the decision tree, which plays an instructive role in practical design. The rules show that the correct combination of blank holder force and drawbead setting is the dominant contribution to controlling cracking and wrinkling in the box stamping process. To validate the rules, the stamping process for the box was also performed. The experimental results show good agreement with the generated rules.
Abstract: As a distributed computing platform, Hadoop provides an effective way to handle big data. In Hadoop, the completion time of a job can be delayed by a straggler. Although the definitive cause of a straggler is hard to detect, speculative execution is usually used to deal with this problem by simply backing up stragglers on alternative nodes. In this paper, we design a new speculative execution algorithm based on the C4.5 decision tree, SECDT, for Hadoop. In SECDT, we predict the completion times of stragglers and of backup tasks using the C4.5 decision tree method. We then compare the predicted completion times of the stragglers and the backup tasks, calculate their difference, and select the straggler with the maximum difference to start a backup task. Experimental results show that SECDT can predict execution time more accurately than other speculative execution methods and hence reduces job completion time.
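The selection rule described above, once the C4.5 model has produced time predictions, reduces to an argmax over the predicted time differences. A small sketch with invented task names and times:

```python
def pick_straggler(tasks):
    """Choose the task whose predicted remaining time exceeds its predicted
    backup-task time by the largest margin; return None if no backup would
    help. The time predictions themselves would come from the decision tree
    model, which this sketch does not reproduce."""
    diffs = {name: remaining - backup
             for name, (remaining, backup) in tasks.items()}
    name = max(diffs, key=diffs.get)
    return (name, diffs[name]) if diffs[name] > 0 else None

# hypothetical (predicted remaining time, predicted backup time) in seconds
tasks = {"map-07": (120, 40), "map-12": (90, 70), "reduce-03": (200, 60)}
choice = pick_straggler(tasks)   # reduce-03 gains the most from a backup
```

Here "reduce-03" is backed up first because duplicating it saves the most predicted time; if every task's backup were predicted to be slower than just waiting, no backup would be launched.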
Abstract: Recently, research on distributed data mining making use of the grid has become a trend. This paper introduces a data mining algorithm based on a distributed decision tree, which takes advantage of the conveniences and services supplied by the grid computing platform and can perform distributed classification data mining on the grid.
Funding: National Natural Science Foundation of China, Grant/Award Number: 62003267; Fundamental Research Funds for the Central Universities, Grant/Award Number: G2022KY0602; Technology on Electromagnetic Space Operations and Applications Laboratory, Grant/Award Number: 2022ZX0090; Key Core Technology Research Plan of Xi'an, Grant/Award Number: 21RGZN0016.
Abstract: To address the problem of manoeuvring decision-making in UAV air combat, this study establishes a one-to-one air combat model, defines missile attack areas, and uses the non-deterministic policy Soft Actor-Critic (SAC) algorithm from deep reinforcement learning to construct a decision model that realizes the manoeuvring process. The complexity of the proposed algorithm is calculated, and the stability of the closed-loop air combat decision-making system controlled by the neural network is analysed with a Lyapunov function. This study defines the UAV air combat process as a gaming process and proposes a Parallel Self-Play training SAC algorithm (PSP-SAC) to improve the generalisation performance of UAV control decisions. Simulation results show that the proposed algorithm can realize sample sharing and policy sharing in multiple combat environments and can significantly improve the generalisation ability of the model compared to independent training.
Funding: Supported by an unrestricted grant from Roche-Italia.
Abstract: Chronic hepatitis B and C, together with alcoholic and non-alcoholic fatty liver diseases, represent the major causes of progressive liver disease that can eventually evolve into cirrhosis and its end-stage complications, including decompensation, bleeding, and liver cancer. Formation and accumulation of fibrosis in the liver is the common pathway that leads to evolutive liver disease. Precise definition of the liver fibrosis stage is essential for managing the patient in clinical practice, since the presence of bridging fibrosis represents a strong indication for antiviral therapy in chronic viral hepatitis, while cirrhosis requires a specific follow-up including screening for esophageal varices and hepatocellular carcinoma. Liver biopsy has always represented the standard of reference for assessment of hepatic fibrosis, but it has some limitations, being invasive, costly, and prone to sampling errors. Recently, blood markers and instrumental methods have been proposed for the non-invasive assessment of liver fibrosis. However, there are still some doubts as to their implementation in clinical practice, and a real consensus on how and when to use them is not yet available. This is due to unsatisfactory accuracy for some of them and incomplete validation for others. Some studies suggest that the performance of non-invasive methods for liver fibrosis assessment may increase when they are combined. Combination algorithms of non-invasive methods may represent a rational and reliable approach to implementing non-invasive assessment of liver fibrosis in clinical practice and to reducing, rather than abolishing, liver biopsies.
Funding: This work was supported in part by the National Natural Science Foundation of China (61601418, 41602362, 61871259), in part by the Opening Foundation of the Hunan Engineering and Research Center of Natural Resource Investigation and Monitoring (2020-5), in part by the Qilian Mountain National Park Research Center (Qinghai) (grant number: GKQ2019-01), and in part by the Geomatics Technology and Application Key Laboratory of Qinghai Province (Grant No. QHDX-2019-01).
Abstract: This work aimed to generate landslide susceptibility maps for the Three Gorges Reservoir (TGR) area, China using different machine learning models. Three advanced machine learning methods, namely gradient boosting decision tree (GBDT), random forest (RF), and information value (InV) models, were used, and their performances were assessed and compared. In total, 202 landslides were mapped using a series of field surveys, aerial photographs, and reviews of historical and bibliographical data. Nine causative factors were then considered in landslide susceptibility map generation using the GBDT, RF, and InV models. All of the causative-factor maps were resampled to a resolution of 28.5 m. Of the 486,289 pixels in the area, 28,526 were landslide pixels and 457,763 were non-landslide pixels. Finally, landslide susceptibility maps were generated with the three machine learning models, and their performances were assessed through receiver operating characteristic (ROC) curves, sensitivity, specificity, overall accuracy (OA), and the kappa coefficient (KAPPA). The results showed that the GBDT, RF, and InV models overall produced reasonably accurate landslide susceptibility maps. Among the three methods, the GBDT method outperformed the other two, and it can provide strong technical support for producing landslide susceptibility maps in the TGR area.
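The ROC AUC used for model comparison in several of these abstracts has a simple rank-based form: the probability that a randomly chosen positive (landslide) pixel receives a higher score than a randomly chosen negative one. A stdlib-only sketch, with invented scores and labels:

```python
def auc(scores, labels):
    """Mann-Whitney form of ROC AUC: fraction of positive/negative pairs
    ranked correctly, counting ties as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# hypothetical susceptibility scores for labelled pixels (1 = landslide)
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1,   1,   0,   1,   0,   0]
area = auc(scores, labels)   # 8 of 9 pos/neg pairs are ranked correctly
```

An AUC of 1.0 means every landslide pixel outscores every non-landslide pixel; 0.5 is chance level, which is why the reported values of 0.75 to 0.88 are meaningful separations.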