Traditional 3Ni weathering steel cannot fully meet the requirements of offshore engineering development, so designing novel 3Ni steels with added microalloying elements such as Mn or Nb for strength enhancement has become a trend. The stress-assisted corrosion behavior of a newly designed high-strength 3Ni steel was investigated in the current study using the corrosion big data method. Information on the corrosion process was recorded using the galvanic corrosion current monitoring method. The gradient boosting decision tree (GBDT) machine learning method was used to mine the corrosion mechanism, and the importance of the structure factor was investigated. Field exposure tests were conducted to verify the results calculated using the GBDT method. The results indicated that the GBDT method can be effectively used to study the influence of structural factors on the corrosion process of 3Ni steel. The different mechanisms by which Mn and Cu additions affect the stress-assisted corrosion of 3Ni steel suggested that Mn and Cu have no obvious effect on the corrosion rate of non-stressed 3Ni steel during the early stage of corrosion. When the corrosion reached a stable state, increasing the Mn content increased the corrosion rate of 3Ni steel, while Cu reduced this rate. In the presence of stress, both an increase in Mn content and Cu addition can inhibit the corrosion process. The corrosion law of outdoor-exposed 3Ni steel is consistent with the law based on corrosion big data technology, verifying the reliability of the big data evaluation method and data prediction model selection.
To enhance the accuracy and efficiency of bridge damage identification, a novel data-driven damage identification method was proposed. First, a convolutional autoencoder (CAE) was used to extract key features from the acceleration signal of the bridge structure through data reconstruction. The extreme gradient boosting tree (XGBoost) was then used to analyze the feature data to achieve damage detection with high accuracy and high performance. The proposed method was applied in a numerical simulation study on a three-span continuous girder and further validated experimentally on a scaled model of a cable-stayed bridge. The numerical simulation results show that the identification errors remain within 2.9% for six single-damage cases and within 3.1% for four double-damage cases. The experimental validation results demonstrate that when the tension in a single cable of the cable-stayed bridge decreases by 20%, the method accurately identifies damage at different cable locations using only sensors installed on the main girder, achieving identification accuracies above 95.8% in all cases. The proposed method shows high identification accuracy and generalization ability across various damage scenarios.
BACKGROUND: Development of distant metastasis (DM) is a major concern during treatment of nasopharyngeal carcinoma (NPC). However, studies have demonstrated improved distant control and survival in patients with advanced NPC with the addition of chemotherapy to concomitant chemoradiotherapy. Therefore, precise prediction of metastasis in patients with NPC is crucial. AIM: To develop a predictive model for metastasis in NPC using detailed magnetic resonance imaging (MRI) reports. METHODS: This retrospective study included 792 patients with non-distant metastatic NPC. A total of 469 imaging variables were obtained from detailed MRI reports. Data were stratified and randomly split into training (50%) and testing sets. Gradient boosting tree (GBT) models were built and used to select variables for predicting DM. A full model comprising all variables and a reduced model with the top-five variables were built. Model performance was assessed by area under the curve (AUC). RESULTS: Among the 792 patients, 94 developed DM during follow-up. The number of metastatic cervical nodes (30.9%), tumor invasion in the posterior half of the nasal cavity (9.7%), two sides of the pharyngeal recess (6.2%), tubal torus (3.3%), and single side of the parapharyngeal space (2.7%) were the top-five contributors for predicting DM, based on their relative importance in GBT models. The testing AUC of the full model was 0.75 (95% confidence interval [CI]: 0.69-0.82). The testing AUC of the reduced model was 0.75 (95% CI: 0.68-0.82). For the whole dataset, the full (AUC = 0.76, 95% CI: 0.72-0.82) and reduced models (AUC = 0.76, 95% CI: 0.71-0.81) outperformed the tumor node-staging system (AUC = 0.67, 95% CI: 0.61-0.73). CONCLUSION: The GBT model outperformed the tumor node-staging system in predicting metastasis in NPC. The number of metastatic cervical nodes was identified as the principal contributing variable.
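The AUC used above to compare models can be read as the probability that a randomly chosen patient who developed DM receives a higher predicted score than one who did not. The following is a minimal pure-Python sketch of that pairwise definition; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the study's data or code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (e.g. developed DM), 0 otherwise.
    scores: predicted risk scores, higher meaning more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): three of the four positive/negative
# pairs are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))
```

The quadratic pair loop is chosen for readability; for cohorts of hundreds of patients a rank-based formula gives the same value with less work.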
Accurate prediction of monthly oil and gas production is essential for oil enterprises to make reasonable production plans, avoid blind investment, and realize sustainable development. Traditional oil well production trend prediction methods are based on years of oil field production experience and expertise, and their application conditions are very demanding. With the rapid development of artificial intelligence technology, big data analysis methods are gradually being applied in various sub-fields of oil and gas reservoir development. Based on the data-driven artificial intelligence algorithm Gradient Boosting Decision Tree (GBDT), this paper predicts the initial single-layer production by considering geological data, fluid PVT data, and well data. The results show that the GBDT prediction model has high accuracy, significantly improved efficiency, and strong universal applicability. The GBDT model trained in this paper can predict production, which is helpful for well site optimization, perforation layer optimization, and engineering parameter optimization, and has guiding significance for oilfield development.
Protein-protein interactions (PPIs) are of great importance for understanding genetic mechanisms, delineating disease pathogenesis, and guiding drug design. With the increase of PPI data and the development of machine learning technologies, prediction and identification of PPIs have become a research hotspot in proteomics. In this study, we propose a new prediction pipeline for PPIs based on gradient tree boosting (GTB). First, the initial feature vector is extracted by fusing pseudo amino acid composition (PseAAC), pseudo position-specific scoring matrix (PsePSSM), reduced sequence and index-vectors (RSIV), and autocorrelation descriptor (AD). Second, to remove redundancy and noise, we employ L1-regularized logistic regression (L1-RLR) to select an optimal feature subset. Finally, the GTB-PPI model is constructed. Five-fold cross-validation showed that GTB-PPI achieved accuracies of 95.15% and 90.47% on the Saccharomyces cerevisiae and Helicobacter pylori datasets, respectively. In addition, GTB-PPI could be applied to predict the independent test datasets for Caenorhabditis elegans, Escherichia coli, Homo sapiens, and Mus musculus, the one-core PPI network for CD9, and the crossover PPI network for the Wnt-related signaling pathways. The results show that GTB-PPI can significantly improve the accuracy of PPI prediction. The code and datasets of GTB-PPI can be downloaded from https://github.com/QUST-AIBBDRC/GTB-PPI/.
To investigate the travel time prediction method of the freeway, a model based on the gradient boosting decision tree (GBDT) is proposed. Eleven variables (namely, travel time in current period T_i, traffic flow in current period Q_i, speed in current period V_i, density in current period K_i, the number of vehicles in current period N_i, occupancy in current period R_i, traffic state parameter in current period X_i, travel time in previous time period T_{i-1}, etc.) are selected to predict the travel time 10 min ahead in the proposed model. Data obtained from VISSIM simulation are used to train and test the model. The results demonstrate that the prediction error of the GBDT model is smaller than those of the back propagation (BP) neural network model and the support vector machine (SVM) model. Travel time in current period T_i is the most important variable among all variables in the GBDT model. The GBDT model can produce more accurate prediction results and mine the hidden nonlinear relationships between the variables and the predicted travel time.
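The boosting idea behind a GBDT regressor of this kind can be sketched in a few lines: start from the mean target, then repeatedly fit a small tree to the current residuals and add a damped copy of it to the prediction. The sketch below is a simplified pure-Python illustration using depth-1 trees (stumps) and squared-error loss. The function names, the two toy features (speed and flow), and the toy travel times are assumptions for illustration only, not the paper's model, variables, or VISSIM data.

```python
def fit_stump(rows, residuals):
    """Find the single feature/threshold split that minimizes squared error."""
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [e for r, e in zip(rows, residuals) if r[j] <= thr]
            right = [e for r, e in zip(rows, residuals) if r[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((e - lmean) ** 2 for e in left)
                   + sum((e - rmean) ** 2 for e in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    if best is None:
        raise ValueError("need at least two distinct values in some feature")
    return best[1:]  # (feature index, threshold, left value, right value)


def stump_value(stump, row):
    j, thr, lval, rval = stump
    return lval if row[j] <= thr else rval


def fit_gbdt(rows, y, n_rounds=100, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump fits the residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * stump_value(stump, r)
                for p, r in zip(pred, rows)]
    return base, learning_rate, stumps


def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_value(s, row) for s in stumps)


# Hypothetical toy data: [speed (km/h), flow (veh/min)] -> travel time (min).
toy_rows = [[60.0, 10.0], [55.0, 20.0], [30.0, 40.0], [25.0, 50.0]]
toy_times = [5.0, 5.5, 9.0, 10.0]
toy_model = fit_gbdt(toy_rows, toy_times)
print(predict(toy_model, [28.0, 45.0]))
```

Counting how often each feature index is chosen across the stumps gives a crude importance ranking, which is the kind of information behind statements such as "T_i is the most important variable"; production libraries use deeper trees and gain-based importance instead.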
This paper aims to design an optimizer followed by a Kawahara filter for optimal classification and prediction of employees' performance. The algorithm starts by processing data with a modified K-means technique as a hierarchical clustering method to quickly obtain the features of employees that best support their performance. The work of this paper consists of two parts. The first part is based on collecting employee data to calculate and illustrate the performance of each employee. The second part is based on classification and prediction techniques for employee performance. This model is designed to help companies in their decisions about employees' performance. The classification and prediction algorithms use the Gradient Boosting Tree classifier to classify and predict the features. The results give the percentage of employees who are expected to leave the company after predicting their performance for the coming years. The results also show that the Grasshopper Optimization, followed by the Kawahara filter ("KF") with the Gradient Boosting Tree as classifier and predictor, is characterized by high accuracy. The proposed algorithm is compared with other known techniques, and our results are found to be superior.
Cable-stayed bridges have been widely used in high-speed railway infrastructure. The accurate determination of a cable's representative temperatures is vital during the intricate processes of design, construction, and maintenance of cable-stayed bridges. However, the representative temperatures of stay cables are not specified in the existing design codes. To address this issue, this study investigates the distribution of the cable temperature and determines its representative temperature. First, an experimental investigation spanning one year was carried out near the bridge site to obtain the temperature data. Statistical analysis of the measured data reveals that the temperature distribution is generally uniform along the cable cross-section, without a significant temperature gradient. Then, based on the limited data, the Monte Carlo, gradient boosted regression trees (GBRT), and univariate linear regression (ULR) methods are employed to predict the cable's representative temperature throughout the service life. These methods effectively overcome the limitations of insufficient monitoring data and accurately predict the representative temperature of the cables. However, each method has its own advantages and limitations in terms of applicability and accuracy. A comprehensive evaluation of the performance of these methods is conducted, and practical recommendations are provided for their application. The proposed methods and representative temperatures provide a good basis for the operation and maintenance of in-service long-span cable-stayed bridges.
The stability of underground entry-type excavations directly affects the working environment and the safety of staff. Empirical critical span graphs and traditional statistical learning methods cannot meet the high accuracy required for stability assessment of entry-type excavations. Therefore, this study proposes a new prediction method based on machine learning to scientifically adjust the critical span graph. Accordingly, the particle swarm optimization (PSO) algorithm is used to optimize the core parameters of the gradient boosting decision tree (GBDT), abbreviated as PSO-GBDT. Moreover, eight other classifiers, including GBDT, k-nearest neighbors (KNN), two kinds of support vector machines (SVM), Gaussian naive Bayes (GNB), logistic regression (LR), and linear discriminant analysis (LDA), are applied for comparison with the proposed model. The findings reveal that, compared with the other eight models, the prediction performance of PSO-GBDT is the most reliable, with a classification accuracy of up to 0.93. Therefore, this model has great potential to provide a more scientific and accurate choice for the stability prediction of underground excavations. In addition, each classification model is used to predict the stability category of several grid points divided by the critical span graph, and the updated critical span graph of each model is discussed in combination with previous studies. The results show that the PSO-GBDT model is scientific, accurate, and efficient in updating the critical span graph, and its output decision boundary has strict theoretical support, which can help mine operators make favorable economic decisions.
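The PSO step in a PSO-GBDT pipeline of this kind can be sketched as a swarm of candidate hyperparameter vectors that move toward their own best positions and the swarm's best position. The sketch below is a generic pure-Python PSO, not the study's implementation. The objective `toy_validation_error` is a hypothetical stand-in for the cross-validated error of a GBDT as a function of (learning rate, tree depth), and the inertia and acceleration coefficients are common textbook defaults rather than values from the paper.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) inside box bounds with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Keep the particle inside the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    """Hypothetical error surface with its minimum at lr=0.1, depth=4."""
    learning_rate, depth = params
    return (learning_rate - 0.1) ** 2 + ((depth - 4.0) / 10.0) ** 2


# Search learning rate in [0.01, 0.5] and (continuous) depth in [1, 10].
best_params, best_error = pso_minimize(
    toy_validation_error, [(0.01, 0.5), (1.0, 10.0)])
print(best_params, best_error)
```

In a real pipeline the objective would train and validate a classifier for each candidate, and integer-valued parameters such as tree depth would be rounded before use.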
It is easy for teenagers to view pornographic pictures on social networks. Many researchers have studied the detection of real pornographic pictures, but there are few studies on those that are artificial. In this work, we studied how to detect artificial pornographic pictures, especially on social networks. The whole detection process can be divided into two stages: feature selection and picture detection. In the feature selection stage, seven types of features that favour picture detection were selected. The picture detection stage included three steps. 1) To alleviate the imbalance between the numbers of artificial pornographic pictures and normal ones, the training dataset of artificial pornographic pictures was expanded, so the features extracted from the training dataset were expanded as well. 2) To reduce the time of feature extraction, a fast method was proposed that extracts features from a proportionally scaled picture rather than the original one. 3) Three tree models were compared, and a gradient boosting decision tree (GBDT) was selected for the final picture detection. Three sets of experimental results show that the proposed method can achieve better recognition precision and drastically reduce the time cost.
To improve the accuracy of target intent recognition, a recognition method based on the XGBoost (eXtreme Gradient Boosting) decision tree is proposed. This paper uses relevant data and a Python program to calculate the probability of tactical intention. The sequence intention probability is then obtained by applying the Dempster-Shafer rule of combination. To verify the accuracy of the recognition results, we compare the experimental results of this paper with results in the literature. The experiment shows that the probability of tactical intention recognition is improved by this method, so the method is feasible.
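The Dempster-Shafer rule of combination mentioned above fuses two bodies of evidence by multiplying the masses of every pair of hypotheses, assigning the product to their intersection, and renormalizing by the non-conflicting mass. The following pure-Python sketch shows the rule itself; the intent labels "attack" and "recon" and the mass values are hypothetical, and how the paper maps XGBoost outputs to masses is not specified here.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass,
    and the masses of each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory pairs
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


# Hypothetical intent probabilities from two consecutive time steps.
ATTACK = frozenset({"attack"})
RECON = frozenset({"recon"})
EITHER = ATTACK | RECON  # mass left on "cannot tell"

step1 = {ATTACK: 0.6, RECON: 0.3, EITHER: 0.1}
step2 = {ATTACK: 0.7, RECON: 0.2, EITHER: 0.1}
fused = combine_dempster(step1, step2)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

For a sequence of more than two observations the rule can be applied repeatedly, folding each new mass function into the running result.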
Epilepsy is a very common neurological disorder worldwide that can affect a person's quality of life at any age. People with epilepsy typically have recurrent seizures that can lead to injury or, in some cases, even death. Curing epilepsy requires risky surgery. Otherwise, the patient may be subjected to long-term drug treatment combined with lifestyle advice, without any guarantee of total recovery. However, regardless of the type of treatment, late treatment necessarily creates psychological instability in the patient. It is therefore important to diagnose the disease as early as possible so that the patient does not suffer from its consequences for their mental health. The study therefore aims to propose a model for detecting epilepsy as early as possible, especially in newborns, using electroencephalogram signal data from 10 newborns. The model, developed using the extra trees classifier technique, can predict epilepsy in infants with an accuracy of around 99.4%.
This paper presents a hybrid ensemble classifier combining the synthetic minority oversampling technique (SMOTE), the random search (RS) hyper-parameter optimization algorithm, and the gradient boosting tree (GBT) to achieve efficient and accurate rock trace identification. A thirteen-dimensional database consisting of basic, vector, and discontinuity features is established from image samples. All data points are classified as either "trace" or "non-trace" to divide the ultimate results into candidate trace samples. It is found that the SMOTE technique can effectively improve classification performance, with a recommended optimized imbalance ratio of 1:5 to 1:4. Then, sixteen classifiers generated from four basic machine learning (ML) models are applied for performance comparison. The results reveal that the proposed RS-SMOTE-GBT classifier outperforms the other fifteen hybrid ML algorithms for both trace and non-trace classification. Finally, feature importance, generalization ability, and classification error are discussed for the proposed classifier. The experimental results indicate that the more critical features affecting trace classification come primarily from the discontinuity features. In addition, cleaning up the sedimentary pumice and reducing the area of fractured rock contribute to improving the overall classification performance. The proposed method provides a new alternative approach for the identification of 3D rock traces.
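The SMOTE idea referred to above is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below is a simplified pure-Python version for illustration, not the paper's implementation. It assumes that the quoted ratios of 1:5 to 1:4 mean minority ("trace") to majority ("non-trace") counts, and the helper names and toy points are hypothetical.

```python
import math
import random


def samples_needed(n_minority, n_majority, target_ratio):
    """Synthetic samples required so that minority/majority >= target_ratio."""
    return max(0, math.ceil(n_majority * target_ratio - n_minority))


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating toward neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        mate = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append([a + gap * (b - a) for a, b in zip(base, mate)])
    return synthetic


# Hypothetical counts: 50 trace samples against 1000 non-trace samples,
# oversampled to a 1:4 ratio.
n_extra = samples_needed(50, 1000, 1 / 4)
toy_trace_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(toy_trace_points, n_extra)
print(n_extra, len(new_points))
```

Because every synthetic point lies on a segment between two real minority points, the oversampled class stays inside the region already occupied by the minority data.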
In loose and fractured coal seams with particularly low uniaxial compressive strength (UCS), driving a roadway is extremely difficult because roof falling and wall spalling occur frequently. To address this issue, the jet grouting (JG) technique (high-pressure grout mixed with coal particles) was first introduced in this study to improve the self-supporting ability of the coal mass. To evaluate the strength of the jet-grouted coal-grout composite (JG composite), the UCS evolution patterns were analyzed by preparing 405 specimens combining the influential variables of grout type, curing time, and coal-to-grout (C/G) ratio. Furthermore, the relationships between UCS and these influencing variables were modeled using ensemble learning methods, i.e., gradient boosted regression tree (GBRT) and random forest (RF), with their hyperparameters tuned by particle swarm optimization (PSO). The results showed that the chemical grout composite has higher short-term strength, while the cement grout composite can achieve more stable strength in the long term. The PSO-GBRT and PSO-RF models can both achieve high prediction accuracy. Also, the variable importance analysis demonstrated that the grout type and curing time should be considered carefully. This study provides a robust intelligent model for predicting the UCS of JG composites, which supports JG design in the field.
y consumption efficiency and to increase the crop yield. With the increase of agricultural data generated by the Internet of Things (IoT), more feasible models are necessary to make full use of such information. In this research, a Gradient Boosting Decision Tree (GBDT) model based on the newly developed Light Gradient Boosting Machine algorithm (LightGBM or LGBM) was proposed to model the internal temperature of a greenhouse. Features including climate variables, control variables, and additional temporal information collected over five years were used to construct a suitable dataset to train and validate the LGBM model. As a novelty, an adaptive cross-validation method was developed to improve the LGBM model's performance and self-adaptive ability. For comparison of predictive accuracy, a Back-Propagation (BP) Neural Network model and a Recurrent Neural Network (RNN) model were built under the same process. Two other GBDT algorithms, Extreme Gradient Boosting (XGBoost) and Stochastic Gradient Boosting (SGB), were also introduced to compare predictive accuracy with the LGBM model. The results suggest that the LGBM has the best fitting ability for the temperature curves, with an RMSE of 0.645 °C, as well as the fastest training speed among all algorithms, 60 times faster than the two neural network algorithms. The LGBM has strong application prospects in both greenhouse environment prediction and real-time predictive control.
A recommender system is a tool that suggests items to users based on the extensive history of user feedback. Although it is an emerging research area in both academia and industry, it suffers from sparsity, scalability, and cold-start problems. This paper addresses the sparsity and scalability problems of model-based collaborative recommender systems using an ensemble learning approach and an enhanced clustering algorithm for movie recommendation. An effective movie recommendation system is proposed based on the Classification and Regression Tree (CART) algorithm, an enhanced Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) algorithm, and a truncation method. New hyperparameter tuning is added to the BIRCH algorithm to enhance the cluster formation process, and the resulting algorithm is named enhanced BIRCH. The proposed model yields quality movie recommendations for new users using gradient boosting classification with broad coverage. The proposed model is tested on the MovieLens dataset, and its performance is evaluated by means of Mean Absolute Error (MAE), precision, recall, and F-measure. The experimental results showed the superiority of the proposed model in movie recommendation compared with existing models. The proposed model obtained MAE values of 0.52 and 0.57 on the MovieLens 100k and 1M datasets, respectively. Further, the proposed model obtained a precision of 0.83, a recall of 0.86, and an F-measure of 0.86 on the MovieLens 100k dataset, which compare favorably with existing models in movie recommendation.
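The evaluation measures named above have simple definitions: MAE averages the absolute rating errors, while precision, recall, and F-measure compare the set of recommended items with the set of items the user actually found relevant. A minimal pure-Python sketch follows; the function names, ratings, and item IDs are made up for illustration and are not MovieLens results.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between true and predicted ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def precision_recall_f1(relevant, recommended):
    """Set-based precision, recall, and F-measure for one user."""
    relevant, recommended = set(relevant), set(recommended)
    hits = len(relevant & recommended)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if hits else 0.0
    return precision, recall, f1


# Hypothetical ratings on a 1-5 scale and hypothetical movie IDs.
print(mean_absolute_error([4, 3, 5], [3.5, 3, 4]))
print(precision_recall_f1(relevant={1, 2, 3, 4}, recommended={2, 3, 5}))
```

Reported figures for a whole dataset are typically averages of these per-user or per-rating values over a held-out test split.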
The agricultural sector's day-to-day operations, such as irrigation and sowing, are affected by the weather, and weather plays a key role in all regular human activities. Weather forecasting must be accurate and precise so that we can plan our activities and safeguard ourselves and our property from disasters. Rainfall, wind speed, humidity, wind direction, cloud cover, temperature, and other variables are used in this work for weather prediction. Much research has been conducted on weather forecasting, but existing approaches are less effective, inaccurate, and time-consuming. To overcome these issues, this paper proposes an enhanced and reliable weather forecasting technique, which also supports weather forecasting in remote areas. Weather data analysis and machine learning techniques, such as Gradient Boosting Decision Tree, Random Forest, Bernoulli Naive Bayes, and the KNN algorithm, are deployed to anticipate weather conditions. A comparative analysis of the outcomes aids in determining the number of ensemble methods that may be utilized to improve prediction accuracy in weather forecasting. The aim of this study is to demonstrate the technique's ability to produce weather forecasts as quickly as possible. Experimental evaluation shows that our ensemble technique achieves 95% prediction accuracy. In addition, prediction takes less than 10 s for 1000 nodes and less than 40 s for 5000 nodes.
To address the personalized movie recommendation problem, a recommendation algorithm integrating manifold learning and ensemble learning is studied. In this work, manifold learning is used to reduce the dimension of the data so that both the time and space complexities of the model are reduced. Meanwhile, the gradient boosting decision tree (GBDT) is used to train the target user profile prediction model. Based on the recommendation results, a Bayesian optimization algorithm is applied to optimize the recommendation model, which can effectively improve the prediction accuracy. The experimental results show that the proposed algorithm can improve the accuracy of movie recommendation.
Churn prediction is a common task for machine learning applications in business. In this paper, the task is adapted to address the low completion rate of massive open online courses (only 5% of all students finish their course). The approach is demonstrated on the course "Methods and algorithms of the graph theory" held on the national platform of online education in Russia. This paper covers all the steps needed to build an intelligent system that predicts which students are active during the course but unlikely to finish it. The first part consists of constructing the right sample for prediction, exploratory data analysis (EDA), and choosing the most appropriate week of the course on which to make predictions. The second part is about choosing the right metric and building models. An approach using ensembles such as stacking is also proposed to increase prediction accuracy. As a result, a general approach to building a churn prediction model for an online course is reviewed. This approach can be used to make the process of online education adaptive and intelligent for the individual student.
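Traffic light locations are where pedestrians and vehicles meet on the road; they strongly influence road structure and traffic operation and act as important hubs in urban road networks. To address the insufficient accuracy and incomplete area coverage of current traffic light location detection methods, a traffic light location detection method based on trajectory data is proposed. The method is built on a four-stage clustering-merging-classification-merging model. First, implicit vehicle driving features are extracted from cleaned trajectory data, and density-based spatial clustering of applications with noise (DBSCAN) is used to obtain two types of cluster centers, turning and stopping; these two types of cluster centers are merged to obtain candidate traffic light locations. Next, traffic flow features of each region are extracted from the trajectory points within a certain range of each candidate location, and the gradient boosting decision tree (GBDT) algorithm is used for classification. Finally, the positive candidate locations are fused to detect the traffic light locations. Experiments on floating-car GPS trajectory data from Chengdu give an F1 score of 0.947, outperforming conventional machine learning methods. The experimental results show that, based on GPS trajectory data, the proposed four-stage model can effectively detect traffic light locations and can be used for the rapid acquisition and updating of traffic light location information over large urban areas.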
Funding (3Ni weathering steel corrosion study): supported by the National Natural Science Foundation of China (No. 52203376) and the National Key Research and Development Program of China (No. 2023YFB3813200).
Funding (bridge damage identification study): The National Natural Science Foundation of China (Nos. 52361165658, 52378318, 52078459).
文摘To enhance the accuracy and efficiency of bridge damage identification,a novel data-driven damage identification method was proposed.First,convolutional autoencoder(CAE)was used to extract key features from the acceleration signal of the bridge structure through data reconstruction.The extreme gradient boosting tree(XGBoost)was then used to perform analysis on the feature data to achieve damage detection with high accuracy and high performance.The proposed method was applied in a numerical simulation study on a three-span continuous girder and further validated experimentally on a scaled model of a cable-stayed bridge.The numerical simulation results show that the identification errors remain within 2.9%for six single-damage cases and within 3.1%for four double-damage cases.The experimental validation results demonstrate that when the tension in a single cable of the cable-stayed bridge decreases by 20%,the method accurately identifies damage at different cable locations using only sensors installed on the main girder,achieving identification accuracies above 95.8%in all cases.The proposed method shows high identification accuracy and generalization ability across various damage scenarios.
Abstract: BACKGROUND: Development of distant metastasis (DM) is a major concern during treatment of nasopharyngeal carcinoma (NPC). However, studies have demonstrated improved distant control and survival in patients with advanced NPC with the addition of chemotherapy to concomitant chemoradiotherapy. Therefore, precise prediction of metastasis in patients with NPC is crucial. AIM: To develop a predictive model for metastasis in NPC using detailed magnetic resonance imaging (MRI) reports. METHODS: This retrospective study included 792 patients with non-distant metastatic NPC. A total of 469 imaging variables were obtained from detailed MRI reports. Data were stratified and randomly split into training (50%) and testing sets. Gradient boosting tree (GBT) models were built and used to select variables for predicting DM. A full model comprising all variables and a reduced model with the top-five variables were built. Model performance was assessed by area under the curve (AUC). RESULTS: Among the 792 patients, 94 developed DM during follow-up. The number of metastatic cervical nodes (30.9%), tumor invasion in the posterior half of the nasal cavity (9.7%), two sides of the pharyngeal recess (6.2%), tubal torus (3.3%), and single side of the parapharyngeal space (2.7%) were the top-five contributors for predicting DM, based on their relative importance in GBT models. The testing AUC of the full model was 0.75 (95% confidence interval [CI]: 0.69-0.82). The testing AUC of the reduced model was 0.75 (95% CI: 0.68-0.82). For the whole dataset, the full (AUC = 0.76, 95% CI: 0.72-0.82) and reduced models (AUC = 0.76, 95% CI: 0.71-0.81) outperformed the tumor node-staging system (AUC = 0.67, 95% CI: 0.61-0.73). CONCLUSION: The GBT model outperformed the tumor node-staging system in predicting metastasis in NPC. The number of metastatic cervical nodes was identified as the principal contributing variable.
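The AUC used above to compare models can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that pairwise definition; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = developed metastasis, 0 = did not.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.6]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The quadratic pairwise loop is chosen for readability; a rank-based formulation gives the same value more efficiently on large cohorts.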
Abstract: Accurate prediction of monthly oil and gas production is essential for oil enterprises to make reasonable production plans, avoid blind investment and realize sustainable development. Traditional oil well production trend prediction methods are based on years of oil field production experience and expertise, and their application conditions are very demanding. With the rapid development of artificial intelligence technology, big data analysis methods are gradually being applied in various sub-fields of oil and gas reservoir development. Based on the data-driven artificial intelligence algorithm Gradient Boosting Decision Tree (GBDT), this paper predicts the initial single-layer production by considering geological data, fluid PVT data and well data. The results show that the GBDT prediction model has high accuracy, significantly improved efficiency and strong universal applicability. The GBDT model trained in this paper can predict production, which is helpful for well site optimization, perforation layer optimization and engineering parameter optimization, and has guiding significance for oilfield development.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61863010), the Key Research and Development Program of Shandong Province of China (Grant No. 2019GGX101001), and the Natural Science Foundation of Shandong Province of China (Grant No. ZR2018MC007).
Abstract: Protein-protein interactions (PPIs) are of great importance for understanding genetic mechanisms, delineating disease pathogenesis, and guiding drug design. With the increase of PPI data and the development of machine learning technologies, prediction and identification of PPIs have become a research hotspot in proteomics. In this study, we propose a new prediction pipeline for PPIs based on gradient tree boosting (GTB). First, the initial feature vector is extracted by fusing pseudo amino acid composition (PseAAC), pseudo position-specific scoring matrix (PsePSSM), reduced sequence and index-vectors (RSIV), and autocorrelation descriptor (AD). Second, to remove redundancy and noise, we employ L1-regularized logistic regression (L1-RLR) to select an optimal feature subset. Finally, the GTB-PPI model is constructed. Five-fold cross-validation showed that GTB-PPI achieved accuracies of 95.15% and 90.47% on the Saccharomyces cerevisiae and Helicobacter pylori datasets, respectively. In addition, GTB-PPI could be applied to predict the independent test datasets for Caenorhabditis elegans, Escherichia coli, Homo sapiens, and Mus musculus, the one-core PPI network for CD9, and the crossover PPI network for the Wnt-related signaling pathways. The results show that GTB-PPI can significantly improve the accuracy of PPI prediction. The code and datasets of GTB-PPI can be downloaded from https://github.com/QUST-AIBBDRC/GTB-PPI/.
Funding: The National Natural Science Foundation of China (Nos. 51478114, 51778136).
Abstract: To investigate travel time prediction for freeways, a model based on the gradient boosting decision tree (GBDT) is proposed. Eleven variables (namely, travel time in the current period $T_i$, traffic flow in the current period $Q_i$, speed in the current period $V_i$, density in the current period $K_i$, the number of vehicles in the current period $N_i$, occupancy in the current period $R_i$, traffic state parameter in the current period $X_i$, travel time in the previous period $T_{i-1}$, etc.) are selected to predict the travel time 10 min ahead in the proposed model. Data obtained from VISSIM simulation are used to train and test the model. The results demonstrate that the prediction error of the GBDT model is smaller than those of the back propagation (BP) neural network model and the support vector machine (SVM) model. Travel time in the current period $T_i$ is the most important variable among all variables in the GBDT model. The GBDT model can produce more accurate predictions and mine the hidden nonlinear relationships between the variables and the predicted travel time.
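The core idea of gradient boosting with squared-error loss, as used for regression tasks like the one above, is to fit each new tree to the residuals of the current ensemble and add it with a small learning rate. The sketch below illustrates this with depth-1 trees (stumps) on a single feature; the function names, the toy data and the hyperparameters are all assumptions made for illustration, and it is not the eleven-variable model from the paper.

```python
def fit_stump(xs, residuals):
    """Least-squares regression stump on one feature: (threshold, left_mean, right_mean)."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1], best[2], best[3]


def fit_gbdt(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boosting loop: start from the mean, then repeatedly fit stumps to residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient of squared loss
        thr, lm, rm = fit_stump(xs, residuals)
        stumps.append((thr, lm, rm))
        preds = [p + learning_rate * (lm if x <= thr else rm) for x, p in zip(xs, preds)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, lr = model
    return base + sum(lr * (lm if x <= thr else rm) for thr, lm, rm in stumps)


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Hypothetical toy data: one traffic feature vs. travel time (min).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [10.0, 10.5, 11.0, 12.0, 14.0, 17.0, 21.0, 26.0]
model = fit_gbdt(xs, ys)
mse_base = mse(ys, [sum(ys) / len(ys)] * len(ys))
mse_model = mse(ys, [predict(model, x) for x in xs])
print(mse_base, mse_model)
```

A production model would use full multi-feature trees and a held-out test set; the stump version only shows how the residual-fitting loop reduces training error round by round.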
Abstract: This paper aims to design an optimizer followed by a Kawahara filter for optimal classification and prediction of employees' performance. The algorithm starts by processing data with a modified K-means technique as a hierarchical clustering method to quickly obtain the features of employees that best explain their performance. The work of this paper consists of two parts. The first part is based on collecting employee data to calculate and illustrate the performance of each employee. The second part is based on the classification and prediction techniques for employee performance. This model is designed to help companies in their decisions about employees' performance. The classification and prediction algorithms use the Gradient Boosting Tree classifier to classify and predict the features. The results give the percentage of employees who are expected to leave the company after predicting their performance for the coming years. Results also show that Grasshopper Optimization, followed by "KF" with the Gradient Boosting Tree as classifier and predictor, is characterized by high accuracy. The proposed algorithm is compared with other known techniques, and our results are found to be superior.
Funding: Project (2017G006-N) supported by the Project of Science and Technology Research and Development Program of China Railway Corporation.
Abstract: Cable-stayed bridges have been widely used in high-speed railway infrastructure. The accurate determination of a cable's representative temperature is vital during the intricate processes of design, construction, and maintenance of cable-stayed bridges. However, the representative temperatures of stayed cables are not specified in the existing design codes. To address this issue, this study investigates the distribution of the cable temperature and determines its representative temperature. First, an experimental investigation spanning one year was carried out near the bridge site to obtain the temperature data. Statistical analysis of the measured data reveals that the temperature distribution is generally uniform along the cable cross-section, without a significant temperature gradient. Then, based on the limited data, the Monte Carlo, gradient boosted regression trees (GBRT), and univariate linear regression (ULR) methods are employed to predict the cable's representative temperature throughout the service life. These methods effectively overcome the limitations of insufficient monitoring data and accurately predict the representative temperature of the cables. However, each method has its own advantages and limitations in terms of applicability and accuracy. A comprehensive evaluation of the performance of these methods is conducted, and practical recommendations are provided for their application. The proposed methods and representative temperatures provide a good basis for the operation and maintenance of in-service long-span cable-stayed bridges.
Funding: The National Science Foundation of China (Grant No. 42177164), the Distinguished Youth Science Foundation of Hunan Province of China (Grant No. 2022JJ10073), and the Innovation-Driven Project of Central South University (Grant No. 2020CX040).
Abstract: The stability of underground entry-type excavations directly affects the working environment and the safety of staff. Empirical critical span graphs and traditional statistical learning methods cannot meet the high accuracy requirements for stability assessment of entry-type excavations. Therefore, this study proposes a new prediction method based on machine learning to scientifically adjust the critical span graph. Accordingly, the particle swarm optimization (PSO) algorithm is used to optimize the core parameters of the gradient boosting decision tree (GBDT), abbreviated as PSO-GBDT. Moreover, eight other classifiers, including GBDT, k-nearest neighbors (KNN), two kinds of support vector machines (SVM), Gaussian naive Bayes (GNB), logistic regression (LR) and linear discriminant analysis (LDA), are also applied for comparison with the proposed model. The findings reveal that, compared with the other eight models, the prediction performance of PSO-GBDT is the most reliable, with a classification accuracy of up to 0.93. Therefore, this model has great potential to provide a more scientific and accurate choice for the stability prediction of underground excavations. In addition, each classification model is used to predict the stability category of several grid points divided by the critical span graph, and the updated critical span graph of each model is discussed in combination with previous studies. The results show that the PSO-GBDT model is scientific, accurate and efficient in updating the critical span graph, and its output decision boundary has strict theoretical support, which can help mine operators make favorable economic decisions.
Funding: Projects (61573380, 61303185) supported by the National Natural Science Foundation of China; Projects (2016M592450, 2017M612585) supported by the China Postdoctoral Science Foundation; Projects (2016JJ4119, 2017JJ3416) supported by the Hunan Provincial Natural Science Foundation of China.
Abstract: It is easy for teenagers to view pornographic pictures on social networks. Many researchers have studied the detection of real pornographic pictures, but there are few studies on artificial ones. In this work, we studied how to detect artificial pornographic pictures, especially on social networks. The whole detection process can be divided into two stages: feature selection and picture detection. In the feature selection stage, seven types of features that favour picture detection were selected. The picture detection stage comprised three steps. 1) To alleviate the imbalance between the numbers of artificial pornographic pictures and normal ones, the training dataset of artificial pornographic pictures was expanded, so that the features extracted from the training dataset were expanded as well. 2) To reduce the time of feature extraction, a fast method that extracts features from a proportionally scaled picture rather than the original one was proposed. 3) Three tree models were compared, and a gradient boosting decision tree (GBDT) was selected for the final picture detection. Three sets of experimental results show that the proposed method can achieve better recognition precision and drastically reduce the time cost.
Abstract: To improve the accuracy of target intent recognition, a recognition method based on the XGBoost (eXtreme Gradient Boosting) decision tree is proposed. This paper uses relevant data and a Python program to calculate the probability of tactical intention. The sequence intention probability is then obtained by applying the Dempster-Shafer rule of combination. To verify the accuracy of the recognition results, we compare the experimental results of this paper with results in the literature. The experiment shows that the probability of tactical intention recognition is improved by this method, so the method is feasible.
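The Dempster-Shafer rule of combination mentioned above fuses two mass functions by multiplying the masses of every pair of focal elements, assigning each product to the intersection of the pair, and renormalizing by one minus the mass that fell on empty intersections (the conflict). The sketch below shows the rule on a two-hypothesis frame; the hypothesis names `attack` and `recon` and all mass values are invented for illustration and are not taken from the paper.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule for two mass functions keyed by frozenset focal elements."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


A = frozenset({"attack"})
R = frozenset({"recon"})
AR = A | R  # ignorance: either hypothesis

# Hypothetical intent evidence from two consecutive observation steps.
m_step1 = {A: 0.6, R: 0.1, AR: 0.3}
m_step2 = {A: 0.5, R: 0.2, AR: 0.3}
fused = combine(m_step1, m_step2)
print({tuple(sorted(k)): round(v, 3) for k, v in fused.items()})
```

Applying `combine` repeatedly over a sequence of per-step mass functions yields a sequence-level belief, which is one plausible reading of how a sequence intention probability could be obtained.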
Abstract: Epilepsy is a very common neurological disorder worldwide that can affect a person's quality of life at any age. People with epilepsy typically have recurrent seizures that can lead to injury or, in some cases, even death. Curing epilepsy requires risky surgery. Otherwise, the patient may be subjected to a long drug treatment combined with lifestyle advice, without guarantee of total recovery. However, regardless of the type of treatment performed, late treatment necessarily creates psychological instability in the patient. It is therefore important to diagnose the disease as early as possible so that the patient does not suffer its consequences on their mental health. This study therefore aims to propose a model for detecting epilepsy as early as possible, especially in newborns, using data from the electroencephalogram signals of 10 newborns. The model, developed using the extra trees classifier technique, can predict epilepsy in infants with an accuracy of around 99.4%.
Funding: Supported by the Key Innovation Team Program of the Innovation Talents Promotion Plan by MOST of China (No. 2016RA4059), the Natural Science Foundation Committee Program of China (No. 51778474), and the Science and Technology Project of Yunnan Provincial Transportation Department (No. 25 of 2018).
Abstract: This paper presents a hybrid ensemble classifier combining the synthetic minority oversampling technique (SMOTE), the random search (RS) hyper-parameter optimization algorithm and the gradient boosting tree (GBT) to achieve efficient and accurate rock trace identification. A thirteen-dimensional database consisting of basic, vector, and discontinuity features is established from image samples. All data points are classified as either "trace" or "non-trace" to divide the ultimate results into candidate trace samples. It is found that SMOTE can effectively improve classification performance, with a recommended optimized imbalance ratio of 1:5 to 1:4. Then, sixteen classifiers generated from four basic machine learning (ML) models are applied for performance comparison. The results reveal that the proposed RS-SMOTE-GBT classifier outperforms the other fifteen hybrid ML algorithms for both trace and non-trace classifications. Finally, feature importance, generalization ability and classification error are discussed for the proposed classifier. The experimental results indicate that the more critical features affecting trace classification come primarily from the discontinuity features. In addition, cleaning up the sedimentary pumice and reducing the area of fractured rock help improve the overall classification performance. The proposed method provides a new alternative approach for the identification of 3D rock traces.
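SMOTE, named above, creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The following is a simplified sketch of that interpolation step, used here to raise a hypothetical "trace" class to a 1:4 ratio against the "non-trace" class; the function name `smote`, the two-dimensional toy points and the class counts are assumptions, and the paper's thirteen-dimensional features are not reproduced.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating toward a random k-nearest minority neighbour."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Other minority points, ordered by squared Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical toy data: 3 "trace" samples against 20 "non-trace" samples.
trace = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2)]
n_non_trace = 20
target = math.ceil(n_non_trace / 4)  # minority count needed for a 1:4 ratio
extra = smote(trace, target - len(trace))
balanced = trace + extra
print(len(balanced), extra)
```

Because each synthetic point is a convex combination of two real minority points, it stays inside the region already occupied by the minority class, which is the main difference from simply duplicating samples.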
Funding: Financially supported by the Fundamental Research Funds for the Central Universities (2020ZDPY0221).
Abstract: In loose and fractured coal seams with particularly low uniaxial compressive strength (UCS), driving a roadway is extremely difficult because roof falling and wall spalling occur frequently. To address this issue, the jet grouting (JG) technique (high-pressure grout mixed with coal particles) was first introduced in this study to improve the self-supporting ability of the coal mass. To evaluate the strength of the jet-grouted coal-grout composite (JG composite), the UCS evolution patterns were analyzed by preparing 405 specimens combining the influential variables of grout type, curing time, and coal to grout (C/G) ratio. Furthermore, the relationships between UCS and these influencing variables were modeled using ensemble learning methods, i.e., gradient boosted regression tree (GBRT) and random forest (RF), with their hyperparameters tuned by particle swarm optimization (PSO). The results showed that the chemical grout composite has higher short-term strength, while the cement grout composite can achieve more stable strength in the long term. The PSO-GBRT and PSO-RF models can both achieve high prediction accuracy. The variable importance analysis also demonstrated that the grout type and curing time should be considered carefully. This study provides a robust intelligent model for predicting the UCS of JG composites, which supports JG design in the field.
Funding: This work was supported in part by the Shanghai Agriculture Applied Technology Development Program, China (Grant No. G 2020-02-08-00-07-F01480), the Shanghai Municipal Science and Technology Commission Innovation Action Plan (Grant No. 17391900900), and the National Natural Science Foundation of China (Grant No. 61573258).
Abstract: y consumption efficiency and to increase the crop yield. With the increase of agricultural data generated by the Internet of Things (IoT), more feasible models are necessary to make full use of such information. In this research, a Gradient Boost Decision Tree (GBDT) model based on the newly developed Light Gradient Boosting Machine algorithm (LightGBM or LGBM) was proposed to model the internal temperature of a greenhouse. Features including climate variables, control variables and additional temporal information collected over five years were used to construct a suitable dataset to train and validate the LGBM model. An adaptive cross-validation method was developed as a novelty to improve the performance and self-adaptive ability of the LGBM model. To compare predictive accuracy, a Back-Propagation (BP) Neural Network model and a Recurrent Neural Network (RNN) model were built under the same process. Two other GBDT algorithms, Extreme Gradient Boosting (XGBoost) and Stochastic Gradient Boosting (SGB), were also introduced to compare predictive accuracy with the LGBM model. Results suggest that the LGBM has the best fitting ability for the temperature curves, with an RMSE of 0.645℃, as well as the fastest training speed among all algorithms, training 60 times faster than the two neural network algorithms. The LGBM has strong application potential for both greenhouse environment prediction and real-time predictive control.
Abstract: A recommender system is a tool that suggests items to users based on the extensive history of user feedback. Although it is an emerging research area of interest to both academia and industry, it suffers from sparsity, scalability, and cold start problems. This paper addresses the sparsity and scalability problems of model-based collaborative recommender systems using an ensemble learning approach and an enhanced clustering algorithm for movie recommendations. An effective movie recommendation system is proposed using the Classification and Regression Tree (CART) algorithm, an enhanced Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) algorithm and a truncation method. A new hyperparameter tuning step is added to the BIRCH algorithm to enhance the cluster formation process, and the resulting algorithm is named enhanced BIRCH. The proposed model yields quality movie recommendations for new users using gradient boost classification with broad coverage. The proposed model is tested on the Movielens dataset, and its performance is evaluated by means of Mean Absolute Error (MAE), precision, recall and f-measure. The experimental results showed the superiority of the proposed model in movie recommendation compared to the existing models. The proposed model obtained MAE values of 0.52 and 0.57 on the Movielens 100k and 1M datasets, respectively. Further, the proposed model obtained a precision of 0.83, a recall of 0.86 and an f-measure of 0.86 on the Movielens 100k dataset, which compare favorably with the existing models in movie recommendation.
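The evaluation measures named above have standard definitions: MAE averages the absolute rating errors, precision is the share of recommended items that are relevant, recall is the share of relevant items that were recommended, and the f-measure is their harmonic mean. The sketch below computes them on invented toy data; the function names, ratings and item identifiers are illustrative assumptions and do not reproduce the Movielens experiments.

```python
def mae(actual, predicted):
    """Mean absolute error between actual and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def precision_recall_f1(relevant, recommended):
    """Set-based precision, recall and F1 for one user's recommendation list."""
    relevant, recommended = set(relevant), set(recommended)
    hits = len(relevant & recommended)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2.0 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical ratings on a 1-5 scale.
actual_ratings = [4.0, 3.0, 5.0, 2.0]
predicted_ratings = [3.5, 3.0, 4.0, 3.0]
error = mae(actual_ratings, predicted_ratings)

# Hypothetical movie identifiers.
liked = ["m1", "m2", "m3", "m4"]
suggested = ["m1", "m2", "m9"]
precision, recall, f1 = precision_recall_f1(liked, suggested)
print(error, precision, recall, f1)
```

In a full evaluation these per-user values would be averaged over all test users; the averaging scheme is left out here because the abstract does not specify it.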
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under grant number (RGP 2/42/43), and to the Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R135), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: The agricultural sector's day-to-day operations, such as irrigation and sowing, are impacted by the weather. Weather therefore plays a key role in all regular human activities. Weather forecasting must be accurate and precise so that we can plan our activities and safeguard ourselves and our property from disasters. Rainfall, wind speed, humidity, wind direction, cloud, temperature, and other weather forecasting variables are used in this work for weather prediction. Many research works have been conducted on weather forecasting. The drawbacks of existing approaches are that they are less effective, inaccurate, and time-consuming. To overcome these issues, this paper proposes an enhanced and reliable weather forecasting technique, which also supports weather forecasting in remote areas. Weather data analysis and machine learning techniques, such as Gradient Boosting Decision Tree, Random Forest, Naive Bayes Bernoulli, and the KNN algorithm, are deployed to anticipate weather conditions. A comparative analysis of the results aids in determining the number of ensemble methods that may be utilized to improve the accuracy of weather forecasting. The aim of this study is to demonstrate the technique's ability to produce weather forecasts as quickly as possible. Experimental evaluation shows that our ensemble technique achieves 95% prediction accuracy. Prediction takes less than 10 s for 1000 nodes and less than 40 s for 5000 nodes.
Funding: Supported by the Educational Commission of Liaoning Province of China (No. LQGD2017027).
Abstract: To address the personalized movie recommendation problem, a recommendation algorithm integrating manifold learning and ensemble learning is studied. In this work, manifold learning is used to reduce the dimension of the data so that both the time and space complexities of the model are reduced. Meanwhile, the gradient boosting decision tree (GBDT) is used to train the target user profile prediction model. Based on the recommendation results, a Bayesian optimization algorithm is applied to optimize the recommendation model, which can effectively improve the prediction accuracy. The experimental results show that the proposed algorithm can improve the accuracy of movie recommendation.
Abstract: Churn prediction is a common task for machine learning applications in business. In this paper, this task is adapted to address the low completion rate of massive open online courses (only 5% of all students finish their course). The approach is demonstrated on the course "Methods and algorithms of the graph theory", held on the national online education platform in Russia. This paper covers all the steps needed to build an intelligent system that predicts which students are active during the course but not likely to finish it. The first part consists of constructing the right sample for prediction, EDA, and choosing the most appropriate week of the course on which to make predictions. The second part is about choosing the right metric and building models. An approach using ensembles such as stacking is also proposed to increase the accuracy of predictions. As a result, a general approach to building a churn prediction model for an online course is reviewed. This approach can be used to make the process of online education adaptive and intelligent for an individual student.
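Abstract: Traffic lights mark the points where pedestrians and vehicles meet on the road; they strongly affect road structure and traffic operation and act as important hubs in the urban road network. To address the insufficient accuracy and incomplete area coverage of current traffic light position detection methods, a traffic light position detection method based on trajectory data is proposed. The method is built on a four-level clustering-merging-classification-merging model. First, implicit vehicle driving features are extracted from the cleaned trajectory data, and density-based spatial clustering of applications with noise (DBSCAN) is used to obtain two types of cluster centers, turning and stopping; these two types of cluster centers are merged to obtain candidate traffic light positions. Next, the traffic flow driving features of each area are extracted from the trajectory points within a certain range of the candidate position, and the gradient boosting decision tree (GBDT) algorithm is used for classification. Finally, the positive samples among the candidate positions are fused to detect the traffic light positions. Experiments on floating-car GPS trajectory data from Chengdu give a detection F1 score of 0.947, better than conventional machine learning methods. The experimental results show that, based on GPS trajectory data, the proposed four-layer model can effectively detect traffic light positions and can be used to rapidly acquire and update traffic light position information over large urban areas.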
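DBSCAN, the clustering step named in this abstract, groups points that have at least a minimum number of neighbours within a given radius and labels isolated points as noise. The sketch below is a compact plain-Python version on two-dimensional points; the coordinates, `eps` and `min_pts` values are invented for illustration and do not correspond to the Chengdu trajectory data or the parameters used by the authors.

```python
def dbscan(points, eps, min_pts):
    """Return one cluster label per 2-D point; -1 marks noise."""
    def region(i):
        # Indices of all points within eps of point i (including i itself).
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps ** 2]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region(i)
        if len(neighbours) < min_pts:
            labels[i] = -1  # provisionally noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nj = region(j)
            if len(nj) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(nj)
    return labels


# Hypothetical stop locations: two dense groups and one stray point.
stops = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(stops, eps=1.5, min_pts=3)
print(labels)
```

Cluster centers could then be taken as the mean coordinate of each labeled group, which is one simple way to turn clusters into candidate positions; how the authors derive and merge their centers is not stated in the abstract.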