The accurate identification of smart meter (SM) fault types is crucial for enhancing the efficiency of operation and maintenance (O&M) and the reliability of power collection systems. However, the intelligent classification of SM fault types faces significant challenges owing to the complexity of features and the imbalance between fault categories. To address these issues, this study presents a fault diagnosis method for SMs incorporating three distinct modules. The first module employs a combination of standardization, data imputation, and feature extraction to enhance the data quality, thereby facilitating improved training and learning by the classifiers. To enhance the classification performance, the data imputation method considers feature correlation measurement and sequential imputation, and the feature extractor utilizes the discriminative enhanced sparse autoencoder. To tackle the interclass imbalance of data with discrete and continuous features, the second module introduces an assisted classifier generative adversarial network, which includes a discrete feature generation module. Finally, a novel Stacking ensemble classifier for SM fault diagnosis is developed. In contrast to previous studies, we construct a two-layer heuristic optimization framework to address the synchronous dynamic optimization problem of the combinations and hyperparameters of the Stacking ensemble classifier, enabling better handling of complex classification tasks using SM data. The proposed fault diagnosis method for SM via two-layer stacking ensemble optimization and data augmentation is trained and validated using SM fault data collected from 2010 to 2018 in Zhejiang Province, China. Experimental results demonstrate the effectiveness of the proposed method in improving the accuracy of SM fault diagnosis, particularly for minority classes.
Existing web-based security applications have failed in many situations due to the great intelligence of attackers. Among web applications, Cross-Site Scripting (XSS) is one of the dangerous assaults experienced while modifying an organization's or user's information. To avoid these security challenges, this article proposes a novel, all-encompassing combination of machine learning (NB, SVM, k-NN) and deep learning (RNN, CNN, LSTM) frameworks for detecting and defending against XSS attacks with high accuracy and efficiency. Based on the representation, a novel idea for merging stacking ensemble with web applications, termed “hybrid stacking”, is proposed. In order to implement the aforementioned methods, four distinct datasets, each of which contains both safe and unsafe content, are considered. The hybrid detection method can adaptively identify the attacks from the URL, and the defense mechanism inherits the advantages of URL encoding with dictionary-based mapping to improve prediction accuracy, accelerate the training process, and effectively remove the unsafe JScript/JavaScript keywords from the URL. The simulation results show that the proposed hybrid model is more efficient than the existing detection methods. It produces more than 99.5% accurate XSS attack classification results (accuracy, precision, recall, f1_score, and Receiver Operating Characteristic (ROC)) and is highly resistant to XSS attacks. In order to ensure the security of the server's information, the proposed hybrid approach is demonstrated in a real-time environment.
Numerical simulation of concrete-faced rockfill dams (CFRDs) considering the spatial variability of rockfill has become a popular research topic in recent years. In order to determine uncertain rockfill properties efficiently and reliably, this study developed an uncertainty inversion analysis method for rockfill material parameters using the stacking ensemble strategy and Jaya optimizer. The comprehensive implementation process of the proposed model was described with an illustrative CFRD example. First, the surrogate model method using the stacking ensemble algorithm was used to conduct the Monte Carlo stochastic finite element calculations with reduced computational cost and improved accuracy. Afterwards, the Jaya algorithm was used to inversely calculate the combination of the coefficient of variation of rockfill material parameters. This optimizer obtained higher accuracy and more significant uncertainty reduction than traditional optimizers. Overall, the developed model effectively identified the random parameters of rockfill materials. This study provided scientific references for uncertainty analysis of CFRDs. In addition, the proposed method can be applied to other similar engineering structures.
As a result of the increased number of COVID-19 cases, Ensemble Machine Learning (EML) would be an effective tool for combatting this pandemic outbreak. An ensemble of classifiers can improve the performance of single machine learning (ML) classifiers, especially stacking-based ensemble learning. Stacking utilizes heterogeneous base learners trained in parallel and combines their predictions using a meta-model to determine the final prediction results. However, building an ensemble often causes the model performance to decrease when a growing number of learners is not properly selected. Therefore, the goal of this paper is to develop and evaluate a generic, data-independent predictive method using stacking-based ensemble learning (GA-Stacking) optimized by a Genetic Algorithm (GA) for outbreak prediction and health decision-aided processes. GA-Stacking utilizes five well-known classifiers, including Decision Tree (DT), Random Forest (RF), Ridge regression, Least Absolute Shrinkage and Selection Operator (LASSO), and eXtreme Gradient Boosting (XGBoost), at its first level. It also introduces GA to identify comparisons to forecast the number, combination, and trust of these base classifiers based on the Mean Squared Error (MSE) as a fitness function. At the second level of the stacked ensemble model, a Linear Regression (LR) classifier is used to produce the final prediction. The performance of the model was evaluated using a publicly available dataset from the Center for Systems Science and Engineering, Johns Hopkins University, which consisted of 10,722 data samples. The experimental results indicated that the GA-Stacking model achieved outstanding performance with an overall accuracy of 99.99% for the three selected countries. Furthermore, the proposed model achieved good performance when compared with existing bagging-based approaches. The proposed model can be used to predict the pandemic outbreak correctly and may be applied as a generic data-independent model to predict the epidemic trend for other countries when comparing preventive and control measures. (CMC, 2023, vol. 74, no. 2.)
Recently, machine learning-based technologies have been developed to automate the classification of wafer map defect patterns during semiconductor manufacturing. The existing approaches used in the wafer map pattern classification include directly learning the image through a convolution neural network and applying the ensemble method after extracting image features. This study aims to classify wafer map defects more effectively and derive robust algorithms even for datasets with insufficient defect patterns. First, the number of defects during the actual process may be limited. Therefore, insufficient data are generated using convolutional auto-encoder (CAE), and the expanded data are verified using the evaluation technique of structural similarity index measure (SSIM). After extracting handcrafted features, a boosted stacking ensemble model that integrates the four base-level classifiers with the extreme gradient boosting classifier as a meta-level classifier is designed and built for training the model based on the expanded data for final prediction. Since the proposed algorithm shows better performance than those of existing ensemble classifiers even for insufficient defect patterns, the results of this study will contribute to improving the product quality and yield of the actual semiconductor manufacturing process.
BACKGROUND There is a lack of literature discussing the utilization of the stacking ensemble algorithm for predicting depression in patients with heart failure (HF). AIM To create a stacking model for predicting depression in patients with HF. METHODS This study analyzed data on 1084 HF patients from the National Health and Nutrition Examination Survey database spanning from 2005 to 2018. Through univariate analysis and the use of an artificial neural network algorithm, predictors significantly linked to depression were identified. These predictors were utilized to create a stacking model employing tree-based learners. The performances of both the individual models and the stacking model were assessed by using the test dataset. Furthermore, the SHapley additive exPlanations (SHAP) model was applied to interpret the stacking model. RESULTS The models included five predictors. Among these models, the stacking model demonstrated the highest performance, achieving an area under the curve of 0.77 (95% CI: 0.71-0.84), a sensitivity of 0.71, and a specificity of 0.68. The calibration curve supported the reliability of the models, and decision curve analysis confirmed their clinical value. The SHAP plot demonstrated that age had the most significant impact on the stacking model's output. CONCLUSION The stacking model demonstrated strong predictive performance. Clinicians can utilize this model to identify high-risk depression patients with HF, thus enabling early provision of psychological interventions.
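The sensitivity and specificity reported in the RESULTS above are ratios over a binary confusion matrix. A minimal Python sketch of the arithmetic follows; the function name and the counts are hypothetical and chosen only for illustration, they are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of actual positives that are detected
    specificity = tn / (tn + fp)  # share of actual negatives that are cleared
    return sensitivity, specificity


# Hypothetical counts for illustration only; not data from the study.
sens, spec = sensitivity_specificity(tp=71, fn=29, tn=68, fp=32)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A model can score well on one ratio and poorly on the other, which is why the abstract reports both alongside the area under the curve.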
Slope failures lead to catastrophic consequences in numerous countries, and thus the stability assessment of slopes is of high interest in geotechnical and geological engineering research. A hybrid stacking ensemble approach is proposed in this study for enhancing the prediction of slope stability. In the hybrid stacking ensemble approach, we used an artificial bee colony (ABC) algorithm to find the best combination of base classifiers (level 0) and determined a suitable meta-classifier (level 1) from a pool of 11 individual optimized machine learning (OML) algorithms. Finite element analysis (FEA) was conducted in order to form the synthetic database for the training stage (150 cases) of the proposed model, while 107 real field slope cases were used for the testing stage. The results of the hybrid stacking ensemble approach were then compared with those obtained by the 11 individual OML methods using the confusion matrix, F1-score, and area under the curve, i.e. AUC-score. The comparisons showed that a significant improvement in the prediction ability of slope stability has been achieved by the hybrid stacking ensemble (AUC = 90.4%), which is 7% higher than the best of the 11 individual OML methods (AUC = 82.9%). Then, a further comparison was undertaken between the hybrid stacking ensemble method and a basic ensemble classifier on slope stability prediction. The results showed a prominent performance of the hybrid stacking ensemble method over the basic ensemble method. Finally, the importance of the variables for slope stability was studied using the linear vector quantization (LVQ) method.
Real-time prediction of the rock mass class in front of the tunnel face is essential for the adaptive adjustment of tunnel boring machines (TBMs). During the TBM tunnelling process, a large number of operation data are generated, reflecting the interaction between the TBM system and surrounding rock, and these data can be used to evaluate the rock mass quality. This study proposed a stacking ensemble classifier for the real-time prediction of the rock mass classification using TBM operation data. Based on the Songhua River water conveyance project, a total of 7538 TBM tunnelling cycles and the corresponding rock mass classes are obtained after data preprocessing. Then, through the tree-based feature selection method, 10 key TBM operation parameters are selected, and the mean values of the 10 selected features in the stable phase after removing outliers are calculated as the inputs of classifiers. The preprocessed data are randomly divided into the training set (90%) and test set (10%) using simple random sampling. Besides the stacking ensemble classifier, seven individual classifiers are established for comparison. These classifiers include support vector machine (SVM), k-nearest neighbors (KNN), random forest (RF), gradient boosting decision tree (GBDT), decision tree (DT), logistic regression (LR) and multilayer perceptron (MLP), where the hyper-parameters of each classifier are optimised using the grid search method. The prediction results show that the stacking ensemble classifier has a better performance than individual classifiers, and it shows a more powerful learning and generalisation ability for small and imbalanced samples. Additionally, a relatively balanced training set is obtained by the synthetic minority oversampling technique (SMOTE), and the influence of sample imbalance on the prediction performance is discussed.
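The oversampling step mentioned above can be illustrated with a simplified sketch of the interpolation idea behind SMOTE: each synthetic sample lies on the segment between a minority sample and one of its nearest minority neighbours. The function `smote_like`, its parameters, and the sample points are hypothetical; this is not the study's implementation, and a maintained library would normally be used in practice.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (simplified SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical two-feature minority samples, for illustration only.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_like(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region the minority class already occupies.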
The exponential growth of Internet and network usage has necessitated heightened security measures to protect against data and network breaches. Intrusions, executed through network packets, pose a significant challenge for firewalls to detect and prevent due to the similarity between legitimate and intrusion traffic. The vast network traffic volume also complicates most network monitoring systems and algorithms. Several intrusion detection methods have been proposed, with machine learning techniques regarded as promising for dealing with these incidents. This study presents an intrusion detection system (IDS) based on stacking ensemble learning with Random Forest, Decision Tree, and k-Nearest-Neighbors base models. The proposed system employs pre-processing techniques to enhance classification efficiency and integrates seven machine learning algorithms. The stacking ensemble technique increases performance by incorporating three base models (Random Forest, Decision Tree, and k-Nearest-Neighbors) and a meta-model represented by the Logistic Regression algorithm. Evaluated using the UNSW-NB15 dataset, the proposed IDS gained an accuracy of 96.16% in the training phase and 97.95% in the testing phase, with precision of 97.78% and 98.40% for training and testing, respectively. The obtained results demonstrate improvements in other measurement criteria.
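The base-model and meta-model arrangement described above maps directly onto scikit-learn's `StackingClassifier`. The following is a minimal sketch, not the authors' pipeline: it assumes scikit-learn is installed, uses synthetic data in place of UNSW-NB15, and all hyperparameters are illustrative guesses.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; the study itself is evaluated on UNSW-NB15.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ],
    final_estimator=LogisticRegression(),  # meta-model over base predictions
    cv=5,  # meta-model is trained on out-of-fold base predictions
)
stack.fit(X_train, y_train)
accuracy = stack.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```

Training the meta-model on out-of-fold predictions (the `cv` argument) keeps it from simply trusting whichever base model overfits the training data most.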
A common difficulty in building prediction models with real-world environmental datasets is the skewed distribution of classes. There are significantly more samples for day-to-day classes, while rare events such as polluted classes are uncommon. Consequently, the limited availability of minority outcomes lowers the classifier’s overall reliability. This study assesses the capability of machine learning (ML) algorithms in tackling imbalanced water quality data based on the metrics of precision, recall, and F1 score. It aims to counter accuracy figures that are misleadingly biased towards the majority class. Hence, the performance of 10 ML algorithms is compared. The classifiers included are AdaBoost, Support Vector Machine, Linear Discriminant Analysis, k-Nearest Neighbors, Naive Bayes, Decision Trees, Random Forest, Extra Trees, Bagging, and the Multilayer Perceptron. This study also uses the Easy Ensemble Classifier, Balanced Bagging, and RUSBoost algorithms to evaluate multi-class imbalanced learning methods. The comparison results revealed that a high-accuracy machine learning model is not always good in recall and sensitivity. This paper’s stacked ensemble deep learning (SE-DL) generalization model effectively classifies the water quality index (WQI) based on 23 input variables. The proposed algorithm achieved a remarkable average of 95.69%, 94.96%, 92.92%, and 93.88% for accuracy, precision, recall, and F1 score, respectively. In addition, the proposed model is compared against two state-of-the-art classifiers, XGBoost (eXtreme Gradient Boosting) and Light Gradient Boosting Machine, where performance metrics of balanced accuracy and G-mean are included. The experiments showed that XGBoost achieves a higher balanced accuracy and G-mean. However, the SE-DL model has a better and more balanced performance in the F1 score. The SE-DL model aligns with the goal of this study to ensure the balance between accuracy and completeness for each water quality class. The proposed algorithm is also capable of higher efficiency at a lower computational time than the standard Synthetic Minority Oversampling Technique (SMOTE) approach to imbalanced datasets.
Flood susceptibility modeling is crucial for rapid flood forecasting, disaster reduction strategies, evacuation planning, and decision-making. Machine learning (ML) models have proven to be effective tools for assessing flood susceptibility. However, most previous studies have focused on individual models or comparative performance, underscoring the unique strengths and weaknesses of each model. In this study, we propose a stacking ensemble learning algorithm that harnesses the strengths of a diverse range of machine learning models. The findings reveal the following: (1) The stacking ensemble learning, using the RF-XGBCB-LR model, significantly enhances flood susceptibility simulation. (2) In addition to rainfall, key flood drivers in the study area include NDVI and impervious surfaces. Over 40% of the study area, primarily in the northeast and southeast, exhibits high flood susceptibility, with higher risks for populations compared to cropland. (3) In the northeast of the study area, heavy precipitation, low terrain, and NDVI values are key indicators contributing to high flood susceptibility, while long-duration precipitation, mountainous topography, and upper reach vegetation are the main drivers in the southeast. This study underscores the effectiveness of ML, particularly ensemble learning, in flood modeling. It identifies vulnerable areas and contributes to improved flood risk management.
The stability of underground entry-type excavations (UETEs) is of paramount importance for ensuring the safety of mining operations. As more engineering cases are accumulated, machine learning (ML) has demonstrated great potential for the stability evaluation of UETEs. In this study, a hybrid stacking ensemble method aggregating support vector machine (SVM), k-nearest neighbor (KNN), decision tree (DT), random forest (RF), multilayer perceptron neural network (MLPNN) and extreme gradient boosting (XGBoost) algorithms was proposed to assess the stability of UETEs. Firstly, a total of 399 historical cases with two indicators were collected from seven mines. Subsequently, to pursue better evaluation performance, the hyperparameters of the base learners (SVM, KNN, DT, RF, MLPNN and XGBoost) and the meta learner (MLPNN) were tuned by combining a five-fold cross validation (CV) and simulated annealing (SA) approach. Based on the optimal hyperparameter configuration, the stacking ensemble models were constructed using the training set (75% of the data). Finally, the performance of the proposed approach was evaluated by two global metrics (accuracy and Cohen’s Kappa) and three within-class metrics (macro average of the precision, recall and F1-score) on the test set (25% of the data). In addition, the evaluation results were compared with six base learners optimized by SA. The hybrid stacking ensemble algorithm achieved better comprehensive performance, with an accuracy, Kappa coefficient, and macro averages of precision, recall and F1-score of 0.92, 0.851, 0.885, 0.88 and 0.883, respectively. The rock mass rating (RMR) had the most important influence on evaluation results. Moreover, the critical span graph (CSG) was updated based on the proposed model, representing a significant improvement compared with previous studies. This study can provide valuable guidance for stability analysis and risk management of UETEs. However, it is necessary to consider more indicators and collect a more extensive and balanced dataset to validate the model in future.
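Two of the metrics named above, Cohen's Kappa and the macro-averaged F1-score, can be computed directly from label lists. The sketch below is a plain-Python illustration; the function names and the label lists are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_chance = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for illustration only.
y_true = ["stable", "stable", "unstable", "unstable", "stable", "unstable"]
y_pred = ["stable", "unstable", "unstable", "unstable", "stable", "stable"]
print(cohen_kappa(y_true, y_pred), macro_f1(y_true, y_pred))
```

Kappa discounts agreement that would occur by chance, and macro averaging weights every class equally, so both are less flattering than raw accuracy on imbalanced data.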
An anomaly-based intrusion detection system (A-IDS) is a critical component of a modern computing infrastructure, since it can discover new types of attacks. It commonly utilizes several machine learning (ML) algorithms for detecting and classifying network traffic. To date, many algorithms have been proposed to improve the detection performance of A-IDS, using either individual or ensemble learners. In particular, ensemble learners have shown remarkable performance over individual learners in many applications, including in the cybersecurity domain. However, most existing works still suffer from unsatisfactory results due to improper ensemble design. The aim of this study is to emphasize the effectiveness of a stacking ensemble-based model for A-IDS, where deep learning (e.g., a deep neural network [DNN]) is used as the base learner model. The effectiveness of the proposed model and the base DNN model are benchmarked empirically in terms of several performance metrics, i.e., Matthews correlation coefficient, accuracy, and false alarm rate. The results indicate that the proposed model is superior to the base DNN model as well as other existing ML algorithms found in the literature.
Malicious traffic detection over the internet is one of the challenging areas for researchers seeking to protect network infrastructures from malicious activity. Several shortcomings of a network system can be leveraged by an attacker to get unauthorized access through malicious traffic. Safeguarding against such attacks requires an efficient automatic system that can detect malicious traffic in a timely manner and avoid system damage. Currently, many automated systems can detect malicious activity; however, their efficacy and accuracy need further improvement to detect malicious traffic from multi-domain systems. The present study focuses on the detection of malicious traffic with high accuracy using machine learning techniques. The proposed approach used two datasets, UNSW-NB15 and IoTID20, which contain local network traffic and IoT-based traffic, respectively. Both datasets were combined to increase the capability of the proposed approach in detecting malicious traffic from local and IoT networks with high accuracy. Horizontally merging both datasets requires an equal number of features, which was achieved by reducing the feature count to 30 for each dataset by leveraging principal component analysis (PCA). The proposed model incorporates the stacked ensemble model extra boosting forest (EBF), which is a combination of tree-based models such as extra tree classifier, gradient boosting classifier, and random forest using a stacked ensemble approach. Empirical results show that EBF performed significantly better and achieved the highest accuracy scores of 0.985 and 0.984 on the multi-domain dataset for two and four classes, respectively.
COVID-19 is a growing problem worldwide with a high mortality rate. As a result, the World Health Organization (WHO) declared it a pandemic. In order to limit the spread of the disease, a fast and accurate diagnosis is required. A reverse transcription polymerase chain reaction (RT-PCR) test is often used to detect the disease. However, since this test is time-consuming, a chest computed tomography (CT) or plain chest X-ray (CXR) is sometimes indicated. The value of automated diagnosis is that it saves time and money by minimizing human effort. Three significant contributions are made by our research. First, a basic fine-tuning methodology is used to test the behavior and efficiency of a variety of vision models, ranging from Inception to Neural Architecture Search (NAS) networks. Second, by plotting class activation maps (CAMs) for individual networks and assessing classification efficiency with AUC-ROC curves, the behavior of these models is visually analyzed. Finally, stacked ensemble techniques were used to provide greater generalization by combining fine-tuned models with six ensemble neural networks. Using stacked ensembles, the generalization of the models improved. Furthermore, the ensemble model created by combining all of the fine-tuned networks obtained a state-of-the-art COVID-19 detection accuracy of 99.17%. The precision and recall rates were 99.99% and 89.79%, respectively, highlighting the robustness of stacked ensembles. The proposed ensemble approach performed well in the classification of COVID-19 lesions on CXR according to the experimental results.
Rockburst is a common geological disaster in deep tunnel engineering. It causes great harm and occurs at random locations and times. These characteristics seriously affect tunnel construction and threaten the physical and mental health and safety of workers. Therefore, it is of great significance to study the tendency of rockburst in the early stage of tunnel survey, design and construction. At present, there is no unified method or agreed set of parameters for rockburst prediction. In view of the large differences between rockburst criteria and the imbalance of rockburst database categories, this paper presents a two-step rockburst prediction method based on multiple factors and the stacking ensemble algorithm. Considering the influence of rock physical and mechanical parameters, tunnel face conditions and excavation disturbance, multiple rockburst criteria are predicted by integrating multiple machine learning algorithms. A combined prediction model of rockburst criteria is established, and the results of each rockburst criterion index are weighted and combined, with the weight updated using the field rockburst record. The dynamic weight is combined with the cloud model to comprehensively evaluate the regional rockburst risk. Field results from applying the model in the Grand Canyon tunnel show that the rockburst prediction method proposed in this paper has better applicability and higher accuracy than a single rockburst criterion.
Many plant species have a startling degree of morphological similarity, making it difficult to segment and categorize them reliably. Unknown plant species can be challenging to classify and segment using deep learning. While using deep learning architectures has helped improve classification accuracy, the resulting models often lack flexibility and require a large dataset to train. For the sake of taxonomy, this research proposes a hybrid method for categorizing guava, potato, and java plum leaves. Two new approaches are used to form the hybrid model suggested here. The guava, potato, and java plum plant species have been successfully segmented using the first model, built on the MobileNetV2-UNET architecture. As a second model, we use a Plant Species Detection Stacking Ensemble Deep Learning Model (PSD-SE-DLM) to identify potatoes, java plums, and guava. The proposed models were trained using data collected in Punjab, Pakistan, consisting of images of healthy and sick leaves from guava, java plum, and potatoes. These datasets are known as PLSD and PLSSD. Accuracy levels of 99.84% and 96.38% were achieved for the suggested PSD-SE-DLM and MobileNetV2-UNET models, respectively.
Cross-Site Scripting (XSS) remains a significant threat to web application security, exploiting vulnerabilities to hijack user sessions and steal sensitive data. Traditional detection methods often fail to keep pace with the evolving sophistication of cyber threats. This paper introduces a novel hybrid ensemble learning framework that leverages a combination of advanced machine learning algorithms: Logistic Regression (LR), Support Vector Machines (SVM), eXtreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), and Deep Neural Networks (DNN). Utilizing the XSS-Attacks-2021 dataset, which comprises 460 instances across various real-world traffic-related scenarios, this framework significantly enhances XSS attack detection. Our approach, which includes rigorous feature engineering and model tuning, not only optimizes accuracy but also effectively minimizes false positives (FP) (0.13%) and false negatives (FN) (0.19%). This comprehensive methodology has been rigorously validated, achieving an unprecedented accuracy of 99.87%. The proposed system is scalable and efficient, capable of adapting to the increasing number of web applications and user demands without a decline in performance. It demonstrates exceptional real-time capabilities, with the ability to detect XSS attacks dynamically, maintaining high accuracy and low latency even under significant loads. Furthermore, despite the computational complexity introduced by the hybrid ensemble approach, strategic use of parallel processing and algorithm tuning ensures that the system remains scalable and performs robustly in real-time applications. Designed for easy integration with existing web security systems, our framework supports adaptable Application Programming Interfaces (APIs) and a modular design, facilitating seamless augmentation of current defenses. This innovation represents a significant advancement in cybersecurity, offering a scalable and effective solution for securing modern web applications against evolving threats.
Due to the availability of a huge number of electronic text documents from a variety of sources representing unstructured and semi-structured information, document classification has become an interesting area for controlling data behavior. This paper presents a document classification multimodal for categorizing textual semi-structured and unstructured documents. The multimodal implements several individual deep learning models: Deep Neural Networks (DNN), Recurrent Convolutional Neural Networks (RCNN), and Bidirectional LSTM (Bi-LSTM). A stacked-ensemble meta-model technique combines the results of the individual classifiers to produce better results than any of these models reaches individually. A series of textual preprocessing steps normalizes the input corpus, followed by text vectorization with Term Frequency-Inverse Document Frequency (TFIDF) or Continuous Bag of Words (CBOW) to convert the text into a numeric form that deep learning models can process. The proposed model is validated using a dataset collected from several spaces with a huge number of documents in every class, and the experimental results indicate that it performs effectively. For PDF document classification, the proposed model achieved accuracy up to 0.9045 and 0.959 for the TFIDF and CBOW features, respectively. For JSON document classification, it achieved accuracy up to 0.914 and 0.956 for the TFIDF and CBOW features, respectively. For XML document classification, it achieved accuracy up to 0.92 and 0.959 for the TFIDF and CBOW features, respectively.
This research work proposes a new stacked generalization ensemble model to forecast the number of incidences of conjunctivitis disease. In addition to forecasting conjunctivitis incidences, the proposed model improves performance over its individual component models. The weekly rate of acute conjunctivitis per 1000 for Hong Kong was collected from the first week of January 2010 to the last week of December 2019. Pre-processing techniques such as imputation of missing values and logarithmic transformation are applied to the data sets. A stacked generalization ensemble model based on Auto-ARIMA (Autoregressive Integrated Moving Average), NNAR (Neural Network Autoregression), ETS (Exponential Smoothing), and HW (Holt-Winters) is proposed and applied to the dataset. Predictive analysis is conducted on the collected dataset and compared across different performance measures. The results show that the RMSE (Root Mean Square Error), MAE (Mean Absolute Error), MAPE (Mean Absolute Percentage Error), and ACF1 (first-lag autocorrelation of the errors) of the proposed ensemble decrease significantly. Considering the RMSE, for instance, error values are reduced by 39.23%, 9.13%, 20.42%, and 17.13% in comparison to the Auto-ARIMA, NNAR, ETS, and HW models, respectively. This research concludes that the accuracy of disease forecasting can be significantly increased by applying the proposed stacked generalization ensemble model, as it minimizes the prediction error and hence provides better prediction trends than the Auto-ARIMA, NNAR, ETS, and HW models applied separately.
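The error measures named in this abstract, and the "reduced by N%" comparison, can be sketched as follows. This is a minimal illustration using the standard textbook definitions; the function names are hypothetical and nothing here reproduces the authors' code or data.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (undefined if any actual value is 0)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def percent_reduction(baseline_error, ensemble_error):
    """Relative error reduction of the ensemble against one baseline model, in percent."""
    return 100.0 * (baseline_error - ensemble_error) / baseline_error
```

Under this reading, a reduction of 39.23% against Auto-ARIMA would mean the ensemble's RMSE is about 0.61 times the Auto-ARIMA RMSE.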
Funding: Supported by the National Key R&D Program of China (No. 2022YFB2403800), the National Natural Science Foundation of China (No. 52277118), the Natural Science Foundation of Tianjin (No. 22JCZDJC00660), and the Open Fund in the State Key Laboratory of Alternate Electrical Power System With Renewable Energy Sources (No. LAPS23018).
Abstract: The accurate identification of smart meter (SM) fault types is crucial for enhancing the efficiency of operation and maintenance (O&M) and the reliability of power collection systems. However, the intelligent classification of SM fault types faces significant challenges owing to the complexity of features and the imbalance between fault categories. To address these issues, this study presents a fault diagnosis method for SM incorporating three distinct modules. The first module employs a combination of standardization, data imputation, and feature extraction to enhance the data quality, thereby facilitating improved training and learning by the classifiers. To enhance the classification performance, the data imputation method considers feature correlation measurement and sequential imputation, and the feature extractor utilizes the discriminative enhanced sparse autoencoder. To tackle the interclass imbalance of data with discrete and continuous features, the second module introduces an assisted classifier generative adversarial network, which includes a discrete feature generation module. Finally, a novel Stacking ensemble classifier for SM fault diagnosis is developed. In contrast to previous studies, we construct a two-layer heuristic optimization framework to address the synchronous dynamic optimization problem of the combinations and hyperparameters of the Stacking ensemble classifier, enabling better handling of complex classification tasks using SM data. The proposed fault diagnosis method for SM via two-layer stacking ensemble optimization and data augmentation is trained and validated using SM fault data collected from 2010 to 2018 in Zhejiang Province, China. Experimental results demonstrate the effectiveness of the proposed method in improving the accuracy of SM fault diagnosis, particularly for minority classes.
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST), Nos. 2015R1A3A2031159 and 2016R1A5A1008055.
Abstract: Existing web-based security applications have failed in many situations because of the sophistication of attackers. Among attacks on web applications, Cross-Site Scripting (XSS) is one of the most dangerous, allowing an organization's or user's information to be modified. To address these security challenges, this article proposes a novel, all-encompassing combination of machine learning (NB, SVM, k-NN) and deep learning (RNN, CNN, LSTM) frameworks for detecting and defending against XSS attacks with high accuracy and efficiency. Based on the representation, a novel idea for merging stacking ensemble with web applications, termed “hybrid stacking”, is proposed. To implement these methods, four distinct datasets, each of which contains both safe and unsafe content, are considered. The hybrid detection method can adaptively identify attacks from the URL, and the defense mechanism inherits the advantages of URL encoding with dictionary-based mapping to improve prediction accuracy, accelerate the training process, and effectively remove unsafe JScript/JavaScript keywords from the URL. The simulation results show that the proposed hybrid model is more efficient than existing detection methods. It produces XSS attack classification results above 99.5% (accuracy, precision, recall, f1_score, and Receiver Operating Characteristic (ROC)) and is highly resistant to XSS attacks. To ensure the security of the server's information, the proposed hybrid approach is demonstrated in a real-time environment.
Funding: Supported by the National Natural Science Foundation of China (Grants No. 51879185 and 52179139) and the Open Fund of the Hubei Key Laboratory of Construction and Management in Hydropower Engineering (Grant No. 2020KSD06).
Abstract: Numerical simulation of concrete-faced rockfill dams (CFRDs) considering the spatial variability of rockfill has become a popular research topic in recent years. To determine uncertain rockfill properties efficiently and reliably, this study developed an uncertainty inversion analysis method for rockfill material parameters using the stacking ensemble strategy and the Jaya optimizer. The comprehensive implementation process of the proposed model was described with an illustrative CFRD example. First, a surrogate model built with the stacking ensemble algorithm was used to conduct the Monte Carlo stochastic finite element calculations with reduced computational cost and improved accuracy. Afterwards, the Jaya algorithm was used to inversely calculate the combination of the coefficients of variation of the rockfill material parameters. This optimizer obtained higher accuracy and more significant uncertainty reduction than traditional optimizers. Overall, the developed model effectively identified the random parameters of rockfill materials. This study provides scientific references for uncertainty analysis of CFRDs. In addition, the proposed method can be applied to other similar engineering structures.
Abstract: As a result of the increased number of COVID-19 cases, Ensemble Machine Learning (EML) would be an effective tool for combatting this pandemic outbreak. An ensemble of classifiers can improve the performance of single machine learning (ML) classifiers, especially stacking-based ensemble learning. Stacking utilizes heterogeneous base learners trained in parallel and combines their predictions using a meta-model to determine the final prediction results. However, building an ensemble often causes the model performance to decrease when a growing number of learners is not properly selected. Therefore, the goal of this paper is to develop and evaluate a generic, data-independent predictive method using stacking-based ensemble learning optimized by a Genetic Algorithm (GA), termed GA-Stacking, for outbreak prediction and health decision-aided processes. GA-Stacking utilizes five well-known models at its first level: Decision Tree (DT), Random Forest (RF), Ridge regression, Least Absolute Shrinkage and Selection Operator (LASSO), and eXtreme Gradient Boosting (XGBoost). It also uses the GA to determine the number, combination, and trust of these base classifiers, with the Mean Squared Error (MSE) as the fitness function. At the second level of the stacked ensemble model, a Linear Regression (LR) classifier is used to produce the final prediction. The performance of the model was evaluated using a publicly available dataset from the Center for Systems Science and Engineering, Johns Hopkins University, which consisted of 10,722 data samples. The experimental results indicated that the GA-Stacking model achieved outstanding performance, with an overall accuracy of 99.99% for the three selected countries. Furthermore, the proposed model achieved good performance when compared with existing bagging-based approaches. The proposed model can be used to predict the pandemic outbreak correctly and may be applied as a generic data-independent model to predict the epidemic trend for other countries when comparing preventive and control measures.
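The two-level idea described in this abstract (base learners whose predictions feed a meta-model) can be sketched in a few lines. This is a toy illustration only: `Stump`, `TableMeta`, `fit_stacking`, and `predict_stacking` are hypothetical names, the base learners are simple threshold rules rather than DT/RF/Ridge/LASSO/XGBoost, and the GA-based selection step is not shown. The point it illustrates is that the meta-model is trained on out-of-fold predictions, so it never sees a base prediction made on a sample the base learner was fitted on.

```python
from collections import Counter, defaultdict


class Stump:
    """Toy base learner: thresholds one feature at the midpoint of the class means."""

    def __init__(self, feature):
        self.feature = feature

    def fit(self, X, y):
        pos = [x[self.feature] for x, t in zip(X, y) if t == 1]
        neg = [x[self.feature] for x, t in zip(X, y) if t == 0]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        self.threshold = (mean_pos + mean_neg) / 2
        self.pos_above = mean_pos >= mean_neg
        return self

    def predict(self, X):
        return [int((x[self.feature] > self.threshold) == self.pos_above) for x in X]


class TableMeta:
    """Toy meta-model: majority label seen for each tuple of base predictions."""

    def fit(self, Z, y):
        votes = defaultdict(Counter)
        for z, t in zip(Z, y):
            votes[tuple(z)][t] += 1
        self.table = {z: c.most_common(1)[0][0] for z, c in votes.items()}
        return self

    def predict(self, Z):
        # Unseen combinations fall back to a plain vote over the base predictions.
        return [self.table.get(tuple(z), Counter(z).most_common(1)[0][0]) for z in Z]


def fit_stacking(factories, X, y, k=3):
    """Level 0: collect out-of-fold predictions; level 1: fit the meta-model on them."""
    n = len(X)
    oof = [None] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        models = [f().fit([X[i] for i in train], [y[i] for i in train]) for f in factories]
        preds = [m.predict([X[i] for i in held_out]) for m in models]
        for pos, i in enumerate(held_out):
            oof[i] = tuple(p[pos] for p in preds)
    meta = TableMeta().fit(oof, y)
    base = [f().fit(X, y) for f in factories]  # refit base learners on all data
    return base, meta


def predict_stacking(base, meta, X):
    Z = list(zip(*(m.predict(X) for m in base)))
    return meta.predict(Z)


# Tiny made-up dataset: feature 0 separates the classes, feature 1 is noise.
X = [(0.0, 0.5), (0.8, 0.4), (0.1, 0.6), (0.9, 0.5), (0.2, 0.3), (1.0, 0.7),
     (0.3, 0.6), (0.7, 0.2), (0.2, 0.8), (0.9, 0.4), (0.1, 0.1), (0.8, 0.9)]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
base, meta = fit_stacking([lambda: Stump(0), lambda: Stump(1)], X, y)
```

The sketch assumes every training fold contains both classes, which holds for the interleaved toy data above; a practical implementation would use stratified folds.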
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2021R1A5A8033165) and by the “Human Resources Program in Energy Technology” of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), with financial resources granted by the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20214000000200).
Abstract: Recently, machine learning-based technologies have been developed to automate the classification of wafer map defect patterns during semiconductor manufacturing. Existing approaches to wafer map pattern classification include directly learning the image through a convolutional neural network and applying an ensemble method after extracting image features. This study aims to classify wafer map defects more effectively and to derive algorithms that remain robust even for datasets with insufficient defect patterns. Because the number of defects observed during the actual process may be limited, additional samples for under-represented patterns are generated using a convolutional auto-encoder (CAE), and the expanded data are verified using the structural similarity index measure (SSIM). After extracting handcrafted features, a boosted stacking ensemble model that integrates four base-level classifiers with the extreme gradient boosting classifier as a meta-level classifier is designed and trained on the expanded data for final prediction. Since the proposed algorithm performs better than existing ensemble classifiers even for insufficient defect patterns, the results of this study will contribute to improving the product quality and yield of the actual semiconductor manufacturing process.
Abstract: BACKGROUND: There is a lack of literature discussing the utilization of the stacking ensemble algorithm for predicting depression in patients with heart failure (HF). AIM: To create a stacking model for predicting depression in patients with HF. METHODS: This study analyzed data on 1084 HF patients from the National Health and Nutrition Examination Survey database spanning from 2005 to 2018. Through univariate analysis and the use of an artificial neural network algorithm, predictors significantly linked to depression were identified. These predictors were utilized to create a stacking model employing tree-based learners. The performances of both the individual models and the stacking model were assessed by using the test dataset. Furthermore, the SHapley Additive exPlanations (SHAP) model was applied to interpret the stacking model. RESULTS: The models included five predictors. Among these models, the stacking model demonstrated the highest performance, achieving an area under the curve of 0.77 (95% CI: 0.71-0.84), a sensitivity of 0.71, and a specificity of 0.68. The calibration curve supported the reliability of the models, and decision curve analysis confirmed their clinical value. The SHAP plot demonstrated that age had the most significant impact on the stacking model's output. CONCLUSION: The stacking model demonstrated strong predictive performance. Clinicians can utilize this model to identify high-risk depression patients with HF, thus enabling early provision of psychological interventions.
Funding: We acknowledge the funding support from the Australian Research Council (Grant Nos. DP200100549 and IH180100010).
Abstract: Slope failures lead to catastrophic consequences in numerous countries, and thus the stability assessment of slopes is of high interest in geotechnical and geological engineering research. A hybrid stacking ensemble approach is proposed in this study for enhancing the prediction of slope stability. In the hybrid stacking ensemble approach, we used an artificial bee colony (ABC) algorithm to find the best combination of base classifiers (level 0) and determined a suitable meta-classifier (level 1) from a pool of 11 individual optimized machine learning (OML) algorithms. Finite element analysis (FEA) was conducted to form the synthetic database for the training stage (150 cases) of the proposed model, while 107 real field slope cases were used for the testing stage. The results of the hybrid stacking ensemble approach were then compared with those obtained by the 11 individual OML methods using the confusion matrix, F1-score, and area under the curve (AUC-score). The comparisons showed that the hybrid stacking ensemble achieved a significant improvement in the prediction of slope stability (AUC of 90.4%), which is 7% higher than the best of the 11 individual OML methods (AUC of 82.9%). A further comparison was then undertaken between the hybrid stacking ensemble method and a basic ensemble classifier on slope stability prediction. The results showed that the hybrid stacking ensemble method clearly outperformed the basic ensemble method. Finally, the importance of the variables for slope stability was studied using the linear vector quantization (LVQ) method.
Funding: Funded by the National Natural Science Foundation of China (Grant No. 41941019) and the State Key Laboratory of Hydroscience and Engineering (Grant No. 2019-KY-03).
Abstract: Real-time prediction of the rock mass class in front of the tunnel face is essential for the adaptive adjustment of tunnel boring machines (TBMs). During the TBM tunnelling process, a large amount of operation data is generated, reflecting the interaction between the TBM system and the surrounding rock, and these data can be used to evaluate the rock mass quality. This study proposed a stacking ensemble classifier for the real-time prediction of the rock mass classification using TBM operation data. Based on the Songhua River water conveyance project, a total of 7538 TBM tunnelling cycles and the corresponding rock mass classes were obtained after data preprocessing. Then, through a tree-based feature selection method, 10 key TBM operation parameters were selected, and the mean values of the 10 selected features in the stable phase, after removing outliers, were calculated as the inputs of the classifiers. The preprocessed data were randomly divided into a training set (90%) and a test set (10%) using simple random sampling. Besides the stacking ensemble classifier, seven individual classifiers were established for comparison: support vector machine (SVM), k-nearest neighbors (KNN), random forest (RF), gradient boosting decision tree (GBDT), decision tree (DT), logistic regression (LR), and multilayer perceptron (MLP), with the hyper-parameters of each classifier optimised using the grid search method. The prediction results show that the stacking ensemble classifier performs better than the individual classifiers, and it shows a more powerful learning and generalisation ability for small and imbalanced samples. Additionally, a relatively balanced training set is obtained by the synthetic minority oversampling technique (SMOTE), and the influence of sample imbalance on the prediction performance is discussed.
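The oversampling step mentioned at the end of this abstract can be illustrated with a SMOTE-style interpolation. This is a simplified sketch of the general technique, not the authors' implementation: `smote_like` and its parameters are hypothetical names, and library implementations add details omitted here. Each synthetic sample is a random point on the segment between a minority sample and one of its `k` nearest minority neighbours.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation.

    minority: list of equal-length numeric tuples (needs at least 2 samples).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        mate = rng.choice(neighbours)
        u = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(base, mate)))
    return synthetic


# Made-up minority-class points, for illustration only.
minority_points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
new_points = smote_like(minority_points, n_new=5)
```

Oversampling of this kind is normally applied to the training set only, so that the test set keeps the original class distribution.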
Abstract: The exponential growth of Internet and network usage has necessitated heightened security measures to protect against data and network breaches. Intrusions, executed through network packets, pose a significant challenge for firewalls to detect and prevent because of the similarity between legitimate and intrusion traffic. The vast network traffic volume also complicates most network monitoring systems and algorithms. Several intrusion detection methods have been proposed, with machine learning techniques regarded as promising for dealing with these incidents. This study presents an Intrusion Detection System (IDS) based on stacking ensemble learning with Random Forest, Decision Tree, and k-Nearest-Neighbors as base models. The proposed system employs pre-processing techniques to enhance classification efficiency and integrates seven machine learning algorithms. The stacking ensemble technique increases performance by incorporating the three base models (Random Forest, Decision Tree, and k-Nearest-Neighbors) and a meta-model represented by the Logistic Regression algorithm. Evaluated using the UNSW-NB15 dataset, the proposed IDS achieved an accuracy of 96.16% in the training phase and 97.95% in the testing phase, with precision of 97.78% and 98.40% for training and testing, respectively. The obtained results also demonstrate improvements in other measurement criteria.
Funding: Primarily supported by the Ministry of Higher Education through the MRUN Young Researchers Grant Scheme (MY-RGS), MR001-2019, entitled “Climate Change Mitigation: Artificial Intelligence-Based Integrated Environmental System for Mangrove Forest Conservation,” received by K.H., S.A.R., H.F.H., M.I.M., and M.M.A.; secondarily funded by the UM-RU Grant, ST065-2021, entitled “Climate Smart Mitigation and Adaptation: Integrated Climate Resilience Strategy for Tropical Marine Ecosystem.”
Abstract: A common difficulty in building prediction models with real-world environmental datasets is the skewed distribution of classes. There are significantly more samples for day-to-day classes, while rare events such as polluted classes are uncommon. Consequently, the limited availability of minority outcomes lowers the classifier’s overall reliability. This study assesses the capability of machine learning (ML) algorithms in tackling imbalanced water quality data based on the metrics of precision, recall, and F1 score. The aim is to counter accuracy figures that are skewed towards the majority class. Hence, the performance of 10 ML algorithms is compared. The classifiers included are AdaBoost, Support Vector Machine, Linear Discriminant Analysis, k-Nearest Neighbors, Naive Bayes, Decision Trees, Random Forest, Extra Trees, Bagging, and the Multilayer Perceptron. This study also uses the Easy Ensemble Classifier, Balanced Bagging, and RUSBoost algorithms to evaluate multi-class imbalanced learning methods. The comparison results revealed that a high-accuracy machine learning model is not always good in recall and sensitivity. This paper’s stacked ensemble deep learning (SE-DL) generalization model effectively classifies the water quality index (WQI) based on 23 input variables. The proposed algorithm achieved averages of 95.69%, 94.96%, 92.92%, and 93.88% for accuracy, precision, recall, and F1 score, respectively. In addition, the proposed model is compared against two state-of-the-art classifiers, XGBoost (eXtreme Gradient Boosting) and Light Gradient Boosting Machine, with balanced accuracy and G-mean included as performance metrics. In this experimental setup, XGBoost achieved a higher balanced accuracy and G-mean. However, the SE-DL model has a better and more balanced performance in the F1 score. The SE-DL model aligns with the goal of this study to ensure the balance between accuracy and completeness for each water quality class. The proposed algorithm is also more efficient, with a lower computational time, than the standard Synthetic Minority Oversampling Technique (SMOTE) approach to imbalanced datasets.
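Why plain accuracy can mislead on skewed classes, and what balanced accuracy and G-mean measure instead, can be shown with a small sketch. It assumes the common multi-class definitions (balanced accuracy as the arithmetic mean of per-class recall, G-mean as the geometric mean of per-class recall); the abstract does not state which variants the authors used, and the labels below are made up.

```python
import math


def per_class_recall(y_true, y_pred):
    """Recall of each class: correctly predicted samples / samples of that class."""
    recalls = {}
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls[c] = hits / total
    return recalls


def balanced_accuracy(y_true, y_pred):
    r = per_class_recall(y_true, y_pred)
    return sum(r.values()) / len(r)


def g_mean(y_true, y_pred):
    r = per_class_recall(y_true, y_pred)
    return math.prod(r.values()) ** (1 / len(r))


# Made-up example: 8 majority samples, 2 minority samples, one minority sample missed.
y_true = ["clean"] * 8 + ["polluted"] * 2
y_pred = ["clean"] * 8 + ["clean", "polluted"]
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case plain accuracy is 0.9 although half of the minority class is missed, whereas balanced accuracy drops to 0.75 and G-mean to about 0.71.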
Funding: National Natural Science Foundation of China, No. 42271037; Key Research and Development Program Project of Anhui Province, No. 2022m07020011; The University Synergy Innovation Program of Anhui Province, No. GXXT-2021-048; Science Foundation for Excellent Young Scholars of Anhui, No. 2108085Y13.
Abstract: Flood susceptibility modeling is crucial for rapid flood forecasting, disaster reduction strategies, evacuation planning, and decision-making. Machine learning (ML) models have proven to be effective tools for assessing flood susceptibility. However, most previous studies have focused on individual models or comparative performance, underscoring the unique strengths and weaknesses of each model. In this study, we propose a stacking ensemble learning algorithm that harnesses the strengths of a diverse range of machine learning models. The findings reveal the following: (1) Stacking ensemble learning, using the RF-XGBCB-LR model, significantly enhances flood susceptibility simulation. (2) In addition to rainfall, key flood drivers in the study area include NDVI and impervious surfaces. Over 40% of the study area, primarily in the northeast and southeast, exhibits high flood susceptibility, with higher risks for populations compared to cropland. (3) In the northeast of the study area, heavy precipitation, low terrain, and NDVI values are key indicators contributing to high flood susceptibility, while long-duration precipitation, mountainous topography, and upper-reach vegetation are the main drivers in the southeast. This study underscores the effectiveness of ML, particularly ensemble learning, in flood modeling. It identifies vulnerable areas and contributes to improved flood risk management.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52204117) and the Natural Science Foundation of Hunan Province, China (Grant No. 2022JJ40601).
Abstract: The stability of underground entry-type excavations (UETEs) is of paramount importance for ensuring the safety of mining operations. As more engineering cases are accumulated, machine learning (ML) has demonstrated great potential for the stability evaluation of UETEs. In this study, a hybrid stacking ensemble method aggregating support vector machine (SVM), k-nearest neighbor (KNN), decision tree (DT), random forest (RF), multilayer perceptron neural network (MLPNN), and extreme gradient boosting (XGBoost) algorithms was proposed to assess the stability of UETEs. Firstly, a total of 399 historical cases with two indicators were collected from seven mines. Subsequently, to pursue better evaluation performance, the hyperparameters of the base learners (SVM, KNN, DT, RF, MLPNN, and XGBoost) and the meta learner (MLPNN) were tuned by combining five-fold cross validation (CV) with a simulated annealing (SA) approach. Based on the optimal hyperparameter configuration, the stacking ensemble models were constructed using the training set (75% of the data). Finally, the performance of the proposed approach was evaluated by two global metrics (accuracy and Cohen’s Kappa) and three within-class metrics (macro average of the precision, recall, and F1-score) on the test set (25% of the data). In addition, the evaluation results were compared with those of the six base learners optimized by SA. The hybrid stacking ensemble algorithm achieved better comprehensive performance, with accuracy, Kappa coefficient, and macro-averaged precision, recall, and F1-score of 0.92, 0.851, 0.885, 0.88, and 0.883, respectively. The rock mass rating (RMR) had the most important influence on the evaluation results. Moreover, the critical span graph (CSG) was updated based on the proposed model, representing a significant improvement compared with previous studies. This study can provide valuable guidance for stability analysis and risk management of UETEs. However, it is necessary to consider more indicators and to collect a more extensive and balanced dataset to validate the model in the future.
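The global and within-class metrics listed in this abstract can all be derived from a confusion matrix, as the sketch below shows. It uses the standard definitions and takes macro F1 as the mean of the per-class F1 scores (one common convention; the abstract does not say which was used). `summarize` is a hypothetical helper and the example matrix is made up.

```python
def summarize(cm):
    """Metrics from a confusion matrix where cm[i][j] counts true class i predicted as j."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_tot = [sum(row) for row in cm]                              # samples per true class
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]   # samples per predicted class

    accuracy = sum(cm[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)

    precision = [cm[i][i] / col_tot[i] if col_tot[i] else 0.0 for i in range(k)]
    recall = [cm[i][i] / row_tot[i] if row_tot[i] else 0.0 for i in range(k)]
    f1 = [2 * p * r / (p + r) if p + r else 0.0 for p, r in zip(precision, recall)]

    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "macro_precision": sum(precision) / k,
        "macro_recall": sum(recall) / k,
        "macro_f1": sum(f1) / k,
    }


# Made-up two-class example.
report = summarize([[8, 2], [1, 9]])
```

Cohen's Kappa corrects accuracy for chance agreement, which is why it is reported alongside accuracy when the classes are unevenly represented.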
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2019R1F1A1059346) and by the 2020 Research Fund (Project No. 1.180090.01) of UNIST (Ulsan National Institute of Science and Technology).
Abstract: An anomaly-based intrusion detection system (A-IDS) is a critical part of modern computing infrastructure because it can discover new types of attacks. It commonly relies on machine learning (ML) algorithms for detecting and classifying network traffic. To date, many algorithms have been proposed to improve the detection performance of A-IDS, using either individual or ensemble learners. In particular, ensemble learners have shown remarkable performance over individual learners in many applications, including in the cybersecurity domain. However, most existing works still suffer from unsatisfactory results due to improper ensemble design. The aim of this study is to emphasize the effectiveness of a stacking ensemble-based model for A-IDS, where deep learning (e.g., a deep neural network [DNN]) is used as the base learner model. The effectiveness of the proposed model and the base DNN model is benchmarked empirically in terms of several performance metrics, i.e., Matthews correlation coefficient, accuracy, and false alarm rate. The results indicate that the proposed model is superior to the base DNN model as well as to other existing ML algorithms found in the literature.
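Two of the metrics named in this abstract are less familiar than accuracy and can be written directly from the binary confusion counts. The sketch below uses the standard definitions, with "attack" as the positive class and false alarm rate taken as the false positive rate; the function names and the counts are illustrative assumptions, not values from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when a marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def false_alarm_rate(fp, tn):
    """Share of normal traffic that is wrongly flagged as an attack."""
    return fp / (fp + tn)


# Made-up counts: 100 attack flows and 100 normal flows.
tp, fn = 90, 10   # attacks detected / missed
tn, fp = 85, 15   # normal flows passed / wrongly flagged
mcc_value = matthews_corrcoef(tp, tn, fp, fn)
far_value = false_alarm_rate(fp, tn)
```

Unlike accuracy, the Matthews coefficient uses all four cells of the confusion matrix, so it stays informative when attack and normal traffic are heavily imbalanced.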
Abstract: Detecting malicious traffic over the internet is a challenging research area in protecting network infrastructures from malicious activity. An attacker can leverage several shortcomings of a network system to gain unauthorized access through malicious traffic. Safeguarding against such attacks requires an efficient automatic system that can detect malicious traffic in a timely manner and avoid system damage. Many automated systems can currently detect malicious activity; however, their efficacy and accuracy need further improvement to detect malicious traffic from multi-domain systems. The present study focuses on detecting malicious traffic with high accuracy using machine learning techniques. The proposed approach used two datasets, UNSW-NB15 and IoTID20, which contain data for local network traffic and IoT-based traffic, respectively. Both datasets were combined to increase the capability of the proposed approach in detecting malicious traffic from local and IoT networks with high accuracy. Horizontally merging both datasets requires an equal number of features, which was achieved by reducing the feature count to 30 for each dataset using principal component analysis (PCA). The proposed model incorporates a stacked ensemble model, extra boosting forest (EBF), which combines tree-based models (extra tree classifier, gradient boosting classifier, and random forest) using a stacked ensemble approach. Empirical results show that EBF performed significantly better and achieved the highest accuracy scores of 0.985 and 0.984 on the multi-domain dataset for two and four classes, respectively.
Funding: The research is funded by the Researchers Supporting Project at King Saud University (Project No. RSP-2021/305).
Abstract: COVID-19 is a growing problem worldwide with a high mortality rate. As a result, the World Health Organization (WHO) declared it a pandemic. To limit the spread of the disease, a fast and accurate diagnosis is required. A reverse transcription polymerase chain reaction (RT-PCR) test is often used to detect the disease. However, since this test is time-consuming, a chest computed tomography (CT) scan or plain chest X-ray (CXR) is sometimes indicated. The value of automated diagnosis is that it saves time and money by minimizing human effort. Our research makes three significant contributions. First, it uses the essential fine-tuning methodology to test the behavior and efficiency of a variety of vision models, ranging from Inception to Neural Architecture Search (NAS) networks. Second, the behavior of these models is visually analyzed by plotting class activation maps (CAMs) for individual networks and assessing classification efficiency with AUC-ROC curves. Finally, stacked ensemble techniques were used to provide greater generalization by combining fine-tuned models with six ensemble neural networks. Using stacked ensembles, the generalization of the models improved. Furthermore, the ensemble model created by combining all of the fine-tuned networks obtained a state-of-the-art COVID-19 detection accuracy of 99.17%. The precision and recall rates were 99.99% and 89.79%, respectively, highlighting the robustness of stacked ensembles. According to the experimental results, the proposed ensemble approach performed well in the classification of COVID-19 lesions on CXR.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52078428) and the Sichuan Outstanding Young Science and Technology Talent Project, China (Grant No. 2020JDJQ0032).
Abstract: Rockburst is a common geological disaster in deep tunnel engineering. It causes great harm and occurs at random locations and times. These characteristics seriously affect tunnel construction and threaten the physical and mental health and safety of workers. Therefore, it is of great significance to study rockburst tendency in the early stages of tunnel survey, design, and construction. At present, there is no unified method or agreed set of parameters for rockburst prediction. In view of the large differences between rockburst criteria and the class imbalance of rockburst databases, this paper presents a two-step rockburst prediction method based on multiple factors and the stacking ensemble algorithm. Considering the influence of rock physical and mechanical parameters, tunnel face conditions, and excavation disturbance, multiple rockburst criteria are predicted by integrating multiple machine learning algorithms. A combined prediction model of rockburst criteria is established, and the results of each rockburst criterion index are weighted and combined, with the weights updated using the field rockburst record. The dynamic weight is combined with the cloud model to comprehensively evaluate the regional rockburst risk. Field results from applying the model in the Grand Canyon tunnel show that the rockburst prediction method proposed in this paper has better applicability and higher accuracy than a single rockburst criterion.
Funding: Funded through the Research Group Program under Grant Number (R.G.P.2/382/44).
Abstract: Many plant species have a startling degree of morphological similarity, making it difficult to segment and categorize them reliably. Unknown plant species can be challenging to classify and segment using deep learning. While deep learning architectures have helped improve classification accuracy, the resulting models often lack flexibility and require a large dataset to train. For the sake of taxonomy, this research proposes a hybrid method for categorizing guava, potato, and java plum leaves. Two new approaches are used to form the hybrid model suggested here. The first model, built on the MobileNetV2-UNET architecture, successfully segments the guava, potato, and java plum plant species. As a second model, we use a Plant Species Detection Stacking Ensemble Deep Learning Model (PSD-SE-DLM) to identify potatoes, java plums, and guava. The proposed models were trained using data collected in Punjab, Pakistan, consisting of images of healthy and diseased leaves from guava, java plum, and potatoes. These datasets are known as PLSD and PLSSD. Accuracy levels of 99.84% and 96.38% were achieved for the suggested PSD-SE-DLM and MobileNetV2-UNET models, respectively.
Funding: Supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R513), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Cross-Site Scripting (XSS) remains a significant threat to web application security, exploiting vulnerabilities to hijack user sessions and steal sensitive data. Traditional detection methods often fail to keep pace with the evolving sophistication of cyber threats. This paper introduces a novel hybrid ensemble learning framework that combines several machine learning algorithms: Logistic Regression (LR), Support Vector Machines (SVM), eXtreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), and Deep Neural Networks (DNN). Utilizing the XSS-Attacks-2021 dataset, which comprises 460 instances across various real-world traffic-related scenarios, this framework significantly enhances XSS attack detection. Our approach, which includes rigorous feature engineering and model tuning, not only optimizes accuracy but also effectively minimizes false positives (FP) (0.13%) and false negatives (FN) (0.19%). The methodology has been rigorously validated, achieving an accuracy of 99.87%. The proposed system is scalable and efficient, capable of adapting to the increasing number of web applications and user demands without a decline in performance. It detects XSS attacks dynamically in real time, maintaining high accuracy and low latency even under significant loads. Furthermore, despite the computational complexity introduced by the hybrid ensemble approach, strategic use of parallel processing and algorithm tuning ensures that the system remains scalable and performs robustly in real-time applications. Designed for easy integration with existing web security systems, our framework supports adaptable Application Programming Interfaces (APIs) and a modular design, facilitating seamless augmentation of current defenses. This innovation represents a significant advancement in cybersecurity, offering a scalable and effective solution for securing modern web applications against evolving threats.
Abstract: Due to the availability of a huge number of electronic text documents from a variety of sources representing unstructured and semi-structured information, document classification has become an interesting area for controlling data behavior. This paper presents a multimodal document classification framework for categorizing textual semi-structured and unstructured documents. The framework implements several individual deep learning models: Deep Neural Networks (DNN), Recurrent Convolutional Neural Networks (RCNN), and Bidirectional LSTM (Bi-LSTM). A stacked-ensemble meta-model technique is used to combine the results of the individual classifiers, producing better results than any of the above-mentioned models achieves individually. A series of textual preprocessing steps is executed to normalize the input corpus, followed by text vectorization. The vectorization techniques include Term Frequency-Inverse Document Frequency (TF-IDF) and Continuous Bag of Words (CBOW), which convert text data into a numeric form suitable for deep learning models. The proposed model is validated using a dataset collected from several sources with a huge number of documents in every class, and the experimental results show that it achieves effective performance. For PDF document classification, the proposed model achieved accuracy of up to 0.9045 and 0.959 for the TF-IDF and CBOW features, respectively. For JSON document classification, it achieved accuracy of up to 0.914 and 0.956 for the TF-IDF and CBOW features, respectively. For XML document classification, it achieved accuracy of up to 0.92 and 0.959 for the TF-IDF and CBOW features, respectively.
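To make the TF-IDF vectorization step concrete, here is a minimal sketch using only the Python standard library. It assumes whitespace tokenization and the plain `tf * log(N / df)` weighting. The abstract does not state which TF-IDF variant or preprocessing the authors used, and the function name `tfidf` and the sample documents are illustrative only.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, vectors) with one TF-IDF vector per document.

    tf  = term count / number of tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vocab = sorted(df)
    idf = {term: math.log(n_docs / df[term]) for term in vocab}
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        length = len(tokens) or 1  # guard against empty documents
        vectors.append([counts[term] / length * idf[term] for term in vocab])
    return vocab, vectors


vocab, vectors = tfidf(["xml schema file", "json schema file", "pdf report file"])
```

A term that occurs in every document, such as `file` here, gets an IDF of zero and so contributes nothing to any vector. That is the behaviour that lets TF-IDF downweight uninformative words before the vectors are passed to a classifier.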
Funding: The authors would like to express their gratitude to Taif University, Taif, Saudi Arabia, for providing administrative and technical support. This work was supported by the Taif University Researchers Supporting Project number (TURSP-2020/254).
Abstract: This research work proposes a new stacked generalization ensemble model to forecast the incidence of conjunctivitis. In addition to forecasting conjunctivitis incidence, the proposed ensemble improves forecasting performance over its individual component models. The weekly rate of acute conjunctivitis per 1000 for Hong Kong was collected from the first week of January 2010 to the last week of December 2019. Pre-processing techniques such as imputation of missing values and logarithmic transformation are applied to the data sets. A stacked generalization ensemble model based on Auto-ARIMA (Autoregressive Integrated Moving Average), NNAR (Neural Network Autoregression), ETS (Exponential Smoothing), and HW (Holt-Winters) is proposed and applied to the dataset. Predictive analysis is conducted on the collected conjunctivitis dataset, and the models are compared on different performance measures. The results show that the RMSE (Root Mean Square Error), MAE (Mean Absolute Error), MAPE (Mean Absolute Percentage Error), and ACF1 (autocorrelation function at lag 1) of the proposed ensemble decrease significantly. Considering the RMSE, for instance, error values are reduced by 39.23%, 9.13%, 20.42%, and 17.13% in comparison to the Auto-ARIMA, NNAR, ETS, and HW models, respectively. This research concludes that the accuracy of disease forecasting can be significantly increased by applying the proposed stacked generalization ensemble model, as it minimizes the prediction error and hence provides better prediction trends than the Auto-ARIMA, NNAR, ETS, and HW models applied individually.
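The error measures named in this abstract, and the "reduced by X%" comparison, follow standard definitions that can be sketched as below. The function names and the sample numbers are illustrative and are not taken from the paper. MAPE as written assumes no actual value is zero.

```python
import math


def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def reduction_pct(baseline_error, ensemble_error):
    """Relative error reduction of the ensemble versus a baseline, in percent."""
    return 100 * (baseline_error - ensemble_error) / baseline_error


# Illustrative values only.
actual = [1.0, 2.0, 3.0]
single_model = [1.0, 2.0, 5.0]
ensemble = [1.0, 2.0, 4.0]
improvement = reduction_pct(rmse(actual, single_model), rmse(actual, ensemble))
```

Under this definition, a figure such as the 39.23% reported against Auto-ARIMA would mean that the ensemble's RMSE is about 60.77% of the Auto-ARIMA RMSE, assuming the authors computed the reduction relative to the baseline error.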