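Learning Outcome Prediction Method Based on an Improved SMOTE Algorithm and an Ensemble Model (cited by: 1)
1
Authors: 王晓勇 胡胜利 《中北大学学报(自然科学版)》 CAS 2024, No. 3, pp. 257-264 (8 pages)
To address the poor applicability of any single machine learning algorithm across data classification and prediction tasks in different domains, and to mitigate the impact of severe dataset imbalance on predictive performance, a data classification method based on the synthetic minority over-sampling technique (SMOTE) and an Ensemble model is proposed. The traditional SMOTE algorithm generates new synthetic samples by interpolating between minority-class samples, but the synthetic samples contain noise and are highly similar to one another. An improved SMOTE algorithm is therefore proposed that uses distance calculations to remove noisy and easily confused samples, yielding clean synthetic samples with high discriminability. The Ensemble method is then used to adjust sample and classifier weights and to assemble a strong classifier with better classification performance. Experiments on the public online learning dataset Kalboard360 show that, with the extremely randomized trees (ERT) classifier, combining the improved SMOTE with the Ensemble model achieves a prediction accuracy of 97.9%, an improvement of 5.5% over a single ERT classifier. This demonstrates that the improved SMOTE algorithm can generate high-quality balanced data and that the ensemble learning model significantly outperforms a single machine learning algorithm.
Keywords: machine learning; neural networks; data mining; ensemble learning; data balancing; learning outcome prediction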
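The interpolation step of traditional SMOTE mentioned in this abstract can be sketched as follows. This is a minimal illustration of plain SMOTE only, not the authors' improved variant (their noise-removal step is not described in enough detail here to reproduce); the function name `smote_samples`, the parameter `k`, and the toy data are assumptions made for the example.

```python
import math
import random


def smote_samples(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by plain SMOTE interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # pick a random position on the segment from base to neighbour
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical 2-D minority-class samples, for illustration only
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_samples(minority, n_new=6)
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already covered by the minority class, which is also why they can end up very similar to one another.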
Non-crossing Quantile Regression Neural Network as a Calibration Tool for Ensemble Weather Forecasts (cited by: 1)
2
Authors: Mengmeng SONG Dazhi YANG Sebastian LERCH Xiang'ao XIA Gokhan Mert YAGLI Jamie M. BRIGHT Yanbo SHEN Bai LIU Xingli LIU Martin Janos MAYER 《Advances in Atmospheric Sciences》 SCIE CAS CSCD 2024, No. 7, pp. 1417-1437 (21 pages)
Despite the maturity of ensemble numerical weather prediction (NWP), the resulting forecasts are still, more often than not, under-dispersed. As such, forecast calibration tools have become popular. Among those tools, quantile regression (QR) is highly competitive in terms of both flexibility and predictive performance. Nevertheless, a long-standing problem of QR is quantile crossing, which greatly limits the interpretability of QR-calibrated forecasts. On this point, this study proposes a non-crossing quantile regression neural network (NCQRNN) for calibrating ensemble NWP forecasts into a set of reliable quantile forecasts without crossing. The overarching design principle of NCQRNN is to add on top of the conventional QRNN structure another hidden layer, which imposes a non-decreasing mapping from the combined output of the nodes of the last hidden layer to the nodes of the output layer, through a triangular weight matrix with positive entries. The empirical part of the work considers a solar irradiance case study, in which four years of ensemble irradiance forecasts at seven locations, issued by the European Centre for Medium-Range Weather Forecasts, are calibrated via NCQRNN, as well as via an eclectic mix of benchmarking models, ranging from the naïve climatology to the state-of-the-art deep-learning and other non-crossing models. Formal and stringent forecast verification suggests that the forecasts post-processed via NCQRNN attain the maximum sharpness subject to calibration, amongst all competitors. Furthermore, the proposed conception to resolve quantile crossing is remarkably simple yet general, and thus has broad applicability, as it can be integrated with many shallow- and deep-learning-based neural networks.
Keywords: ensemble weather forecasting; forecast calibration; non-crossing quantile regression neural network; CORP reliability diagram; POST-PROCESSING
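The idea of enforcing ordered quantiles through a triangular matrix with positive entries can be illustrated with a deliberately simplified sketch. It is not the NCQRNN architecture itself: it assumes the simplest such matrix (lower-triangular, all ones) applied to positive increments, and the names `softplus` and `non_crossing_quantiles` as well as the sample values are hypothetical.

```python
import math


def softplus(x):
    """Smooth map from any real number to a strictly positive one."""
    return math.log1p(math.exp(x))


def non_crossing_quantiles(raw_outputs):
    """Turn unconstrained network outputs into ordered quantiles.

    The first output is used as the lowest quantile. Every later output is
    mapped to a positive increment and accumulated, which is equivalent to
    multiplying the increments by a lower-triangular matrix of ones.
    """
    quantiles = [raw_outputs[0]]
    for raw in raw_outputs[1:]:
        quantiles.append(quantiles[-1] + softplus(raw))
    return quantiles


# Hypothetical raw outputs for five quantile levels (e.g. irradiance in W/m^2)
raw = [210.0, -1.5, 0.3, -4.0, 2.0]
q = non_crossing_quantiles(raw)
```

Whatever values the raw outputs take, each quantile exceeds the previous one by a positive amount, so crossing cannot occur by construction.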
Detection and defending the XSS attack using novel hybrid stacking ensemble learning-based DNN approach (cited by: 1)
3
Authors: Muralitharan Krishnan Yongdo Lim Seethalakshmi Perumal Gayathri Palanisamy 《Digital Communications and Networks》 SCIE CSCD 2024, No. 3, pp. 716-727 (12 pages)
Existing web-based security applications have failed in many situations due to the great intelligence of attackers. Among attacks on web applications, Cross-Site Scripting (XSS) is one of the dangerous assaults experienced while modifying an organization's or user's information. To avoid these security challenges, this article proposes a novel, all-encompassing combination of machine learning (NB, SVM, k-NN) and deep learning (RNN, CNN, LSTM) frameworks for detecting and defending against XSS attacks with high accuracy and efficiency. Based on the representation, a novel idea for merging stacking ensemble with web applications, termed “hybrid stacking”, is proposed. To implement these methods, four distinct datasets, each of which contains both safe and unsafe content, are considered. The hybrid detection method can adaptively identify attacks from the URL, and the defense mechanism inherits the advantages of URL encoding with dictionary-based mapping to improve prediction accuracy, accelerate the training process, and effectively remove unsafe JScript/JavaScript keywords from the URL. The simulation results show that the proposed hybrid model is more efficient than the existing detection methods. It produces more than 99.5% accurate XSS attack classification results (accuracy, precision, recall, f1_score, and Receiver Operating Characteristic (ROC)) and is highly resistant to XSS attacks. To ensure the security of the server's information, the proposed hybrid approach is demonstrated in a real-time environment.
Keywords: Machine learning; Deep neural networks; Classification; Stacking ensemble; XSS attack; URL encoding; JScript/JavaScript; Web security
A New Speed Limit Recognition Methodology Based on Ensemble Learning: Hardware Validation (cited by: 1)
4
Authors: Mohamed Karray Nesrine Triki Mohamed Ksantini 《Computers, Materials & Continua》 SCIE EI 2024, No. 7, pp. 119-138 (20 pages)
Advanced Driver Assistance Systems (ADAS) technologies can assist drivers or be part of automatic driving systems to support the driving process and improve the level of safety and comfort on the road. The Traffic Sign Recognition System (TSRS) is one of the most important components of ADAS. Among the challenges with TSRS is being able to recognize road signs with the highest accuracy and the shortest processing time. Accordingly, this paper introduces a new real-time methodology for recognizing Speed Limit (SL) signs based on a trio of developed modules. Firstly, the Speed Limit Detection (SLD) module uses the Haar Cascade technique to generate a new SL detector in order to localize SL signs within captured frames. Secondly, the Speed Limit Classification (SLC) module, featuring machine learning classifiers alongside a newly developed model called DeepSL, harnesses the power of a CNN architecture to extract intricate features from speed limit sign images, ensuring efficient and precise recognition. In addition, a new Speed Limit Classifiers Fusion (SLCF) module has been developed by combining trained ML classifiers and the DeepSL model using the Dempster-Shafer (DS) theory of belief functions and ensemble learning's voting technique. Through rigorous software and hardware validation processes, the proposed methodology has achieved highly significant F1 scores of 99.98% and 99.96% for DS theory and the voting method, respectively. Furthermore, a prototype encompassing all components demonstrates outstanding reliability and efficacy, with processing times of 150 ms for the Raspberry Pi board and 81.5 ms for the Nano Jetson board, marking a significant advancement in TSRS technology.
Keywords: Driving automation; advanced driver assistance systems (ADAS); traffic sign recognition (TSR); artificial intelligence; ensemble learning; belief functions; voting method
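The two fusion strategies named in this abstract, Dempster-Shafer combination and voting, can be sketched generically as below. This is only an illustration of Dempster's rule of combination and simple majority voting, not the paper's SLCF module; the mass values, the two-label frame, and the variable names (`m_cnn`, `m_svm`) are invented for the example.

```python
from collections import Counter
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # mass assigned to contradictory hypotheses
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # renormalise by the non-conflicting mass
    return {hyp: w / (1.0 - conflict) for hyp, w in combined.items()}


def majority_vote(labels):
    """Return the most frequent label among classifier outputs."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical beliefs of two classifiers about a sign being "30" or "50"
S30, S50 = frozenset({"30"}), frozenset({"50"})
EITHER = S30 | S50  # mass left on "don't know"
m_cnn = {S30: 0.6, S50: 0.1, EITHER: 0.3}
m_svm = {S30: 0.7, S50: 0.2, EITHER: 0.1}

fused = dempster_combine(m_cnn, m_svm)
best = max(fused, key=fused.get)
vote = majority_vote(["30", "50", "30"])
```

Unlike voting, Dempster's rule lets each classifier keep part of its belief on the whole frame (`EITHER`), so an uncertain classifier influences the fused result less than a confident one.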
Intrusion Detection System for Smart Industrial Environments with Ensemble Feature Selection and Deep Convolutional Neural Networks (cited by: 1)
5
Authors: Asad Raza Shahzad Memon Muhammad Ali Nizamani Mahmood Hussain Shah 《Intelligent Automation & Soft Computing》 2024, No. 3, pp. 545-566 (22 pages)
Smart industrial environments use the Industrial Internet of Things (IIoT) for their routine operations and transform their industrial operations with intelligent and driven approaches. However, IIoT devices are vulnerable to cyber threats and exploits due to their connectivity with the internet. Traditional signature-based intrusion detection systems (IDS) are effective in detecting known attacks, but they are unable to detect unknown emerging attacks. Therefore, there is a need for an IDS that can learn from data and detect new threats. Ensemble Machine Learning (ML) and individual Deep Learning (DL) based IDS have been developed, and these individual models achieved low accuracy; however, their performance can be improved with the ensemble stacking technique. In this paper, we propose a Deep Stacked Neural Network (DSNN) based IDS, which consists of two stacked Convolutional Neural Network (CNN) models as base learners and Extreme Gradient Boosting (XGB) as the meta learner. The proposed DSNN model was trained and evaluated with the next-generation dataset TON_IoT. Several pre-processing techniques were applied to prepare the dataset for the model, including ensemble feature selection and the SMOTE technique. Accuracy, precision, recall, F1-score, and false positive rate were used to evaluate the performance of the proposed ensemble model. Our experimental results show that the accuracy for binary classification is 99.61%, which is better than that of the baseline individual DL and ML models. In addition, the proposed IDS model has been compared with similar models, and the proposed DSNN achieved better performance metrics than the other models. The proposed DSNN model will be used to develop an enhanced IDS for threat mitigation in smart industrial environments.
Keywords: Industrial internet of things; smart industrial environment; cyber-attacks; convolutional neural network; ensemble learning
Machine learning ensemble model prediction of northward shift in potato cyst nematodes(Globodera rostochiensis and G.pallida)distribution under climate change conditions
6
Authors: Yitong He Guanjin Wang Yonglin Ren Shan Gao Dong Chu Simon J. McKirdy 《Journal of Integrative Agriculture》 SCIE CAS CSCD 2024, No. 10, pp. 3576-3591 (16 pages)
Potato cyst nematodes (PCNs) are a significant threat to potato production, having caused substantial damage in many countries. Predicting the future distribution of PCN species is crucial to implementing effective biosecurity strategies, especially given the impact of climate change on pest species invasion and distribution. Machine learning (ML), specifically ensemble models, has emerged as a powerful tool in predicting species distributions due to its ability to learn and make predictions based on complex data sets. Thus, this research utilised advanced machine learning techniques to predict the distribution of PCN species under climate change conditions, providing the initial element for invasion risk assessment. We first used Global Climate Models to generate homogeneous climate predictors to mitigate the variation among predictors. Then, five machine learning models were employed to build two groups of ensembles, single-algorithm ensembles (ESA) and multi-algorithm ensembles (EMA), and their performances were compared. In this research, the EMA did not always perform better than the ESA, and the ESA of Artificial Neural Network gave the highest performance while being cost-effective. Prediction results indicated that the distribution range of PCNs would shift northward, with a decrease in tropical zones and an increase in northern latitudes. However, the total area of suitable regions will not change significantly, occupying 16-20% of the total land surface (18% under current conditions). This research alerts policymakers and practitioners to the risk of PCNs' incursion into new regions. Additionally, this ML process offers the capability to track changes in the distribution of various species and provides scientifically grounded evidence for formulating long-term biosecurity plans for their control.
Keywords: invasive species distribution; future climates; homogeneous climate predictors; single-algorithm ensembles; multi-algorithm ensembles; artificial neural network
Statistical Process Monitoring Based on Ensemble Structure Analysis
7
Authors: Likang Shi Chudong Tong Ting Lan Xuhua Shi 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024, No. 8, pp. 1889-1891 (3 pages)
Dear Editor, This letter presents a novel process monitoring model based on ensemble structure analysis (ESA). The ESA model takes advantage of principal component analysis (PCA), locality preserving projections (LPP), and multi-manifold projections (MMP) models, and then combines the multiple solutions within an ensemble result through Bayesian inference. In the developed ESA model, different structure features of the given dataset are taken into account simultaneously; the suitability and reliability of the ESA-based monitoring model are then illustrated through comparison. Introduction: The requirement for ensuring safe operation and improving process efficiency has led to increased research activity in the field of process monitoring.
Keywords: ensemble; PRESERVING; LETTER
EDSUCh:A robust ensemble data summarization method for effective medical diagnosis
8
Authors: Mohiuddin Ahmed A.N.M. Bazlur Rashid 《Digital Communications and Networks》 SCIE CSCD 2024, No. 1, pp. 182-189 (8 pages)
Identifying rare patterns for medical diagnosis is a challenging task due to heterogeneity and the volume of data. Data summarization can create a concise version of the original data that can be used for effective diagnosis. In this paper, we propose an ensemble summarization method that combines clustering and sampling to create a summary of the original data to ensure the inclusion of rare patterns. To the best of our knowledge, there has been no such technique available to augment the performance of anomaly detection techniques and simultaneously increase the efficiency of medical diagnosis. The performance of popular anomaly detection algorithms increases significantly in terms of accuracy and computational complexity when the summaries are used. Therefore, the medical diagnosis becomes more effective, and our experimental results reflect that the combination of the proposed summarization scheme and all underlying algorithms used in this paper outperforms the most popular anomaly detection techniques.
Keywords: Data summarization; ensemble; Medical diagnosis; Sampling
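Automatic Classification of Fetal Health Based on the Local Cascade Ensemble Method
9
Authors: 黄梅佳 李宗辉 郑博伟 《信息技术与信息化》 2024, No. 4, pp. 122-125 (4 pages)
To better assess the intrauterine status of the fetus automatically, a fetal health status classification model based on the local cascade ensemble (LCE) method is proposed. Using a UCI dataset, the ADASYN method is applied to balance the imbalanced dataset by oversampling, the random forest algorithm is then used for feature selection, and finally the LCE method is used to classify fetal status automatically. Experimental results show that the proposed model achieves an average accuracy, precision, recall, and F1 score of 0.9554, 0.9054, 0.9557, and 0.9290, respectively, giving better classification results than traditional machine learning algorithms and effectively reducing the misclassification rate.
Keywords: machine learning; fetal monitoring; automatic classification; Local Cascade ensemble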
Entropy of deterministic trajectory via trajectories ensemble
10
Authors: 彭勇刚 冉翠平 郑雨军 《Chinese Physics B》 SCIE EI CAS CSCD 2024, No. 6, pp. 347-354 (8 pages)
We present a formulation of the single-trajectory entropy using the trajectories ensemble. The single-trajectory entropy is affected by its surrounding trajectories via the distribution function. The single-trajectory entropies are studied in two typical potentials, i.e., harmonic potential and double-well potential, and in a viscous environment by the interacting trajectory method. The results of the trajectory methods agree well with the numerical methods (Monte Carlo simulation and difference equation). An increase (decrease) in single-trajectory entropy could be caused by the absorption (emission) of heat from (to) the thermal environment. Also, some interesting trajectories, which correspond to the rare events in the processes, are demonstrated.
Keywords: trajectory entropy; trajectories ensemble
A redundant subspace weighting procedure for clock ensemble
11
Authors: 徐海 陈煜 刘默驰 王玉琢 《Chinese Physics B》 SCIE EI CAS CSCD 2024, No. 4, pp. 435-442 (8 pages)
A redundant-subspace-weighting (RSW)-based approach is proposed to enhance the frequency stability on a time scale of a clock ensemble. In this method, multiple overlapping subspaces are constructed in the clock ensemble, and the weight of each clock in this ensemble is defined by using the spatial covariance matrix. The superimposition average of covariances in different subspaces reduces the correlations between clocks in the same laboratory to some extent. After optimizing the parameters of this weighting procedure, the frequency stabilities of virtual clock ensembles are significantly improved in most cases.
Keywords: weighting method; redundant subspace; clock ensemble; time scale
Securing Cloud-Encrypted Data:Detecting Ransomware-as-a-Service(RaaS)Attacks through Deep Learning Ensemble
12
Authors: Amardeep Singh Hamad Ali Abosaq Saad Arif Zohaib Mushtaq Muhammad Irfan Ghulam Abbas Arshad Ali Alanoud Al Mazroa 《Computers, Materials & Continua》 SCIE EI 2024, No. 4, pp. 857-873 (17 pages)
Data security assurance is crucial due to the increasing prevalence of cloud computing and its widespread use across different industries, especially in light of the growing number of cybersecurity threats. A major and ever-present threat is Ransomware-as-a-Service (RaaS) assaults, which enable even individuals with minimal technical knowledge to conduct ransomware operations. This study provides a new approach for RaaS attack detection which uses an ensemble of deep learning models. For this purpose, the network intrusion detection dataset “UNSW-NB15” from the Intelligent Security Group of the University of New South Wales, Australia is analyzed. In the initial phase, three separate Multi-Layer Perceptron (MLP) models are developed, based on the rectified linear unit, the scaled exponential linear unit, and the exponential linear unit, respectively. Later, using the combined predictive power of these three MLPs, the RansoDetect Fusion ensemble model is introduced in the suggested methodology. The proposed ensemble technique outperforms previous studies with impressive performance metrics, including 98.79% accuracy and recall, 98.85% precision, and 98.80% F1-score. The empirical results of this study validate the ensemble model's ability to improve cybersecurity defenses by showing that it outperforms individual MLP models. In expanding the field of cybersecurity strategy, this research highlights the significance of combined deep learning models in strengthening intrusion detection systems against sophisticated cyber threats.
Keywords: Cloud encryption; RAAS; ensemble; threat detection; deep learning; CYBERSECURITY
Ensemble Approach Combining Deep Residual Networks and BiGRU with Attention Mechanism for Classification of Heart Arrhythmias
13
Authors: Batyrkhan Omarov Meirzhan Baikuvekov Daniyar Sultan Nurzhan Mukazhanov Madina Suleimenova Maigul Zhekambayeva 《Computers, Materials & Continua》 SCIE EI 2024, No. 7, pp. 341-359 (19 pages)
This research introduces an innovative ensemble approach, combining Deep Residual Networks (ResNets) and Bidirectional Gated Recurrent Units (BiGRU), augmented with an Attention Mechanism, for the classification of heart arrhythmias. The escalating prevalence of cardiovascular diseases necessitates advanced diagnostic tools to enhance accuracy and efficiency. The model leverages the deep hierarchical feature extraction capabilities of ResNets, which are adept at identifying intricate patterns within electrocardiogram (ECG) data, while BiGRU layers capture the temporal dynamics essential for understanding the sequential nature of ECG signals. The integration of an Attention Mechanism refines the model's focus on critical segments of ECG data, ensuring a nuanced analysis that highlights the most informative features for arrhythmia classification. Evaluated on a comprehensive dataset of 12-lead ECG recordings, our ensemble model demonstrates superior performance in distinguishing between various types of arrhythmias, with an accuracy of 98.4%, a precision of 98.1%, a recall of 98%, and an F-score of 98%. This novel combination of convolutional and recurrent neural networks, supplemented by attention-driven mechanisms, advances automated ECG analysis, contributing significantly to healthcare's machine learning applications and presenting a step forward in developing non-invasive, efficient, and reliable tools for early diagnosis and management of heart diseases.
Keywords: CNN; BiGRU; ensemble; deep learning; ECG; ARRHYTHMIA; heart disease
Physics-Constrained Robustness Enhancement for Tree Ensembles Applied in Smart Grid
14
Authors: Zhibo Yang Xiaohan Huang Bingdong Wang Bin Hu Zhenyong Zhang 《Computers, Materials & Continua》 SCIE EI 2024, No. 8, pp. 3001-3019 (19 pages)
With the widespread use of machine learning (ML) technology, the operational efficiency and responsiveness of power grids have been significantly enhanced, allowing smart grids to achieve high levels of automation and intelligence. However, tree ensemble models commonly used in smart grids are vulnerable to adversarial attacks, making it urgent to enhance their robustness. To address this, we propose a robustness enhancement method that incorporates physical constraints into the node-splitting decisions of tree ensembles. Our algorithm improves robustness by developing a dataset of adversarial examples that comply with physical laws, ensuring training data accurately reflects possible attack scenarios while adhering to physical rules. In our experiments, the proposed method increased robustness against adversarial attacks by 100% when applied to real grid data under physical constraints. These results highlight the advantages of our method in maintaining efficient and secure operation of smart grids under adversarial conditions.
Keywords: Tree ensemble; robustness enhancement; adversarial attack; smart grid
An Initial Perturbation Method for the Multiscale Singular Vector in Global Ensemble Prediction
15
Authors: Xin LIU Jing CHEN Yongzhu LIU Zhenhua HUO Zhizhen XU Fajing CHEN Jing WANG Yanan MA Yumeng HAN 《Advances in Atmospheric Sciences》 SCIE CAS CSCD 2024, No. 3, pp. 545-563 (19 pages)
Ensemble prediction is widely used to represent the uncertainty of single deterministic Numerical Weather Prediction (NWP) caused by errors in initial conditions (ICs). The traditional Singular Vector (SV) initial perturbation method tends only to capture synoptic-scale initial uncertainty rather than mesoscale uncertainty in global ensemble prediction. To address this issue, a multiscale SV initial perturbation method based on the China Meteorological Administration Global Ensemble Prediction System (CMA-GEPS) is proposed to quantify multiscale initial uncertainty. The multiscale SV initial perturbation approach entails calculating multiscale SVs at different resolutions with multiple linearized physical processes to capture fast-growing perturbations from mesoscale to synoptic scale in target areas and combining these SVs by using a Gaussian sampling method with amplitude coefficients to generate initial perturbations. Following that, the energy norm, energy spectrum, and structure of multiscale SVs and their impact on GEPS are analyzed based on a batch experiment in different seasons. The results show that the multiscale SV initial perturbations can possess more energy and capture more mesoscale uncertainties than the traditional single-SV method. Meanwhile, multiscale SV initial perturbations can reflect the strongest dynamical instability in target areas. Their performances in global ensemble prediction when compared to single-scale SVs are shown to (i) improve the relationship between the ensemble spread and the root-mean-square error and (ii) provide a better probability forecast skill for atmospheric circulation during the late forecast period and for short- to medium-range precipitation. This study provides scientific evidence and application foundations for the design and development of a multiscale SV initial perturbation method for the GEPS.
Keywords: multiscale uncertainty; singular vector; initial perturbation; global ensemble prediction system
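The relationship between ensemble spread and root-mean-square error that this abstract evaluates can be computed along the following lines. This is a generic sketch of the usual spread-skill diagnostic, not the CMA-GEPS verification code; the function name `spread_and_rmse` and the toy forecasts are assumptions made for illustration.

```python
import math
import statistics


def spread_and_rmse(forecasts, observations):
    """Return (ensemble spread, RMSE of the ensemble mean).

    forecasts[t] is the list of member forecasts for case t and
    observations[t] is the verifying value for that case.
    """
    variances, sq_errors = [], []
    for members, obs in zip(forecasts, observations):
        mean = statistics.fmean(members)
        variances.append(statistics.variance(members))  # sample variance of members
        sq_errors.append((mean - obs) ** 2)
    spread = math.sqrt(statistics.fmean(variances))
    rmse = math.sqrt(statistics.fmean(sq_errors))
    return spread, rmse


# Two hypothetical forecast cases with three ensemble members each
forecasts = [[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]]
observations = [2.5, 5.0]
spread, rmse = spread_and_rmse(forecasts, observations)
```

A spread close to the RMSE is commonly read as a sign that the ensemble represents its own forecast uncertainty well; a spread well below the RMSE suggests under-dispersion.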
Classification and Comprehension of Software Requirements Using Ensemble Learning
16
Authors: Jalil Abbas Arshad Ahmad Syed Muqsit Shaheed Rubia Fatima Sajid Shah Mohammad Elaffendi Gauhar Ali 《Computers, Materials & Continua》 SCIE EI 2024, No. 8, pp. 2839-2855 (17 pages)
The software development process mostly depends on accurately identifying both essential and optional features. Initially, user needs are typically expressed in free-form language, requiring significant time and human resources to translate these into clear functional and non-functional requirements. To address this challenge, various machine learning (ML) methods have been explored to automate the understanding of these requirements, aiming to reduce time and human effort. However, existing techniques often struggle with complex instructions and large-scale projects. In our study, we introduce an innovative approach known as the Functional and Non-functional Requirements Classifier (FNRC). By combining the traditional random forest algorithm with the Accuracy Sliding Window (ASW) technique, we develop optimal sub-ensembles that surpass the initial classifier's accuracy while using fewer trees. Experimental results demonstrate that our FNRC methodology performs robustly across different datasets, achieving a balanced Precision of 75% on the PROMISE dataset and an impressive Recall of 85% on the CCHIT dataset. Both datasets consistently maintain an F-measure around 64%, highlighting FNRC's ability to effectively balance precision and recall in diverse scenarios. These findings contribute to more accurate and efficient software development processes, increasing the probability of achieving successful project outcomes.
Keywords: ensemble learning; machine learning; non-functional requirements; requirement engineering; accuracy sliding window
Application of multi-algorithm ensemble methods in high-dimensional and small-sample data of geotechnical engineering:A case study of swelling pressure of expansive soils
17
Authors: Chao Li Lei Wang Jie Li Yang Chen 《Journal of Rock Mechanics and Geotechnical Engineering》 SCIE CSCD 2024, No. 5, pp. 1896-1917 (22 pages)
Geotechnical engineering data are usually small-sample and high-dimensional, which brings many challenges in predictive modeling. This paper uses a typical high-dimensional and small-sample swell pressure (P_s) dataset to explore the possibility of using multi-algorithm hybrid ensemble and dimensionality reduction methods to mitigate the uncertainty of soil parameter prediction. Based on six machine learning (ML) algorithms, the base learner pool is constructed, and four ensemble methods, Stacking (SG), Blending (BG), Voting regression (VR), and Feature weight linear stacking (FWL), are used for the multi-algorithm ensemble. Furthermore, permutation importance is used for feature dimensionality reduction to mitigate the impact of weakly correlated variables on predictive modeling. The results show that the proposed methods are superior to traditional prediction models and base ML models, where FWL is more suitable for modeling with small-sample datasets, and dimensionality reduction can simplify the data structure and reduce the adverse impact of the small-sample effect, which points the way to feature selection for predictive modeling. Based on the ensemble methods, the feature importance of the five primary factors affecting P_s is the maximum dry density (31.145%), clay fraction (15.876%), swell percent (15.289%), plasticity index (14%), and optimum moisture content (13.69%). The influence of input parameters on P_s is also investigated, and the results are in line with the findings of the existing literature.
Keywords: Expansive soils; Swelling pressure; Machine learning (ML); Multi-algorithm ensemble; Sensitivity analysis
Software Reliability Prediction Using Ensemble Learning on Selected Features in Imbalanced and Balanced Datasets: A Review
18
作者 Suneel Kumar Rath Madhusmita Sahu +5 位作者 Shom Prasad Das Junali Jasmine Jena Chitralekha Jena Baseem Khan Ahmed Ali Pitshou Bokoro 《Computer Systems Science & Engineering》 2024年第6期1513-1536,共24页
Redundancy, correlation, feature irrelevance, and missing samples are just a few of the problems that make software defect data difficult to analyze. Additionally, it can be challenging to maintain an even distribution of data relating to defective and non-defective software: in most experimental settings, data from the non-defective class predominate. The objective of this review study is to demonstrate the effectiveness of combining ensemble learning and feature selection in improving the performance of defect classification. Besides the successful feature selection approach, a novel variant of the ensemble learning technique is analyzed to address the challenges of feature redundancy and data imbalance, providing robustness in the classification process. To overcome these problems and lessen their impact on fault classification performance, the authors carefully integrate effective feature selection with ensemble learning models. Forward selection demonstrates that a significant area under the receiver operating characteristic (ROC) curve can be attributed to only a small subset of features. The greedy forward selection (GFS) technique outperformed Pearson's correlation method when feature selection techniques were evaluated on the datasets. Ensemble learners, such as random forests (RF) and the proposed average probability ensemble (APE), demonstrate greater resistance to the impact of weak features than weighted support vector machines (W-SVMs) and extreme learning machines (ELM). Furthermore, on the NASA and Java datasets, the enhanced average probability ensemble model, which incorporates the greedy forward selection technique into the average probability ensemble model, achieved a remarkably high area under the ROC curve, approaching 1.0. This review emphasizes the importance of meticulously selecting attributes in a software dataset to accurately classify defective components. In addition, the suggested ensemble learning model successfully addressed the aforementioned problems with software data and produced outstanding classification performance.
Keywords: ensemble classifier; hybrid classifier; software reliability prediction
Improving Thyroid Disorder Diagnosis via Ensemble Stacking and Bidirectional Feature Selection
19
Authors: Muhammad Armghan Latif, Zohaib Mushtaq, Saad Arif, Sara Rehman, Muhammad Farrukh Qureshi, Nagwan Abdel Samee, Maali Alabdulhafith, Yeong Hyeon Gu, Mohammed A. Al-masni. Computers, Materials & Continua (SCIE, EI), 2024, Issue 3, pp. 4225-4241 (17 pages)
Thyroid disorders represent a significant global health challenge, with hypothyroidism and hyperthyroidism as two common conditions arising from dysfunction of the thyroid gland. Accurate and timely diagnosis of these disorders is crucial for effective treatment and patient care. This research introduces a comprehensive approach to improving the accuracy of thyroid disorder diagnosis through the integration of ensemble stacking and advanced feature selection techniques. Sequential forward feature selection, sequential backward feature elimination, and bidirectional feature elimination are investigated in this study. For ensemble learning, random forest, adaptive boosting, and bagging classifiers are employed. The effectiveness of these techniques is evaluated using two different datasets obtained from the University of California Irvine (UCI) Machine Learning Repository, both of which undergo preprocessing steps including outlier removal, handling of missing data, data cleansing, and feature reduction. Extensive experimentation demonstrates the success of the proposed ensemble stacking with bidirectional feature elimination, which achieves 100% and 99.86% accuracy in identifying hyperthyroidism and hypothyroidism, respectively. Beyond enhancing detection accuracy, the ensemble stacking model also exhibited streamlined computational complexity, which is pivotal for practical medical applications. It significantly outperformed existing studies with similar objectives, underscoring the viability and effectiveness of the proposed scheme. This research offers an innovative perspective and sets the platform for improved thyroid disorder diagnosis, with broader implications for healthcare and patient well-being.
Keywords: ensemble learning; random forests; boosting; dimensionality reduction; machine learning; smart healthcare; computer-aided diagnosis
Neural network study of the nuclear ground-state spin distribution within a random interaction ensemble
20
Authors: Deng Liu, Alam Noor A, Zhen-Zhen Qin, Yang Lei. Nuclear Science and Techniques (SCIE, EI, CAS, CSCD), 2024, Issue 3, pp. 216-227 (12 pages)
The distribution of the nuclear ground-state spin in a two-body random ensemble (TBRE) was studied using a general classification neural network (NN) model with two-body interaction matrix elements as input features and the corresponding ground-state spins as labels or output predictions. The quantum many-body system problem exceeds the capability of our optimized NNs in terms of accurately predicting the ground-state spin of each sample within the TBRE. However, our NN model effectively captured the statistical properties of the ground-state spin because it learned the empirical regularity of the ground-state spin distribution in TBRE, as discovered by physicists.
Keywords: neural network; two-body random ensemble; spin distribution of nuclear ground state