7,906 articles found
A Learning Outcome Prediction Method Based on an Improved SMOTE Algorithm and an Ensemble Model
1
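Authors 王晓勇 胡胜利 《Journal of North University of China (Natural Science Edition)》 CAS 2024, No. 3, pp. 257-264 (8 pages)
To address the poor applicability of any single machine learning algorithm across data classification and prediction tasks in different domains, and to mitigate the impact of severe dataset imbalance on predictive performance, a data classification method based on the synthetic minority over-sampling technique (SMOTE) and an Ensemble model is proposed. The traditional SMOTE algorithm generates new synthetic samples by interpolating between minority-class samples, and the resulting samples can contain noise and be highly similar to one another. An improved SMOTE algorithm is therefore proposed that uses distance calculations to remove noisy and easily confused samples, yielding clean synthetic samples with high discriminability. The Ensemble method is then used to adjust sample and classifier weights and to build a strong classifier with better classification performance. Experiments on the public online-learning dataset Kalboard360 show that, with an extremely randomized trees (ERT) classifier, combining the improved SMOTE with the Ensemble model achieves a prediction accuracy of 97.9%, which is 5.5% higher than a single ERT classifier. This demonstrates that the improved SMOTE algorithm can generate high-quality balanced data and that the ensemble learning model performs significantly better than a single machine learning algorithm.
Keywords: machine learning; neural network; data mining; ensemble learning; data balancing; learning outcome prediction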
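The interpolation step that the abstract above attributes to traditional SMOTE can be sketched roughly as follows. This is a minimal illustration only: the function names and the toy minority set are hypothetical, a single nearest neighbour is used for brevity, and the paper's distance-based removal of noisy and easily confused samples is not reproduced here.

```python
import random

def nearest_neighbor(x, others):
    """Return the point in `others` closest to x (squared Euclidean distance)."""
    return min(others, key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))

def smote_sample(x, neighbor, rng):
    """Return one synthetic point on the line segment between x and neighbor."""
    u = rng.random()  # interpolation factor in [0, 1)
    return [xi + u * (ni - xi) for xi, ni in zip(x, neighbor)]

rng = random.Random(0)                               # fixed seed for repeatability
minority = [[1.0, 2.0], [1.5, 1.8], [3.0, 3.2]]      # toy minority-class samples
x = minority[0]
neighbor = nearest_neighbor(x, minority[1:])         # closest other minority sample
synthetic = smote_sample(x, neighbor, rng)           # new sample between x and neighbor
```

Because each synthetic point lies between two existing minority samples, points generated near class boundaries or near outliers can inherit noise, which is the weakness the abstract says the improved variant targets.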
Automatic Fetal Health Classification Based on the Local Cascade Ensemble Method
2
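Authors 黄梅佳 李宗辉 郑博伟 《Information Technology and Informatization》 2024, No. 4, pp. 122-125 (4 pages)
To better assess fetal intrauterine status automatically, a fetal health classification model based on the local cascade ensemble (LCE) method is proposed. Using a UCI dataset, the ADASYN method is first applied to balance the imbalanced dataset by oversampling, a random forest algorithm is then used for feature selection, and finally the LCE method classifies fetal status automatically. Experimental results show that the proposed model achieves an average accuracy, precision, recall, and F1 score of 0.9554, 0.9054, 0.9557, and 0.9290, respectively, giving better classification than traditional machine learning algorithms and effectively reducing the misclassification rate.
Keywords: machine learning; fetal monitoring; automatic classification; Local Cascade ensemble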
Entropy of deterministic trajectory via trajectories ensemble
3
Authors 彭勇刚 冉翠平 郑雨军 《Chinese Physics B》 SCIE EI CAS CSCD 2024, No. 6, pp. 347-354 (8 pages)
We present a formulation of the single-trajectory entropy using the trajectories ensemble. The single-trajectory entropy is affected by its surrounding trajectories via the distribution function. The single-trajectory entropies are studied in two typical potentials, i.e., the harmonic potential and the double-well potential, and in a viscous environment by the interacting trajectory method. The results of the trajectory methods agree well with those of the numerical methods (Monte Carlo simulation and difference equation). An increase (decrease) in the single-trajectory entropy can be caused by the absorption (emission) of heat from (to) the thermal environment. Some interesting trajectories, which correspond to rare events in the processes, are also demonstrated.
Keywords: trajectory entropy; trajectories ensemble
A redundant subspace weighting procedure for clock ensemble
4
Authors 徐海 陈煜 刘默驰 王玉琢 《Chinese Physics B》 SCIE EI CAS CSCD 2024, No. 4, pp. 435-442 (8 pages)
A redundant-subspace-weighting (RSW)-based approach is proposed to enhance the frequency stability on a time scale of a clock ensemble. In this method, multiple overlapping subspaces are constructed in the clock ensemble, and the weight of each clock in this ensemble is defined by using the spatial covariance matrix. The superimposition average of covariances in different subspaces reduces the correlations between clocks in the same laboratory to some extent. After optimizing the parameters of this weighting procedure, the frequency stabilities of virtual clock ensembles are significantly improved in most cases.
Keywords: weighting method; redundant subspace; clock ensemble; time scale
Securing Cloud-Encrypted Data:Detecting Ransomware-as-a-Service(RaaS)Attacks through Deep Learning Ensemble
5
Authors Amardeep Singh Hamad Ali Abosaq Saad Arif Zohaib Mushtaq Muhammad Irfan Ghulam Abbas Arshad Ali Alanoud AlMazroa 《Computers, Materials & Continua》 SCIE EI 2024, No. 4, pp. 857-873 (17 pages)
Data security assurance is crucial due to the increasing prevalence of cloud computing and its widespread use across different industries, especially in light of the growing number of cybersecurity threats. A major and ever-present threat is Ransomware-as-a-Service (RaaS) attacks, which enable even individuals with minimal technical knowledge to conduct ransomware operations. This study provides a new approach for RaaS attack detection which uses an ensemble of deep learning models. For this purpose, the network intrusion detection dataset "UNSW-NB15" from the Intelligent Security Group of the University of New South Wales, Australia is analyzed. In the initial phase, three separate Multi-Layer Perceptron (MLP) models are developed, based on the rectified linear unit, the scaled exponential linear unit, and the exponential linear unit, respectively. Later, using the combined predictive power of these three MLPs, the RansoDetect Fusion ensemble model is introduced in the suggested methodology. The proposed ensemble technique outperforms previous studies with impressive performance metrics, including 98.79% accuracy and recall, 98.85% precision, and 98.80% F1-score. The empirical results of this study validate the ensemble model's ability to improve cybersecurity defenses by showing that it outperforms individual MLP models. In expanding the field of cybersecurity strategy, this research highlights the significance of combined deep learning models in strengthening intrusion detection systems against sophisticated cyber threats.
Keywords: Cloud encryption; RaaS; ensemble; threat detection; deep learning; cybersecurity
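The three activation functions named in the abstract above, and one common way of fusing several classifiers, can be sketched as follows. The SELU constants are the standard published values; the `soft_vote` helper and the toy probabilities are hypothetical, and averaging class probabilities is an assumption, since the abstract does not say how RansoDetect Fusion combines its three MLPs.

```python
import math

# Standard SELU constants (self-normalizing networks).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805

def relu(x):
    """Rectified linear unit."""
    return max(0.0, x)

def elu(x, alpha=1.0):
    """Exponential linear unit."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def selu(x):
    """Scaled exponential linear unit."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))

def soft_vote(prob_lists):
    """Average per-class probabilities across models; return (winning class, averages)."""
    n_models = len(prob_lists)
    avg = [sum(col) / n_models for col in zip(*prob_lists)]
    return avg.index(max(avg)), avg

# Toy class probabilities from three hypothetical models for one sample.
label, avg = soft_vote([[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]])
```

Here two of the three toy models favour class 1, and the averaged probabilities agree, so `label` is 1.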
An Initial Perturbation Method for the Multiscale Singular Vector in Global Ensemble Prediction
6
Authors Xin LIU Jing CHEN Yongzhu LIU Zhenhua HUO Zhizhen XU Fajing CHEN Jing WANG Yanan MA Yumeng HAN 《Advances in Atmospheric Sciences》 SCIE CAS CSCD 2024, No. 3, pp. 545-563 (19 pages)
Ensemble prediction is widely used to represent the uncertainty of single deterministic Numerical Weather Prediction (NWP) caused by errors in initial conditions (ICs). The traditional Singular Vector (SV) initial perturbation method tends only to capture synoptic scale initial uncertainty rather than mesoscale uncertainty in global ensemble prediction. To address this issue, a multiscale SV initial perturbation method based on the China Meteorological Administration Global Ensemble Prediction System (CMA-GEPS) is proposed to quantify multiscale initial uncertainty. The multiscale SV initial perturbation approach entails calculating multiscale SVs at different resolutions with multiple linearized physical processes to capture fast-growing perturbations from mesoscale to synoptic scale in target areas and combining these SVs by using a Gaussian sampling method with amplitude coefficients to generate initial perturbations. Following that, the energy norm, energy spectrum, and structure of multiscale SVs and their impact on GEPS are analyzed based on a batch experiment in different seasons. The results show that the multiscale SV initial perturbations can possess more energy and capture more mesoscale uncertainties than the traditional single-SV method. Meanwhile, multiscale SV initial perturbations can reflect the strongest dynamical instability in target areas. Their performances in global ensemble prediction when compared to single-scale SVs are shown to (i) improve the relationship between the ensemble spread and the root-mean-square error and (ii) provide a better probability forecast skill for atmospheric circulation during the late forecast period and for short- to medium-range precipitation. This study provides scientific evidence and application foundations for the design and development of a multiscale SV initial perturbation method for the GEPS.
Keywords: multiscale uncertainty; singular vector; initial perturbation; global ensemble prediction system
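The relationship between ensemble spread and root-mean-square error mentioned in the abstract above can be illustrated with a small calculation. This is a generic sketch with made-up numbers, not the verification procedure of CMA-GEPS. The helper names are hypothetical, and spread is taken here as the square root of the grid-averaged sample variance of the members.

```python
import math

def ensemble_mean(members):
    """Point-wise mean over ensemble members (each member is a list of grid values)."""
    return [sum(vals) / len(vals) for vals in zip(*members)]

def ensemble_spread(members):
    """Square root of the grid-averaged sample variance about the ensemble mean."""
    mean = ensemble_mean(members)
    n = len(members)
    variances = [
        sum((m[i] - mean[i]) ** 2 for m in members) / (n - 1)
        for i in range(len(mean))
    ]
    return math.sqrt(sum(variances) / len(variances))

def rmse(forecast, truth):
    """Root-mean-square error of a forecast against the verifying values."""
    return math.sqrt(sum((f - t) ** 2 for f, t in zip(forecast, truth)) / len(truth))

members = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]  # toy 3-member ensemble
truth = [2.5, 2.5, 4.5]                                        # toy verifying analysis
mean = ensemble_mean(members)
spread = ensemble_spread(members)
error = rmse(mean, truth)
```

In this toy case the spread (1.0) is larger than the error of the ensemble mean (0.5), which would indicate an over-dispersive ensemble; a commonly used rule of thumb is that the two should be of similar size over many cases.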
Application of multi-algorithm ensemble methods in high-dimensional and small-sample data of geotechnical engineering:A case study of swelling pressure of expansive soils
7
Authors Chao Li Lei Wang Jie Li Yang Chen 《Journal of Rock Mechanics and Geotechnical Engineering》 SCIE CSCD 2024, No. 5, pp. 1896-1917 (22 pages)
Geotechnical engineering data are usually small-sample and high-dimensional, which brings many challenges in predictive modeling. This paper uses a typical high-dimensional and small-sample swell pressure (P_s) dataset to explore the possibility of using multi-algorithm hybrid ensemble and dimensionality reduction methods to mitigate the uncertainty of soil parameter prediction. Based on six machine learning (ML) algorithms, the base learner pool is constructed, and four ensemble methods, Stacking (SG), Blending (BG), Voting regression (VR), and Feature weight linear stacking (FWL), are used for the multi-algorithm ensemble. Furthermore, permutation importance is used for feature dimensionality reduction to mitigate the impact of weakly correlated variables on predictive modeling. The results show that the proposed methods are superior to traditional prediction models and base ML models, where FWL is more suitable for modeling with small-sample datasets, and dimensionality reduction can simplify the data structure and reduce the adverse impact of the small-sample effect, which points the way to feature selection for predictive modeling. Based on the ensemble methods, the feature importance of the five primary factors affecting P_s is the maximum dry density (31.145%), clay fraction (15.876%), swell percent (15.289%), plasticity index (14%), and optimum moisture content (13.69%). The influence of input parameters on P_s is also investigated and is in line with the findings of the existing literature.
Keywords: Expansive soils; Swelling pressure; Machine learning (ML); Multi-algorithm ensemble; Sensitivity analysis
Neural network study of the nuclear ground-state spin distribution within a random interaction ensemble
8
Authors Deng Liu Alam Noor A Zhen-Zhen Qin Yang Lei 《Nuclear Science and Techniques》 SCIE EI CAS CSCD 2024, No. 3, pp. 216-227 (12 pages)
The distribution of the nuclear ground-state spin in a two-body random ensemble (TBRE) was studied using a general classification neural network (NN) model with two-body interaction matrix elements as input features and the corresponding ground-state spins as labels or output predictions. The quantum many-body system problem exceeds the capability of our optimized NNs in terms of accurately predicting the ground-state spin of each sample within the TBRE. However, our NN model effectively captured the statistical properties of the ground-state spin because it learned the empirical regularity of the ground-state spin distribution in TBRE, as discovered by physicists.
Keywords: Neural network; Two-body random ensemble; Spin distribution of nuclear ground state
Improving Thyroid Disorder Diagnosis via Ensemble Stacking and Bidirectional Feature Selection
9
Authors Muhammad Armghan Latif Zohaib Mushtaq Saad Arif Sara Rehman Muhammad Farrukh Qureshi Nagwan Abdel Samee Maali Alabdulhafith Yeong Hyeon Gu Mohammed A. Al-masni 《Computers, Materials & Continua》 SCIE EI 2024, No. 3, pp. 4225-4241 (17 pages)
Thyroid disorders represent a significant global health challenge, with hypothyroidism and hyperthyroidism as two common conditions arising from dysfunction in the thyroid gland. Accurate and timely diagnosis of these disorders is crucial for effective treatment and patient care. This research introduces a comprehensive approach to improve the accuracy of thyroid disorder diagnosis through the integration of ensemble stacking and advanced feature selection techniques. Sequential forward feature selection, sequential backward feature elimination, and bidirectional feature elimination are investigated in this study. In ensemble learning, random forest, adaptive boosting, and bagging classifiers are employed. The effectiveness of these techniques is evaluated using two different datasets obtained from the University of California Irvine Machine Learning Repository, both of which undergo preprocessing steps, including outlier removal, addressing missing data, data cleansing, and feature reduction. Extensive experimentation demonstrates the remarkable success of the proposed ensemble stacking and bidirectional feature elimination, achieving 100% and 99.86% accuracy in identifying hyperthyroidism and hypothyroidism, respectively. Beyond enhancing detection accuracy, the ensemble stacking model also demonstrated a streamlined computational complexity, which is pivotal for practical medical applications. It significantly outperformed existing studies with similar objectives, underscoring the viability and effectiveness of the proposed scheme. This research offers an innovative perspective and sets the platform for improved thyroid disorder diagnosis with broader implications for healthcare and patient well-being.
Keywords: ensemble learning; random forests; boosting; dimensionality reduction; machine learning; smart healthcare; computer aided diagnosis
Improving Channel Estimation in a NOMA Modulation Environment Based on Ensemble Learning
10
Authors Lassaad K. Smirani Leila Jamel Latifah Almuqren 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024, No. 8, pp. 1315-1337 (23 pages)
This study presents a layered generalization ensemble model for next generation radio mobiles, focusing on supervised channel estimation approaches. Channel estimation typically involves the insertion of pilot symbols with a well-balanced rhythm and suitable layout. The model, called Stacked Generalization for Channel Estimation (SGCE), aims to enhance channel estimation performance by eliminating pilot insertion and improving throughput. The SGCE model incorporates six machine learning methods: random forest (RF), gradient boosting machine (GB), light gradient boosting machine (LGBM), support vector regression (SVR), extremely randomized tree (ERT), and extreme gradient boosting (XGB). By generating meta-data from five models (RF, GB, LGBM, SVR, and ERT), we ensure accurate channel coefficient predictions using the XGB model. To validate the modeling performance, we employ the leave-one-out cross-validation (LOOCV) approach, where each observation serves as the validation set while the remaining observations act as the training set. SGCE's results demonstrate higher mean and median accuracy compared to the separate models. SGCE achieves an average accuracy of 98.4%, precision of 98.1%, and the highest F1-score of 98.5%, accurately predicting channel coefficients. Furthermore, our proposed method outperforms prior traditional and intelligent techniques in terms of throughput and bit error rate. SGCE's superior performance highlights its efficacy in optimizing channel estimation. It can effectively predict channel coefficients and contribute to enhancing the overall efficiency of radio mobile systems. Through extensive experimentation and evaluation, we demonstrate that SGCE improved performance in channel estimation, surpassing previous techniques. Accordingly, SGCE's capabilities have significant implications for optimizing channel estimation in modern communication systems.
Keywords: Stacked generalization; ensemble learning; Non-Orthogonal Multiple Access (NOMA); channel estimation; 5G
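The leave-one-out cross-validation scheme described in the abstract above, where each observation is held out once while the rest form the training set, can be sketched as follows. The helper names are hypothetical, and the "model" is a placeholder that simply predicts the training mean; it stands in for the SGCE stack, which is not reproduced here.

```python
def leave_one_out(n_samples):
    """Yield (train_indices, validation_index) pairs, one per observation."""
    for i in range(n_samples):
        train = [j for j in range(n_samples) if j != i]
        yield train, i

def loocv_mean_abs_error(ys):
    """LOOCV error of a placeholder model that predicts the training-set mean."""
    errors = []
    for train_idx, val_idx in leave_one_out(len(ys)):
        prediction = sum(ys[j] for j in train_idx) / len(train_idx)  # placeholder model
        errors.append(abs(ys[val_idx] - prediction))
    return sum(errors) / len(errors)

splits = list(leave_one_out(3))
score = loocv_mean_abs_error([1.0, 2.0, 6.0])
```

With n observations this trains the model n times, so LOOCV is usually practical only for small datasets or cheap models.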
Growth and Interactions of Multi-Source Perturbations in Convection-Allowing Ensemble Forecasts
11
Authors 张璐 闵锦忠 庄潇然 王世璋 魏莉青 《Journal of Tropical Meteorology》 SCIE 2024, No. 2, pp. 118-131 (14 pages)
This study investigated the growth of forecast errors stemming from initial conditions (ICs), lateral boundary conditions (LBCs), and model (MO) perturbations, as well as their interactions, by conducting seven 36 h convection-allowing ensemble forecast (CAEF) experiments. Two cases, one with strong forcing (SF) and the other with weak forcing (WF), which occurred over the Yangtze-Huai River basin (YHRB) in East China, were selected to examine the sources of uncertainties associated with perturbation growth under varying forcing backgrounds and the influence of these backgrounds on growth. The perturbations exhibited distinct characteristics in terms of temporal evolution, spatial propagation, and vertical distribution under different forcing backgrounds, indicating a dependence between perturbation growth and forcing background. A comparison of the perturbation growth in different precipitation areas revealed that IC and LBC perturbations were significantly influenced by the location of precipitation in the SF case, while MO perturbations were more responsive to convection triggering and dominated in the WF case. The vertical distribution of perturbations showed that the sources of uncertainties and the performance of perturbations varied between SF and WF cases, with LBC perturbations displaying notable case dependence. Furthermore, the interactions between perturbations were considered by exploring the added values of different source perturbations. For the SF case, the added values of IC, LBC, and MO perturbations were reflected in different forecast periods and different source uncertainties, suggesting that the combination of multi-source perturbations can yield positive interactions. In the WF case, MO perturbations provided a more accurate estimation of uncertainties downstream of the Dabie Mountain and need to be prioritized in the research on perturbation development.
Keywords: convection-allowing ensemble forecast; forcing background; perturbation growth; interactions; added value
A Novel Hybrid Ensemble Learning Approach for Enhancing Accuracy and Sustainability in Wind Power Forecasting
12
Authors Farhan Ullah Xuexia Zhang Mansoor Khan Muhammad Abid Abdullah Mohamed 《Computers, Materials & Continua》 SCIE EI 2024, No. 5, pp. 3373-3395 (23 pages)
Accurate wind power forecasting is critical for system integration and stability as renewable energy reliance grows. Traditional approaches frequently struggle with complex data and non-linear connections. This article presents a novel approach for hybrid ensemble learning that is based on rigorous requirements engineering concepts. The approach finds significant parameters influencing forecasting accuracy by evaluating real-time Modern-Era Retrospective Analysis for Research and Applications (MERRA2) data from several European wind farms using in-depth stakeholder research and requirements elicitation. Ensemble learning is used to develop a robust model, while a temporal convolutional network handles time-series complexities and data gaps. The ensemble-temporal neural network is enhanced by providing different input parameters, including training layers, hidden and dropout layers, along with activation and loss functions. The proposed framework is further analyzed by comparison with state-of-the-art forecasting models in terms of Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). The energy efficiency performance indicators showed that the proposed model demonstrates error reduction percentages of approximately 16.67%, 28.57%, and 81.92% for MAE, and 38.46%, 17.65%, and 90.78% for RMSE for MERRA Wind farms 1, 2, and 3, respectively, compared to other existing methods. These quantitative results show the effectiveness of our proposed model, with MAE values ranging from 0.0010 to 0.0156 and RMSE values ranging from 0.0014 to 0.0174. This work highlights the effectiveness of requirements engineering in wind power forecasting, leading to enhanced forecast accuracy and grid stability, ultimately paving the way for more sustainable energy solutions.
Keywords: ensemble learning; machine learning; real-time data analysis; stakeholder analysis; temporal convolutional network; wind power forecasting
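The MAE and RMSE metrics and the "error reduction percentage" quoted in the abstract above can be computed along the following lines. The reduction formula (relative decrease against a baseline error) is an assumption, since the abstract does not define it, and the toy values below are illustrative only and are not taken from the paper.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def error_reduction_pct(baseline_error, model_error):
    """Relative error reduction versus a baseline, in percent (assumed definition)."""
    return 100.0 * (baseline_error - model_error) / baseline_error

y_true = [1.0, 2.0, 3.0]          # toy observations
baseline_pred = [2.0, 2.0, 2.0]   # toy baseline forecast
model_pred = [1.5, 2.0, 2.5]      # toy improved forecast

mae_reduction = error_reduction_pct(mae(y_true, baseline_pred), mae(y_true, model_pred))
rmse_reduction = error_reduction_pct(rmse(y_true, baseline_pred), rmse(y_true, model_pred))
```

In this toy case the improved forecast halves every error, so both reductions come out at about 50%.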
An Enhanced Ensemble-Based Long Short-Term Memory Approach for Traffic Volume Prediction
13
Authors Duy Quang Tran Huy Q. Tran Minh Van Nguyen 《Computers, Materials & Continua》 SCIE EI 2024, No. 3, pp. 3585-3602 (18 pages)
With the advancement of artificial intelligence, traffic forecasting is gaining more and more interest in optimizing route planning and enhancing service quality. Traffic volume is an influential parameter for planning and operating traffic structures. This study proposed an improved ensemble-based deep learning method to solve traffic volume prediction problems. A set of optimal hyperparameters is also applied for the suggested approach to improve the performance of the learning process. The fusion of these methodologies aims to harness ensemble empirical mode decomposition's capacity to discern complex traffic patterns and long short-term memory's proficiency in learning temporal relationships. Firstly, a dataset for automatic vehicle identification is obtained and utilized in the preprocessing stage of the ensemble empirical mode decomposition model. The second aspect involves predicting traffic volume using the long short-term memory algorithm. Next, the study employs a trial-and-error approach to select a set of optimal hyperparameters, including the lookback window, the number of neurons in the hidden layers, and the gradient descent optimization. Finally, the fusion of the obtained results leads to a final traffic volume prediction. The experimental results show that the proposed method outperforms other benchmarks regarding various evaluation measures, including mean absolute error, root mean squared error, mean absolute percentage error, and R-squared. The achieved R-squared value reaches an impressive 98%, while the other evaluation indices surpass those of the competing methods. These findings highlight the accuracy of traffic pattern prediction. Consequently, this offers promising prospects for enhancing transportation management systems and urban infrastructure planning.
Keywords: ensemble empirical mode decomposition; traffic volume prediction; long short-term memory; optimal hyperparameters; deep learning
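The "lookback window" hyperparameter mentioned in the abstract above controls how many past observations are fed to a sequence model for each prediction. A minimal sketch of turning a series into supervised (window, next value) pairs is shown below. The function name and the toy traffic counts are hypothetical, and neither the decomposition step nor the LSTM itself is included.

```python
def make_windows(series, lookback):
    """Split a series into (input window, next value) training pairs."""
    inputs, targets = [], []
    for start in range(len(series) - lookback):
        inputs.append(series[start:start + lookback])   # the last `lookback` values
        targets.append(series[start + lookback])        # the value to predict
    return inputs, targets

volumes = [10, 12, 15, 14, 18]            # toy traffic counts per interval
X, y = make_windows(volumes, lookback=3)
```

A larger lookback gives the model more context per sample but leaves fewer training pairs, which is one reason this value is typically tuned rather than fixed in advance.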
Deep-Ensemble Learning Method for Solar Resource Assessment of Complex Terrain Landscapes
14
Authors Lifeng Li Zaimin Yang Xiongping Yang Jiaming Li Qianyufan Zhou Ping Yang 《Energy Engineering》 EI 2024, No. 5, pp. 1329-1346 (18 pages)
As the global demand for renewable energy grows, solar energy is gaining attention as a clean, sustainable energy source. Accurate assessment of solar energy resources is crucial for the siting and design of photovoltaic power plants. This study proposes an integrated deep learning-based photovoltaic resource assessment method. Ensemble learning and deep learning methods are fused for photovoltaic resource assessment for the first time. The proposed method combines the random forest, gated recurrent unit, and long short-term memory to effectively improve the accuracy and reliability of photovoltaic resource assessment. The proposed method has strong adaptability and high accuracy even in the photovoltaic resource assessment of complex terrain and landscape. The experimental results show that the proposed method outperforms the comparison algorithms in all evaluation indexes, indicating that the proposed method has higher accuracy and reliability in photovoltaic resource assessment, with improved generalization performance over a traditional single algorithm.
Keywords: Photovoltaic resource assessment; deep learning; ensemble learning; random forest; gated recurrent unit; long short-term memory
ABMRF:An Ensemble Model for Author Profiling Based on Stylistic Features Using Roman Urdu
15
Authors Aiman Muhammad Arshad Bilal Khan Khalil Khan Ali Mustafa Qamar Rehan Ullah Khan 《Intelligent Automation & Soft Computing》 2024, No. 2, pp. 301-317 (17 pages)
This study explores the area of Author Profiling (AP) and its importance in several industries, including forensics, security, marketing, and education. A key component of AP is the extraction of useful information from text, with an emphasis on the writers' ages and genders. To improve the accuracy of AP tasks, the study develops an ensemble model dubbed ABMRF that combines AdaBoostM1 (ABM1) and Random Forest (RF). The work uses an extensive technique that involves text message dataset preprocessing, model training, and assessment. Several machine learning (ML) algorithms are compared on their effectiveness in classifying age and gender, including Composite Hypercube on Random Projection (CHIRP), Decision Trees (J48), Naïve Bayes (NB), K Nearest Neighbor, AdaBoostM1, NB-Updatable, RF, and ABMRF. The findings demonstrate that ABMRF regularly beats the competition, with a gender classification accuracy of 71.14% and an age classification accuracy of 54.29%. Additional metrics like precision, recall, F-measure, Matthews Correlation Coefficient (MCC), and accuracy support ABMRF's outstanding performance in age and gender profiling tasks. This study demonstrates the usefulness of ABMRF as an ensemble model for author profiling and highlights its possible uses in marketing, law enforcement, and education. The results emphasize the effectiveness of ensemble approaches in enhancing author profiling task accuracy, particularly when it comes to age and gender identification.
Keywords: Machine learning; author profiling; AdaBoostM1; random forest; ensemble learning; text classification
Classification of Conversational Sentences Using an Ensemble Pre-Trained Language Model with the Fine-Tuned Parameter
16
Authors R. Sujatha K. Nimala 《Computers, Materials & Continua》 SCIE EI 2024, No. 2, pp. 1669-1686 (18 pages)
Sentence classification is the process of categorizing a sentence based on the context of the sentence. Sentence categorization requires more semantic highlights than other tasks, such as dependence parsing, which requires more syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing the progress and comparing impacts. An ensemble pre-trained language model was taken up here to classify the conversation sentences from the conversation corpus. The conversational sentences are classified into four categories: information, question, directive, and commission. These classification label sequences are for analyzing the conversation progress and predicting the pecking order of the conversation. An ensemble of Bidirectional Encoder for Representation of Transformer (BERT), Robustly Optimized BERT pretraining Approach (RoBERTa), Generative Pre-Trained Transformer (GPT), DistilBERT and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus with hyperparameters. A hyperparameter tuning approach is carried out for better performance on sentence classification. This Ensemble of Pre-trained Language Models with a Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT and XLNet transformer models. The proposed ensemble model with the fine-tuned parameters achieved an F1_score of 0.88.
Keywords: Bidirectional encoder for representation of transformer; conversation; ensemble model; fine-tuning; generalized autoregressive pretraining for language understanding; generative pre-trained transformer; hyperparameter tuning; natural language processing; robustly optimized BERT pretraining approach; sentence classification; transformer models
Browser Fingerprint Recognition Based on an Improved Self-paced Ensemble Algorithm
17
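Authors 张德升 陈博 张建辉 卜佑军 孙重鑫 孙嘉 《Computer Science》 CSCD PKU Core 2023, No. 7, pp. 317-324 (8 pages)
Browser fingerprinting, with advantages such as statelessness and cross-domain consistency, has been adopted by many websites for user tracking, advertisement delivery, and security verification. Browser fingerprint recognition is a typical classification task on imbalanced data. To address the low recognition accuracy and easily broken long-term tracking caused by class imbalance in the data samples during long-term browser fingerprint tracking, an improved Self-paced Ensemble (Improved SPE, ISPE) method is proposed and applied to browser fingerprint recognition. The under-sampling of browser fingerprint samples and the training of the individual classifiers in the ensemble are both improved. Focusing on hard-to-recognize browser fingerprints, a class attention mechanism is added and the self-paced factor is optimized, so that the classifier pays more attention to the classification of boundary samples during training and recognition, thereby improving overall recognition accuracy. Experiments on 3,483 collected fingerprints and 15,000 fingerprints from an open-source dataset show that the ISPE algorithm achieves an F1-score of 95.6% in browser fingerprint matching and recognition, 16.8% higher than the Bi-RNN algorithm.
Keywords: browser fingerprint; user tracking; Self-paced ensemble; under-sampling; ensemble learning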
On the Influences of Urbanization on the Extreme Rainfall over Zhengzhou on 20 July 2021: A Convection-Permitting Ensemble Modeling Study (cited by: 5)
18
作者 Yali LUO Jiahua ZHANG +5 位作者 Miao YU Xudong LIANG Rudi XIA Yanyu GAO Xiaoyu GAO Jinfang YIN 《Advances in Atmospheric Sciences》 SCIE CAS CSCD 2023年第3期393-409,共17页
This study investigates the influences of urban land cover on the extreme rainfall event over the Zhengzhou city in central China on 20 July 2021 using the Weather Research and Forecasting model at a convection-permit... This study investigates the influences of urban land cover on the extreme rainfall event over the Zhengzhou city in central China on 20 July 2021 using the Weather Research and Forecasting model at a convection-permitting scale[1-km resolution in the innermost domain(d3)].Two ensembles of simulation(CTRL,NURB),each consisting of 11 members with a multi-layer urban canopy model and various combinations of physics schemes,were conducted using different land cover scenarios:(i)the real urban land cover,(ii)all cities in d3 being replaced with natural land cover.The results suggest that CTRL reasonably reproduces the spatiotemporal evolution of rainstorms and the 24-h rainfall accumulation over the key region,although the maximum hourly rainfall is underestimated and displaced to the west or southwest by most members.The ensemble mean 24-h rainfall accumulation over the key region of heavy rainfall is reduced by 13%,and the maximum hourly rainfall simulated by each member is reduced by 15–70 mm in CTRL relative to NURB.The reduction in the simulated rainfall by urbanization is closely associated with numerous cities/towns to the south,southeast,and east of Zhengzhou.Their heating effects jointly lead to formation of anomalous upward motions in and above the planetary boundary layer(PBL),which exaggerates the PBL drying effect due to reduced evapotranspiration and also enhances the wind stilling effect due to increased surface friction in urban areas.As a result,the lateral inflows of moisture and high-θe(equivalent potential temperature)air from south and east to Zhengzhou are reduced. 展开更多
Keywords: urbanization; extreme rainfall; convection-permitting ensemble simulation; land-atmosphere interaction; boundary layer; water vapor transport
Ensemble learning prediction of soybean yields in China based on meteorological data (cited by: 1)
19
Authors: LI Qian-chuan, XU Shi-wei, ZHUANG Jia-yu, LIU Jia-jia, ZHOU Yi, ZHANG Ze-xi. Journal of Integrative Agriculture (SCIE, CAS, CSCD), 2023, No. 6, pp. 1909-1927, 19 pages.
The accurate prediction of soybean yield is of great significance for agricultural production, monitoring and early warning. Although previous studies have used machine learning algorithms to predict soybean yield based on meteorological data, it is not clear how different models can be used to effectively separate soybean meteorological yield from soybean yield in various regions. In addition, comprehensively integrating the advantages of various machine learning algorithms to improve the prediction accuracy through ensemble learning algorithms has not been studied in depth. This study used and analyzed various daily meteorological data and soybean yield data from 173 county-level administrative regions and meteorological stations in two principal soybean planting areas in China (Northeast China and the Huang–Huai region), covering 34 years. Three effective machine learning algorithms (K-nearest neighbor, random forest, and support vector regression) were adopted as the base models to establish a high-precision and highly reliable soybean meteorological yield prediction model based on the stacking ensemble learning framework. The model's generalizability was further improved through 5-fold cross-validation, and the model was optimized by principal component analysis and hyperparameter optimization. The accuracy of the model was evaluated using the five-year sliding prediction and four regression indicators for the 173 counties, which showed that the stacking model has higher accuracy and stronger robustness. The 5-year sliding estimations of soybean yield based on the stacking model in 173 counties showed that the prediction effect can reflect the spatiotemporal distribution of soybean yield in detail, and the mean absolute percentage error (MAPE) was less than 5%. The stacking prediction model of soybean meteorological yield provides a new approach for accurately predicting soybean yield.
Keywords: meteorological factors; ensemble learning; crop yield prediction; machine learning; county-level
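The abstract above reports accuracy as a mean absolute percentage error (MAPE) below 5%. As a reading aid, here is a minimal sketch of how MAPE is conventionally computed. The function name `mape` and the yield values are illustrative assumptions, not data or code from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical county yields (t/ha); invented numbers for illustration only.
observed = [2.0, 1.8, 2.2, 1.6]
forecast = [1.9, 1.89, 2.31, 1.6]

score = mape(observed, forecast)
print(f"MAPE = {score:.2f}%")
```

Because each error is divided by the observed value, MAPE is scale-free, which makes it convenient for comparing counties with very different yield levels.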
Ensemble Bayesian method for parameter distribution inference: application to reactor physics (cited by: 1)
20
Authors: Jia-Qin Zeng, Hai-Xiang Zhang, He-Lin Gong, Ying-Ting Luo. Nuclear Science and Techniques (SCIE, EI, CAS, CSCD), 2023, No. 12, pp. 216-228, 13 pages.
The estimation of model parameters is an important subject in engineering. In this area of work, the prevailing approach is to estimate or calculate these as deterministic parameters. In this study, we consider the model parameters from the perspective of random variables and describe the general form of the parameter distribution inference problem. Under this framework, we propose an ensemble Bayesian method by introducing Bayesian inference and the Markov chain Monte Carlo (MCMC) method. Experiments on a finite cylindrical reactor and a 2D IAEA benchmark problem show that the proposed method converges quickly and can estimate parameters effectively, even for several correlated parameters simultaneously. Our experiments include cases of engineering software calls, demonstrating that the method can be applied to engineering, such as nuclear reactor engineering.
Keywords: model parameters; Bayesian inference; frequency distribution; ensemble Bayesian method; KL divergence
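The abstract above treats a model parameter as a random variable and infers its distribution with Bayesian inference and MCMC. The sketch below shows only the generic building block, a random-walk Metropolis-Hastings sampler for a single parameter. It is not the authors' ensemble Bayesian method. The Gaussian likelihood with known `sigma`, the flat prior, the toy data, and all function names are assumptions chosen for illustration.

```python
import math
import random
import statistics


def log_posterior(mu, data, sigma):
    """Log posterior of mu up to a constant: Gaussian likelihood, flat prior."""
    return -sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)


def metropolis_hastings(data, sigma, n_steps=20000, step=0.2, start=0.0, seed=0):
    """Random-walk Metropolis-Hastings chain for the parameter mu."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    mu = start
    logp = log_posterior(mu, data, sigma)
    chain = []
    for _ in range(n_steps):
        proposal = mu + rng.gauss(0.0, step)  # symmetric Gaussian proposal
        logp_new = log_posterior(proposal, data, sigma)
        # Accept with probability min(1, p_new / p_old).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            mu, logp = proposal, logp_new
        chain.append(mu)
    return chain


# Toy measurements (invented); their sample mean is 2.1.
data = [1.8, 2.1, 2.4, 1.9, 2.3, 2.0, 2.2, 1.7, 2.5, 2.1]
sigma = 0.3  # assumed known measurement noise

chain = metropolis_hastings(data, sigma)
samples = chain[2000:]  # discard burn-in
post_mean = statistics.mean(samples)
post_sd = statistics.pstdev(samples)
print(f"posterior mean ~ {post_mean:.3f}, posterior sd ~ {post_sd:.3f}")
```

For this toy model the exact posterior is Gaussian with mean equal to the sample mean and standard deviation `sigma / sqrt(n)`, so the chain's output can be checked against a known answer. Real parameter-inference problems rarely offer such a check, which is why convergence diagnostics matter.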