45 articles found
A systematic machine learning method for reservoir identification and production prediction (Cited by: 3)
1
Authors: Wei Liu, Zhangxin Chen, Yuan Hu, Liuyang Xu. 《Petroleum Science》 (SCIE, EI, CAS, CSCD), 2023, Issue 1, pp. 295-308 (14 pages)
Reservoir identification and production prediction are two of the most important tasks in petroleum exploration and development. Machine learning (ML) methods are used in petroleum-related studies, but they have not been applied to reservoir identification or to production prediction based on reservoir identification. Production forecasting studies are typically based on overall reservoir thickness and lack accuracy when reservoirs contain a water or dry layer without oil production. In this paper, a systematic ML method was developed using classification models for reservoir identification and regression models for production prediction. The production models are based on the reservoir identification results. For reservoir identification, seven optimized ML methods were used: four typical single ML methods and three ensemble ML methods. These methods classify the reservoir into five types of layers: water, dry, and three levels of oil (I oil layer, II oil layer, III oil layer). The validation and test results suggest that the three ensemble methods perform better than the four single ML methods in reservoir identification. XGBoost produced the model with the highest accuracy, up to 99%. The effective thickness of the I and II oil layers determined during reservoir identification was fed into the models for predicting production. Effective thickness accounts for the distribution of water and oil, giving a more reasonable production prediction than predictions based on overall reservoir thickness. To validate the superiority of the ML methods, reference models using overall reservoir thickness were built for comparison. The models based on effective thickness outperformed the reference models in every evaluation metric. The prediction accuracy of the ML models using effective thickness was 10% higher than that of the reference models. Free of the personal error and data distortion found in traditional methods, the system analyzes data rapidly and reduces the time required for reservoir classification and production prediction. The ML models using the effective thickness obtained from reservoir identification predicted oil production more accurately than previous studies that used overall reservoir thickness.
Keywords: Reservoir identification; Production prediction; Machine learning; Ensemble method
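The difference between effective thickness and overall thickness described in this abstract can be illustrated with a small sketch. The layer records, class labels as strings, and function names below are hypothetical and are not taken from the paper; the only idea borrowed from the abstract is that only I and II oil layers count toward effective thickness.

```python
# Hypothetical layer records: (thickness in metres, predicted layer class).
# Class names follow the five layer types in the abstract; the values are made up.
EFFECTIVE_CLASSES = {"I oil", "II oil"}


def effective_thickness(layers):
    """Sum the thickness of layers classified as I or II oil layers."""
    return sum(thickness for thickness, label in layers if label in EFFECTIVE_CLASSES)


def overall_thickness(layers):
    """Sum the thickness of every layer, regardless of its class."""
    return sum(thickness for thickness, _ in layers)


well = [(2.0, "water"), (3.5, "I oil"), (1.5, "dry"), (4.0, "II oil"), (2.5, "III oil")]
print(effective_thickness(well), overall_thickness(well))
```

In this toy well the water, dry, and III oil layers are excluded from the regression input, which is the reason the abstract gives for the better production predictions.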
Using Hybrid and Diversity-Based Adaptive Ensemble Method for Binary Classification
2
Authors: Xing Fan, Chung-Horng Lung, Samuel A. Ajila. 《International Journal of Intelligence Science》, 2018, Issue 3, pp. 43-74 (32 pages)
This paper proposes an adaptive and diverse hybrid ensemble method to improve the performance of binary classification. The proposed method combines base models non-linearly and adaptively selects the most suitable model for each data instance. The ensemble method, an important machine learning technique, uses multiple single models to construct a hybrid model, which generally performs better than a single individual model. On a given dataset, diverse single models trained with different machine learning algorithms differ in their ability to recognize patterns in the training sample. The proposed approach was validated on the Repeat Buyers Prediction dataset and the Census Income Prediction dataset. The experimental results indicate up to 18.5% improvement in F1 score on the Repeat Buyers dataset compared with the best individual model. This improvement also indicates that the proposed ensemble method deals exceptionally well with imbalanced datasets. In addition, the proposed method outperforms two other commonly used ensemble methods (Averaging and Stacking) in F1 score. Finally, the results give a slightly higher AUC score of 0.718, compared with the previous AUC score of 0.712 in the Repeat Buyers competition. This roughly 1% increase in AUC is significant for a dataset as large as Repeat Buyers.
Keywords: Big data analytics; Machine learning; Adaptive ensemble methods; Binary classification
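The abstract does not say how the most suitable model is chosen for each instance. One common way to do per-instance selection, shown here purely as an assumption, is to pick the model that is most accurate on the validation points nearest to the query. The one-dimensional data, the two toy models, and every function name are hypothetical.

```python
def model_a(x):
    """Toy classifier that is correct in the low region of the data."""
    return int(x > 2)


def model_b(x):
    """Toy classifier that is correct in the high region of the data."""
    return int(x > 8)


def select_and_predict(x, models, val_x, val_y, k=3):
    """Predict with the model that is most accurate on the k nearest validation points."""
    nearest = sorted(range(len(val_x)), key=lambda i: abs(val_x[i] - x))[:k]

    def local_accuracy(model):
        return sum(model(val_x[i]) == val_y[i] for i in nearest) / k

    best = max(models, key=local_accuracy)
    return best(x)


# Validation data: labels follow model_a below 5 and model_b above 5.
val_x = [0, 1, 2, 3, 4, 6, 7, 8, 9, 10]
val_y = [0, 0, 0, 1, 1, 0, 0, 0, 1, 1]

print(select_and_predict(3.2, [model_a, model_b], val_x, val_y))
print(select_and_predict(7.2, [model_a, model_b], val_x, val_y))
```

Neither toy model is right everywhere, but selecting per instance lets the ensemble follow whichever model fits the local region.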
Time-sensitive prediction of NO₂ concentration in China using an ensemble machine learning model from multi-source data (Cited by: 2)
3
Authors: Chenliang Tao, Man Jia, Guoqiang Wang, Yuqiang Zhang, Qingzhu Zhang, Xianfeng Wang, Qiao Wang, Wenxing Wang. 《Journal of Environmental Sciences》 (SCIE, EI, CAS, CSCD), 2024, Issue 3, pp. 30-40 (11 pages)
Nitrogen dioxide (NO₂) poses a critical potential risk to environmental quality and public health. A reliable machine learning (ML) forecasting framework can provide valuable information to support government decision-making. Based on data from 1609 air quality monitors across China from 2014 to 2020, this study designed an ensemble ML model that integrates multiple types of spatial-temporal variables and three sub-models for time-sensitive prediction over a wide range. The ensemble ML model adds a residual connection to the gated recurrent unit (GRU) network and combines the advantages of the Transformer, extreme gradient boosting (XGBoost), and the GRU with residual connection, resulting in a 4.1% ± 1.0% lower root mean square error than XGBoost on the test results. The ensemble model shows strong prediction performance, with coefficients of determination of 0.91, 0.86, and 0.77 for 1-hr, 3-hr, and 24-hr averages on the test results, respectively. In particular, the model performs well, with low spatial uncertainty, in Central, East, and North China, the major site-dense zones. Through an interpretability analysis based on the Shapley value at different temporal resolutions, we found that atmospheric chemical processes contribute more to hourly predictions than to daily-scale predictions, while meteorological conditions have a more prominent impact on the latter. Compared with existing models for different spatiotemporal scales, the present model can be implemented at any air quality monitoring station across China to provide rapid and dependable forecasts of NO₂, which will help in developing effective control policies.
Keywords: Air quality prediction; Deep learning; Ensemble method; Nitrogen dioxide; Spatiotemporal covariates
Strip flatness prediction of cold rolling based on ensemble methods (Cited by: 1)
4
Authors: Wu-quan Yang, Zhi-ting Zhao, Liang-yu Zhu, Xun-yang Gao, Li Wang. 《Journal of Iron and Steel Research International》 (SCIE, EI, CAS, CSCD), 2024, Issue 1, pp. 237-251 (15 pages)
To address the insufficient prediction accuracy of strip flatness at the outlet of cold tandem rolling, the prediction performance of different ensemble methods was studied and a high-precision ensemble model for predicting strip flatness at the outlet was established. First, based on linear regression (LR), K nearest neighbors (KNN), support vector regression, regression trees (RT), and a backpropagation neural network (BPN), the bagging, boosting, and stacking ensemble methods were used in ensemble experiments. Second, three existing ensemble models, namely random forest, extreme random tree (ET), and extreme gradient boosting, were used in experiments and the results were compared. The research shows that bagging, boosting, and stacking improve the prediction accuracy of the regression trees model most significantly, by 5.28%, 6.51%, and 5.32%, respectively. The stacking ensemble method improves both simple and complex models; the improvement is greatest for the simple base model, at 4.69% over the base model KNN. Among all the ensemble models, the stacking ensemble of level-1 (ET, AdaBoost-RT, LR, BPN) paired with level-2 (LR) was found to be the best model (EALB-LR) and can be studied further for industrial applications.
Keywords: Tandem cold rolling; Flatness prediction; Machine learning; Ensemble method
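The level-1 / level-2 structure of stacking mentioned in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the EALB-LR model: the two level-1 prediction vectors and the targets are made-up numbers, the level-2 learner is a linear combination without intercept fitted by least squares, and a real stacking setup would fit level 2 on out-of-fold predictions.

```python
def fit_level2(p1, p2, y):
    """Least-squares weights (w1, w2) minimising sum((w1*p1 + w2*p2 - y)^2)."""
    a11 = sum(a * a for a in p1)
    a12 = sum(a * b for a, b in zip(p1, p2))
    a22 = sum(b * b for b in p2)
    b1 = sum(a * t for a, t in zip(p1, y))
    b2 = sum(b * t for b, t in zip(p2, y))
    det = a11 * a22 - a12 * a12  # non-zero when p1 and p2 are not collinear
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


def mse(pred, y):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)


# Hypothetical hold-out targets and predictions from two level-1 models.
y_holdout = [1.0, 2.0, 3.0, 4.0]
pred_model_1 = [1.2, 1.8, 3.3, 3.9]
pred_model_2 = [0.7, 2.4, 2.8, 4.2]

w1, w2 = fit_level2(pred_model_1, pred_model_2, y_holdout)
stacked = [w1 * a + w2 * b for a, b in zip(pred_model_1, pred_model_2)]
print(mse(pred_model_1, y_holdout), mse(pred_model_2, y_holdout), mse(stacked, y_holdout))
```

On the data used to fit the weights, the stacked error cannot exceed that of either level-1 model, because using one model alone is a special case of the weighted combination. Whether the gain carries over to unseen data is what the experiments in the paper measure.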
Ensemble Nonlinear Support Vector Machine Approach for Predicting Chronic Kidney Diseases
5
Authors: S. Prakash, P. Vishnu Raja, A. Baseera, D. Mansoor Hussain, V. R. Balaji, K. Venkatachalam. 《Computer Systems Science & Engineering》 (SCIE, EI), 2022, Issue 9, pp. 1273-1287 (15 pages)
Urban living in large modern cities exerts considerable adverse effects on health and thus increases the risk of contracting several chronic kidney diseases (CKD). The prediction of CKDs has become a major task in urbanized countries. The primary objective of this work is to introduce and develop predictive analytics for predicting CKDs. However, prediction on huge samples is becoming increasingly difficult. MapReduce provides a feasible framework for programming predictive algorithms with map and reduce functions, and its relatively simple programming interface helps address the scalability and efficiency of predictive learning algorithms. In the proposed work, an iterative weighted map reduce framework is introduced for the effective management of large dataset samples. A binary classification problem is formulated using ensemble nonlinear support vector machines and random forests. Instead of the usual linear combination of kernel activations, the proposed work creates nonlinear combinations of kernel activations in prototype examples. Furthermore, different descriptors are combined in an ensemble of deep support vector machines, where the product rule is used to combine the probability estimates of different classifiers. Performance is evaluated in terms of the prediction accuracy and interpretability of the model and the results.
Keywords: Chronic disease; Classification; Iterative weighted map reduce; Machine learning methods; Ensemble nonlinear support vector machines; Random forests
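The product rule for combining probability estimates, which this abstract mentions, multiplies each class's probability across classifiers and renormalises. The sketch below shows the general rule only; the three probability vectors are made up and the function name is hypothetical.

```python
def product_rule(prob_lists):
    """Multiply per-class probabilities across classifiers, then renormalise to sum to 1."""
    n_classes = len(prob_lists[0])
    combined = [1.0] * n_classes
    for probs in prob_lists:
        for c, p in enumerate(probs):
            combined[c] *= p
    total = sum(combined)
    return [value / total for value in combined]


# Hypothetical [P(no CKD), P(CKD)] estimates from three classifiers.
estimates = [[0.6, 0.4], [0.7, 0.3], [0.4, 0.6]]
print(product_rule(estimates))
```

Compared with plain averaging, the product rule penalises a class heavily when any single classifier assigns it a very low probability.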
A New Hybrid Machine Learning Model for Short-Term Climate Prediction by Performing Classification Prediction and Regression Prediction Simultaneously
6
Authors: Deqian LI, Shujuan HU, Jinyuan GUO, Kai WANG, Chenbin GAO, Siyi WANG, Wenping HE. 《Journal of Meteorological Research》 (SCIE, CSCD), 2022, Issue 6, pp. 853-865 (13 pages)
Machine learning methods are effective tools for improving short-term climate prediction. However, commonly used methods often carry out classification and regression prediction modeling separately and independently. Such single-task modeling may give inconsistent classification and regression results and thus may not meet the needs of practical applications well. To address this issue, this study proposes a selective Naive Bayes ensemble model (SENB-EM) by introducing causal effect and a voting strategy into Naive Bayes. The new model can both screen effective predictors and perform classification and regression prediction simultaneously. When applied to predicting the area of the summer western North Pacific subtropical high (WNPSH) from 2008 to 2021, the accuracy classification score (a metric of overall classification prediction accuracy) and the time correlation coefficient (TCC) of SENB-EM reach 1.0 and 0.81, respectively. After integrating the results of different models [including the multiple linear regression ensemble model (MLR-EM), SENB-EM, and the Chinese Multimodel Ensemble Prediction System (CMME) used by the National Climate Center (NCC)] for 2017-2021, the TCC of the ensemble of SENB-EM and CMME reaches 0.92, the highest among them. This indicates that the summer WNPSH area predictions provided by SENB-EM have high reference value for real-time prediction. Besides the numerical predictions, the SENB-EM model can also give numerical prediction intervals and predictions of the degree of anomaly of the WNPSH area, providing more reference information for meteorological forecasters. Overall, as a new hybrid machine learning model, SENB-EM has good prediction ability, and performing classification prediction and regression prediction simultaneously through integration is informative for short-term climate prediction.
Keywords: Selective Naive Bayes ensemble model; Machine learning; Short-term climate prediction; Classification prediction; Regression prediction; Western North Pacific subtropical high
FORECASTING CHINA'S FOREIGN TRADE VOLUME WITH A KERNEL-BASED HYBRID ECONOMETRIC-AI ENSEMBLE LEARNING APPROACH (Cited by: 5)
7
Authors: Lean YU, Shouyang WANG, Kin Keung LAI. 《Journal of Systems Science & Complexity》 (SCIE, EI, CSCD), 2008, Issue 1, pp. 1-19 (19 pages)
Because of the complexity of the economic system and the interactions between many economic variables and foreign trade, foreign trade volume is not easy to predict. The difficulty is usually attributed to the limitations of many conventional forecasting models. To improve prediction performance, this study proposes a novel kernel-based ensemble learning approach that hybridizes econometric models and artificial intelligence (AI) models to predict China's foreign trade volume. In the proposed approach, an important econometric model, the co-integration-based error correction vector auto-regression (EC-VAR) model, is first used to capture the impacts of economic variables on Chinese foreign trade from a multivariate linear analysis perspective. Then an artificial neural network (ANN) based EC-VAR model is used to capture the nonlinear effects of economic variables on foreign trade. Subsequently, to incorporate the effects of irregular events on foreign trade, text mining and experts' judgmental adjustments are integrated into the nonlinear ANN-based EC-VAR model. Finally, the economic variables, the outputs of the linear and nonlinear EC-VAR models, and the judgmental adjustment model are used as input variables of a typical kernel-based support vector regression (SVR) for ensemble prediction. For illustration, the proposed kernel-based ensemble learning methodology is applied to the problem of predicting China's foreign trade volume. Experimental results reveal that the hybrid econometric-AI ensemble learning approach significantly improves prediction performance over the other linear and nonlinear models listed in this study.
Keywords: Artificial neural networks; Error-correction vector auto-regression; Foreign trade prediction; Hybrid ensemble learning; Kernel-based method; Support vector regression
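A technology convergence prediction method combining patent co-classification and deep learning models (Cited by: 5)
8
Authors: 祝娜, 尹俊华, 翟羽佳. 《情报理论与实践》 (CSSCI, PKU Core), 2024, Issue 1, pp. 145-153 (9 pages)
[Purpose/Significance] As an important means of scientific and technological innovation, technology convergence prediction provides a useful reference for improving strategy selection in technology R&D. This paper proposes a technology convergence prediction method that combines patent co-classification with a deep learning model to improve the accuracy and reliability of the predictions. [Method/Process] Taking fuel cell technology as an example, the method first uses an association rule mining algorithm to identify strongly associated IPC frequent itemsets in the patent data, computes the relative similarity between technologies, and clusters the technologies with the AP clustering algorithm. It then applies the generative topographic mapping algorithm to identify technology convergence points and builds the training and test datasets. Finally, a deep learning model is trained to predict technology convergence that may emerge in fuel cell technology. [Result/Conclusion] The method performs well in precision and recall and can identify technology convergence quickly and objectively, supporting intelligent decision-making and forecasting for technological innovation.
Keywords: Technology convergence; Deep learning; Patent co-classification; Prediction method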
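Quality prediction method for high-temperature turbine blade castings with imbalanced process parameter datasets
9
Authors: 朱铜, 艾松, 陈琨, 高建民. 《西安交通大学学报》 (EI, CAS, CSCD, PKU Core), 2024, Issue 9, pp. 94-104 (11 pages)
To address the severe imbalance between qualified and unqualified radiographic testing (RT) results in investment casting process parameter datasets, a casting quality prediction method based on probability-distribution-based synthetic minority oversampling and ensemble learning (SyMProD-Stacking) is proposed. The method first preprocesses the original dataset to ensure data quality and removes noisy data using Z-scores. It then assigns a probability to each minority-class instance (unqualified casting) and generates synthetic samples from this probability distribution to obtain a balanced dataset. An extreme gradient boosting (XGBoost) model ranks all process parameter features by importance, and some of the lowest-ranked parameters are removed. Finally, the light gradient boosting machine (LightGBM), random forest (RF), support vector machine (SVM), and XGBoost models are combined by stacking, and the quality prediction model is built on the balanced dataset. The method was validated on the investment casting process used in manufacturing high-temperature turbine blades. The results show that, compared with the model built on the original dataset, the model built with SyMProD oversampling improves the prediction accuracy for unqualified castings by 75.4%. Compared with single-algorithm models, the proposed method improves the area under the ROC curve (A_AUCROC), the geometric mean (G_m), and the F1 score by 5.48%-11.59%, 3.78%-8.92%, and 5.72%-11.39%, respectively. The proposed method predicts casting quality well for the investment casting of high-temperature turbine blades under class imbalance.
Keywords: High-temperature turbine blade; Imbalance problem; Oversampling method; Ensemble learning; Quality prediction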
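The Z-score noise-removal step named in this abstract can be sketched as follows. The abstract does not give the cut-off, so the threshold of 2.0, the use of the population standard deviation, and the sample values are all assumptions made for illustration.

```python
import statistics


def remove_outliers_zscore(values, threshold=2.0):
    """Keep only values whose absolute Z-score is at most the threshold.

    The threshold is a hypothetical choice; the abstract does not state one.
    """
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return list(values)
    return [v for v in values if abs((v - mean) / std) <= threshold]


# Made-up readings of one process parameter with a single noisy value.
readings = [10, 11, 9, 10, 12, 10, 50]
print(remove_outliers_zscore(readings))
```

With very small samples the largest possible Z-score is bounded, so a cut-off such as 3.0 may never trigger; the threshold should be chosen with the sample size in mind.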
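Line project cost prediction based on the embedded method and ensemble learning
10
Authors: 叶煜明, 钱琪琪, 万正东, 张继钢. 《中国电力》 (CSCD, PKU Core), 2024, Issue 5, pp. 251-260 (10 pages)
Accurate cost prediction for overhead line projects is important for construction quality and cost control. To address problems in conventional overhead line cost prediction, such as excessively high feature dimensionality and the difficulty of fitting complex cost data with a single prediction model, a line project cost prediction algorithm based on embedded-method dimensionality reduction and ensemble learning is proposed. First, the embedded method with an extreme gradient boosting (XGBoost) model ranks the features, and those with a significant influence on cost are selected to reduce dimensionality. Then XGBoost, random forest, support vector machine (SVM), and other models are fused into a two-layer ensemble learning model that predicts line project cost. Finally, a case study on a power grid company's line project cost data from recent years compares the model with XGBoost, random forest, SVM, extreme learning machine (ELM), and back propagation (BP) neural network models. The experiments show that the mean absolute percentage error of the predictions is below 4%, better than the single models, which makes the approach valuable for research on line project cost control.
Keywords: Transmission line; Cost prediction; Ensemble learning; Dimensionality reduction; Embedded method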
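The mean absolute percentage error (MAPE) reported in this abstract has a standard definition, sketched below. The cost figures are made up and the function name is hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up project costs (arbitrary units) and model predictions.
actual_cost = [100.0, 200.0, 400.0]
predicted_cost = [96.0, 206.0, 412.0]
print(mape(actual_cost, predicted_cost))
```

Because each error is divided by the actual value, MAPE treats a 4% miss on a small project the same as a 4% miss on a large one, which suits cost data spanning different project sizes.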
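Ensemble learning-based prediction model for the compressive strength of bagasse ash concrete
11
Authors: 林星, 梁诗雪, 冯斯奕. 《浙江理工大学学报(自然科学版)》, 2024, Issue 4, pp. 507-517 (11 pages)
To obtain an efficient and accurate ensemble learning-based model for predicting the compressive strength of bagasse ash concrete, four ensemble learning models were built: eXtreme Gradient-Boosting (XGBoost), Random Forest (RF), Light Gradient-Boosting Machine (LightGBM), and Adaptive Boosting (AdaBoost). The model with the best predictive ability was identified by comparing their performance, and the SHAP (Shapley additive explanation) value method was then used to quantify the influence of each input variable on compressive strength. First, compressive strength tests were carried out on bagasse ash concrete, and an ensemble learning database with five input variables (cement content, water-cement ratio, bagasse ash content, fine aggregate content, and coarse aggregate content) was built from the test data and data from the literature. Then four evaluation metrics (coefficient of determination, mean absolute error, root mean square error, and reliability index) were used to assess predictive ability. The comparison shows that the XGBoost model has the highest prediction accuracy; on the training set its coefficient of determination, mean absolute error, root mean square error, and reliability index are 0.976, 1.811, 2.344, and 0.875, respectively. In descending order of influence on compressive strength, the input variables are cement content, fine aggregate content, coarse aggregate content, bagasse ash content, and water-cement ratio. Cement content has a positive effect on compressive strength, and a bagasse ash content below 10% does not noticeably reduce it. The study provides a useful reference for predicting the compressive strength of bagasse ash concrete and explaining its influencing factors, and it has value for promoting the research and application of environmentally friendly materials such as bagasse ash concrete.
Keywords: Ensemble learning; Bagasse ash concrete; Compressive strength; SHAP value method; Prediction model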
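Three of the four evaluation metrics in this abstract (coefficient of determination, mean absolute error, root mean square error) have standard definitions, sketched below. The reliability index is omitted because the abstract does not define it. The strength values are made up.

```python
import math


def mean_absolute_error(y, y_hat):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def root_mean_square_error(y, y_hat):
    """Square root of the average squared difference."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Made-up compressive strengths (MPa) and model predictions.
measured = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 29.0, 41.0, 48.0]
print(r_squared(measured, predicted),
      mean_absolute_error(measured, predicted),
      root_mean_square_error(measured, predicted))
```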
Data mining and ensemble learning (Cited by: 18)
12
Authors: 刁力力, 胡可云, 陆玉昌, 石纯一. 《计算机科学》 (CSCD, PKU Core), 2001, Issue 7, pp. 73-78 (6 pages)
Data mining is one solution to the problem of information explosion. Classification and prediction are among the most fundamental tasks in the data mining field. Many experiments have shown that ensembles of learning methods generally give better results than single learning methods in most cases. In this sense, it is of great value to introduce ensembles of learning methods into data mining. This paper introduces data mining and ensembles of learning methods in turn, and then analyzes and formulates the role that ensembles of learning methods can play in several important practical areas of data mining: text mining, multimedia information mining, and web mining.
Keywords: Data mining; Database; Knowledge discovery; Ensemble learning
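A combined-dimension kernel method for graph classification (Cited by: 7)
13
Authors: 李宇峰, 郭天佑, 周志华. 《计算机学报》 (EI, CSCD, PKU Core), 2009, Issue 5, pp. 946-952 (7 pages)
Learning from data that carry structural information, such as graphs, is an important problem in machine learning, and kernel methods are an effective technique for it. For the molecular graph classification problem, and building on the work of Swamidass et al., this paper proposes a combined-dimension kernel method for graph classification. The method first constructs a two-dimensional kernel that incorporates one-dimensional information to characterize the chemical features of a molecule. Then, drawing on molecular mechanics, it uses geometric information to construct a three-dimensional kernel that characterizes the molecule's physical properties. The kernels of different dimensions are then ensembled, and the optimal kernel combination is obtained by solving a quadratically constrained quadratic program. Experimental results show that the proposed method outperforms existing techniques.
Keywords: Machine learning; Graph classification; Kernel method; Structural information; Ensemble learning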
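A study of Chinese emotion recognition methods (Cited by: 5)
14
Authors: 刘欢欢, 李寿山, 周国栋, 李逸薇. 《江西师范大学学报(自然科学版)》 (CAS, PKU Core), 2013, Issue 2, pp. 120-124 (5 pages)
Based on the Chinese emotion corpus Ren-CECps, this paper focuses on sentence-level emotion recognition. It compares the effects of different features and different machine learning classifiers (NB, SVM, ME) on emotion recognition. In addition, because emotional and non-emotional texts are highly imbalanced in the corpus, an ensemble learning algorithm is used for imbalanced emotion recognition to improve overall performance. Experimental results show that the sample-based ensemble learning method handles the imbalance effectively and clearly improves classification performance.
Keywords: Emotion recognition; Feature engineering; Classification methods; Imbalanced classification; Ensemble learning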
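Multi-label classification algorithm based on an ensemble of floating-threshold classifiers (Cited by: 9)
15
Authors: 张丹普, 付忠良, 王莉莉, 李昕. 《计算机应用》 (CSCD, PKU Core), 2015, Issue 1, pp. 147-151 (5 pages)
For multi-label classification, where an instance can belong to several classes at once, a multi-label classification algorithm based on an ensemble of floating-threshold classifiers is proposed. First, the principle and error-rate estimate of the AdaBoost algorithm with floating-threshold classifiers (AdaBoost.FT) are analyzed, and it is shown that the algorithm overcomes the unstable classification of points near the decision boundary by fixed piecewise-threshold classifiers, which improves classification accuracy. Then the binary relevance (BR) method is used to apply this single-label learning algorithm to multi-label classification, giving a multi-label classification method based on an ensemble of floating-threshold classifiers, namely multi-label AdaBoost.FT. Experimental results show that on the Emotions dataset the average classification accuracy of the proposed algorithm is about 4%, 8%, and 11% higher than that of AdaBoost.MH, ML-kNN, and RankSVM, respectively; on the Scene and Yeast datasets it is only about 3% and 1% lower than RankSVM. The analysis indicates that the algorithm classifies well on datasets where the labels are largely uncorrelated or the number of labels is small.
Keywords: Real (continuous) AdaBoost; Floating threshold; Maximum likelihood principle; Multi-label classification; Ensemble learning; Binary relevance method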
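The binary relevance (BR) step in this abstract turns one multi-label problem into one independent binary problem per label, so that any single-label learner can be trained on each. The sketch below shows only that transformation; the label names and samples are made up, and AdaBoost.FT itself is not implemented here.

```python
def binary_relevance_targets(label_sets, all_labels):
    """Build one 0/1 target vector per label from per-sample label sets."""
    return {label: [int(label in labels) for labels in label_sets]
            for label in all_labels}


# Made-up multi-label annotations for three samples.
sample_labels = [{"happy", "calm"}, {"sad"}, {"happy"}]
targets = binary_relevance_targets(sample_labels, ["happy", "sad", "calm"])
print(targets)
```

Because each label is learned separately, BR ignores correlations between labels, which is consistent with the abstract's remark that the method works best when labels are largely uncorrelated.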
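A cluster-based undersampling ensemble method for software defect prediction (Cited by: 3)
16
Authors: 陆鹏程, 邱建林, 卞彩峰, 陈璐璐, 陈翔. 《计算机工程与设计》 (PKU Core), 2016, Issue 7, pp. 1805-1810, 1891 (7 pages)
To mitigate the effect of class imbalance on prediction model performance, a cluster-based undersampling ensemble method (CBUE) is proposed. The majority class is clustered, and N majority-class subsets are drawn with replacement according to the distribution of the clustering result (that is, the relative size of each cluster). Each of the N subsets is combined with all minority-class instances to form N new training sets. N classifiers are trained on these training sets and combined by majority vote into an ensemble classifier that predicts new data. CBUE was evaluated on NASA datasets using balance, G-mean, and AUC as metrics. The results show that in most cases the method outperforms five classic baseline methods (ROS, RUS, SMOTE, RF, and NB).
Keywords: Class-imbalance learning; Software defect prediction; Ensemble learning method; Undersampling; Clustering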
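The sampling and voting steps of CBUE described above can be sketched as follows. Several details are assumptions, since the abstract does not state them: the clusters are taken as already computed, each majority subset is made the same size as the minority class, and cluster quotas are rounded. All instance names are placeholders.

```python
import random
from collections import Counter


def draw_majority_subset(clusters, subset_size, rng):
    """Draw one majority subset with replacement, with a quota per cluster
    proportional to the cluster's share of the majority class."""
    total = sum(len(cluster) for cluster in clusters)
    subset = []
    for cluster in clusters:
        quota = round(len(cluster) * subset_size / total)
        subset.extend(rng.choices(cluster, k=quota))
    return subset


def build_training_sets(clusters, minority, n_sets, seed=0):
    """Combine each sampled majority subset with all minority instances."""
    rng = random.Random(seed)
    return [draw_majority_subset(clusters, len(minority), rng) + list(minority)
            for _ in range(n_sets)]


def majority_vote(predictions):
    """Return the label predicted by most classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Placeholder data: two precomputed majority clusters and a small minority class.
clusters = [["a1", "a2", "a3", "a4", "a5", "a6"], ["b1", "b2", "b3"]]
minority = ["m1", "m2", "m3"]
training_sets = build_training_sets(clusters, minority, n_sets=5)
print(training_sets[0], majority_vote([1, 0, 1, 1, 0]))
```

Each training set would then be used to fit one base classifier, and `majority_vote` combines their predictions for a new module.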
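Application of a selective ensemble learning model to lithology-porosity prediction (Cited by: 8)
17
Authors: 段友祥, 王言飞, 孙歧峰. 《科学技术与工程》 (PKU Core), 2020, Issue 3, pp. 1001-1008 (8 pages)
Reservoirs are the main object of geological modeling of oil deposits, and predicting reservoir property parameters is both an essential basis and one of the main difficulties of modeling. Building prediction models with machine learning methods is a current research focus. To address the low fault tolerance and overfitting of single machine learning methods in porosity prediction, a method is proposed that builds the prediction model by selective ensemble learning combined with lithology classification. The method first classifies lithology with a support vector machine and feeds the classification result into the selective ensemble model for porosity prediction. Then, based on an analysis of typical machine learning methods, principal component analysis is used to select a group of well-performing individual learners from classic models such as support vector regression, radial basis function (RBF) neural networks, random forests, ridge regression, and K-nearest-neighbor regression. The weight of each individual learner in the ensemble is obtained by the "principal component weight averaging" method, and the ensemble output is computed as a weighted average. The method accounts for the influence of lithology on porosity, overcomes the shortcomings of single models, and generalizes well. The results show that its prediction accuracy is clearly better than that of the single machine learning methods and that it adapts well.
Keywords: Porosity prediction; Lithology classification; Selective ensemble learning; Machine learning; Principal component analysis; Principal component weight averaging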
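Imbalanced data classification based on decision criterion optimization (Cited by: 2)
18
Authors: 曹鹏, 栗伟, 赵大哲. 《小型微型计算机系统》 (CSCD, PKU Core), 2014, Issue 5, pp. 961-966 (6 pages)
Data with imbalanced class distributions are common in the real world, and conventional classification algorithms perform poorly on them. An ensemble classification algorithm based on decision criterion optimization is therefore proposed. Using the posterior probabilities output by a naive Bayes model, and with an imbalanced-data evaluation metric as the objective function, the algorithm optimizes the decision threshold (two-class case) or the misclassification cost parameters (multi-class case) to obtain the best classification decision criterion. To improve generalization, an adaptive random subspace ensemble classification algorithm is also proposed. It increases the diversity among base classifiers, avoids overfitting in classifier learning and decision criterion optimization, and determines the optimal number of base classifiers automatically. Experiments on a large number of UCI datasets show that, compared with similar algorithms, the proposed algorithm handles imbalanced data with better accuracy and efficiency.
Keywords: Imbalanced data classification; Cost-sensitive learning; Ensemble classification; Random subspace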
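For the two-class case, the decision-threshold optimization described in this abstract can be sketched as a search over candidate thresholds that maximises an imbalance-aware metric. The abstract does not name the metric or the search procedure, so the use of G-mean, the grid of candidates, and the posterior values below are all assumptions.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of the true positive rate and the true negative rate."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return math.sqrt((tp / pos) * (tn / neg))


def best_threshold(posteriors, y_true, candidates):
    """Pick the candidate threshold whose predictions maximise G-mean."""
    def score(threshold):
        return g_mean(y_true, [int(p >= threshold) for p in posteriors])
    return max(candidates, key=score)


# Made-up posteriors P(minority | x) for eight majority and two minority samples.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
posteriors = [0.05, 0.1, 0.1, 0.15, 0.2, 0.2, 0.25, 0.45, 0.3, 0.4]
print(best_threshold(posteriors, labels, [0.2, 0.3, 0.4, 0.5]))
```

With the default threshold of 0.5 every sample here is assigned to the majority class and G-mean is zero; lowering the threshold recovers the minority samples at the cost of one false positive.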
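Fine-grained image classification algorithm based on ensemble transfer learning (Cited by: 17)
19
Authors: 吴建, 许镜, 丁韬. 《重庆邮电大学学报(自然科学版)》 (CSCD, PKU Core), 2020, Issue 3, pp. 452-458 (7 pages)
Most existing fine-grained image classification algorithms ignore the fact that part localization and part feature learning are interrelated. To address this, a fine-grained image classification algorithm based on ensemble transfer learning is proposed. Its classification network consists of region detection and classification and multi-scale feature combination. The region detection and classification network obtains local regions with the class activation mapping (CAM) method, learns subtle image features from the localized regions in a mutually reinforcing manner, and combines the features of the local regions into the final feature representation for classification. During training, the network uses the proposed ensemble transfer learning method: building on transfer learning, locally trained models are ensembled by stochastic weight averaging to obtain a better final classification model. Experiments on the CUB-200-2011 and Stanford Cars datasets show that the algorithm achieves better fine-grained classification results than most existing algorithms.
Keywords: Fine-grained image classification; Ensemble transfer learning; Class activation mapping; Stochastic weight averaging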
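The core operation of stochastic weight averaging, as the term is generally used, is an element-wise average of model weights saved at different points of training. The sketch below shows only that averaging step on flat weight lists; the checkpoint values are made up, and how the paper schedules or selects the checkpoints is not stated in the abstract.

```python
def average_weights(checkpoints):
    """Element-wise mean of several weight vectors of equal length."""
    n = len(checkpoints)
    return [sum(values) / n for values in zip(*checkpoints)]


# Made-up flattened weights from three training checkpoints.
checkpoints = [[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]]
print(average_weights(checkpoints))
```

Unlike prediction-level ensembling, averaging in weight space yields a single model, so inference cost stays the same as for one network.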
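Medium- and long-term precipitation prediction based on a hybrid model (Cited by: 2)
20
Authors: 李栋, 薛惠锋. 《计算机科学》 (CSCD, PKU Core), 2018, Issue 9, pp. 271-278, 287 (9 pages)
To address the low accuracy of medium- and long-term precipitation prediction, a hybrid model composed of an improved ensemble empirical mode decomposition method, the least squares method, a kernel extreme learning machine, and an improved fruit fly optimization algorithm is proposed for predicting regional annual precipitation series. First, the improved ensemble empirical mode decomposition method decomposes the non-stationary precipitation time series into several components. Then, depending on the characteristics of each component, either the least squares method or the kernel extreme learning machine is used to predict it. Because the kernel extreme learning machine is sensitive to its parameters, the improved fruit fly optimization algorithm is used to search for its optimal parameters and improve prediction accuracy. Finally, the predictions of the components are summed to form the final prediction. The method was validated on annual precipitation for 1951-2015 in seven prefecture-level cities of Guangdong Province. The results show that the hybrid model predicts more accurately than the autoregressive moving average model and the kernel extreme learning machine model.
Keywords: Prediction; Hybrid model; Improved ensemble empirical mode decomposition; Least squares method; Kernel extreme learning machine; Improved fruit fly optimization algorithm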
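The decompose, predict each component, then sum pipeline in this abstract can be sketched with deliberately simple stand-ins. Here a least-squares linear trend plus a residual replaces the improved ensemble empirical mode decomposition, and a last-value (persistence) forecast replaces the kernel extreme learning machine; both substitutions, the series, and all names are assumptions for illustration only.

```python
def fit_linear_trend(series):
    """Least-squares line through (0, y0), (1, y1), ...; returns (slope, intercept)."""
    n = len(series)
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(series))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast_next(series):
    """Forecast the next value as trend forecast plus residual forecast."""
    slope, intercept = fit_linear_trend(series)
    trend = [intercept + slope * i for i in range(len(series))]
    residual = [y - t for y, t in zip(series, trend)]
    trend_next = intercept + slope * len(series)  # least-squares extrapolation
    residual_next = residual[-1]                  # persistence stand-in for the KELM
    return trend_next + residual_next


# Made-up annual series (arbitrary units).
annual_series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
print(forecast_next(annual_series))
```

The design point carried over from the abstract is that each component gets the predictor suited to its character, and the final forecast is the sum of the component forecasts.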