
Logging lithology identification method based on improved ensemble learning (cited by: 4)
Abstract Logging data contain a large amount of redundant information unrelated to lithology, and the lithology labels are unevenly distributed across classes. Both factors seriously reduce identification accuracy, and existing logging lithology identification algorithms cannot effectively handle the imbalance between lithology classes. To address this, an ensemble learning lithology prediction method for imbalanced sample sets, k-means Synthetic Minority Over Sampling Ensemble Learning (KSMOSEL), is proposed. First, mud-logging lithology data are used as the sample labels and the well-log curves as the model inputs. Second, the K-means algorithm is combined with the Synthetic Minority Over-sampling Technique (SMOTE) to form a K-means synthetic oversampling (KS) algorithm, which balances the lithology sample set. Finally, the resampled data set is used to build and train an ensemble learning model in which multiple classifiers are fused into a strong learner, and the lithology type is predicted by "soft voting". Using logging lithology data from the Hugoton oil and gas field, the method was applied to classify lithology, and its performance was compared with conventional classification models: support vector machine (SVM), k-nearest neighbor (KNN), decision tree, XGBoost, and random forest. The experimental results show that KSMOSEL achieves higher accuracy, with a lithology identification accuracy of 94.28%. After KS sampling, the lithology identification accuracy of the SVM, KNN, decision tree, XGBoost, random forest, GBDT, and ensemble learning models increased by 18.68%, 12.03%, 3.77%, 10.23%, 24.77%, 16.69%, and 19.37%, respectively, a substantial improvement when the class distribution of the logging lithology data is imbalanced.
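The two core ideas in the abstract, cluster-guided oversampling and soft voting, can be illustrated with a minimal standard-library sketch. This is not the authors' KSMOSEL implementation: the function names (`kmeans`, `ks_oversample`, `soft_vote`), the toy data, and the class labels are all hypothetical, and the interpolation step picks a random partner from the same cluster rather than one of the k nearest neighbours that standard SMOTE would use.

```python
import random
from collections import Counter

def kmeans(points, k, iters=20, seed=0):
    """Tiny Lloyd's k-means; returns a cluster index for each point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def ks_oversample(X, y, k=2, seed=0):
    """Grow every minority class to the majority size (simplified KS sampling)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        if n == target:
            continue
        minority = [x for x, lab in zip(X, y) if lab == label]
        assign = kmeans(minority, min(k, len(minority)), seed=seed)
        clusters = {}
        for x, a in zip(minority, assign):
            clusters.setdefault(a, []).append(x)
        groups = list(clusters.values())
        for _ in range(target - n):
            group = rng.choice(groups)
            a, b = rng.choice(group), rng.choice(group)
            t = rng.random()
            # SMOTE-style synthetic point on the segment between a and b
            # (a single-member cluster simply duplicates its point).
            X_out.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
            y_out.append(label)
    return X_out, y_out

def soft_vote(prob_rows):
    """Average per-classifier probability vectors; return (argmax class, mean)."""
    n = len(prob_rows)
    mean = [sum(col) / n for col in zip(*prob_rows)]
    return max(range(len(mean)), key=mean.__getitem__), mean

# Hypothetical two-feature samples: 5 majority and 3 minority points.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.3, 0.3],
     [5.0, 5.0], [5.2, 5.1], [5.1, 4.9]]
y = ["majority"] * 5 + ["minority"] * 3
X_bal, y_bal = ks_oversample(X, y)
print(Counter(y_bal))

# Three hypothetical classifiers' probabilities for one sample, two classes.
pred, mean = soft_vote([[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]])
print(pred, mean)
```

In the voting example, two of the three classifiers favour class 1, so a majority ("hard") vote would pick class 1, while the averaged probabilities favour class 0 because the first classifier is much more confident. In practice, library implementations such as a k-means SMOTE sampler and a soft-voting ensemble classifier would normally replace these hand-written helpers.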
Authors LUO Renze; TUO Juanjuan; NI Hualing; LI Xingyu; LEI Canru; GUO Liang (State Key Laboratory of Oil and Gas Reservoir Geology and Exploitation, Southwest Petroleum University, School of Earth Science and Technology, Chengdu 610500, China; Southwest Geophysical Exploration Bureau of Geophysical Prospecting, China National Petroleum Corporation, Chengdu 610500, China)
Source Geophysical Prospecting for Petroleum (《石油物探》), CSCD, Peking University Core Journal, 2023, No. 2, pp. 212-224 (13 pages)
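Funding Jointly supported by the National Key R&D Program of China, Deep Earth special project (2016YFC0601100), and the Sichuan Science and Technology Program (2019CXRC0027).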
Keywords lithology identification; unbalanced data; oversampling; KSMOSEL; logging data
