Research and comparison of filling methods for missing data in hydraulic fracturing (cited by: 4)
Abstract: Missing-value imputation is an essential step in data preprocessing for machine learning algorithms. Taking Sudong fractured vertical wells as an example, we screened and collected data on three aspects (geology, fracturing treatment, and production) for 800 wells. Missing values were described visually in R using machine learning data preprocessing methods. A scatter-plot matrix of absolute open flow (AOF) against each independent variable was drawn to identify the independent variables with a pronounced linear relationship. A complete data set was then constructed and a multiple linear regression model was fitted on those variables; the model's coefficients and their standard errors serve as the reference standard. Missing values were filled by mean imputation, K-nearest-neighbor imputation, and multiple imputation. Comparing the coefficients and standard errors from each imputed data set with those from the complete data set shows that, for the missingness mechanism and missing rate of this data set, multiple imputation performs best.
Authors: FAN Yilong (樊毅龙); MA Xianlin (马先林); LIAN Jianwen (连建文). Affiliations: School of Petroleum Engineering, Xi'an Shiyou University, Xi'an, Shaanxi 710065, China; Shaanxi Provincial Key Laboratory of Special Production Increasing Technology for Oil and Gas Fields, Xi'an, Shaanxi 710065, China; Chengdu University of Technology, Chengdu, Sichuan 610059, China
Source: Petrochemical Industry Application (《石油化工应用》), CAS-indexed, 2020, No. 11, pp. 48-55 (8 pages)
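Funding: Key Laboratory Scientific Research Program of the Shaanxi Provincial Department of Education (No. 18JS085); Xi'an Shiyou University Innovation and Practice Training Program (No. YCS19123031).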
Keywords: missing value; R language; multiple imputation
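
The comparison outlined in the abstract (fit a regression on complete data, then refit after mean, K-nearest-neighbor, and multiple imputation, and compare coefficients and standard errors) can be sketched roughly as below. The paper works in R; this sketch uses Python with NumPy on synthetic data. The function names, the two synthetic predictors, the 20% missing-completely-at-random mechanism, and the simplified bootstrap-based multiple imputation pooled with Rubin's rules are all assumptions made for illustration, not the authors' implementation or data.

```python
import numpy as np

def ols(X, y):
    """OLS fit; returns coefficients (intercept first) and standard errors."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (len(y) - A.shape[1])
    return beta, np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))

def mean_impute(X):
    """Replace each NaN with the observed mean of its column."""
    filled = X.copy()
    rows, cols = np.where(np.isnan(X))
    filled[rows, cols] = np.nanmean(X, axis=0)[cols]
    return filled

def knn_impute(X, k=5):
    """Replace NaNs with the mean over the k nearest complete rows.
    Distance is Euclidean on the columns observed in the incomplete row."""
    filled = X.copy()
    complete = X[~np.isnan(X).any(axis=1)]
    for i in np.where(np.isnan(X).any(axis=1))[0]:
        miss = np.isnan(X[i])
        d = np.sqrt(((complete[:, ~miss] - X[i, ~miss]) ** 2).sum(axis=1))
        nearest = np.argsort(d)[:k]
        filled[i, miss] = complete[nearest][:, miss].mean(axis=0)
    return filled

def multiple_impute_fit(X, y, m=20, seed=0):
    """Simplified multiple imputation for NaNs in column 0 only (assumption).
    Each round: bootstrap the observed rows, regress column 0 on the other
    columns and y, impute prediction + noise, refit the analysis model.
    The m fits are pooled with Rubin's rules."""
    rng = np.random.default_rng(seed)
    miss = np.isnan(X[:, 0])
    Z = np.column_stack([X[:, 1:], y])          # imputation-model predictors
    obs_idx = np.where(~miss)[0]
    A_miss = np.column_stack([np.ones(miss.sum()), Z[miss]])
    betas, variances = [], []
    for _round in range(m):
        boot = rng.choice(obs_idx, size=len(obs_idx), replace=True)
        g, _se = ols(Z[boot], X[boot, 0])
        A_boot = np.column_stack([np.ones(len(boot)), Z[boot]])
        resid_sd = np.std(X[boot, 0] - A_boot @ g, ddof=A_boot.shape[1])
        filled = X.copy()
        filled[miss, 0] = A_miss @ g + rng.normal(0.0, resid_sd, miss.sum())
        b, se = ols(filled, y)
        betas.append(b)
        variances.append(se ** 2)
    betas = np.array(betas)
    within = np.mean(variances, axis=0)          # average sampling variance
    between = betas.var(axis=0, ddof=1)          # variance across imputations
    total = within + (1 + 1 / m) * between       # Rubin's total variance
    return betas.mean(axis=0), np.sqrt(total)

def run_demo(n=800, missing_rate=0.2, seed=42):
    """Synthetic stand-in for the well data: y = 1 + 2*x1 - x2 + noise."""
    rng = np.random.default_rng(seed)
    x2 = rng.normal(size=n)
    x1 = 0.6 * x2 + rng.normal(scale=0.8, size=n)
    y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(scale=0.5, size=n)
    X_full = np.column_stack([x1, x2])
    X_miss = X_full.copy()
    X_miss[rng.random(n) < missing_rate, 0] = np.nan   # MCAR in x1 only
    return {
        "complete": ols(X_full, y),                    # reference standard
        "mean": ols(mean_impute(X_miss), y),
        "knn": ols(knn_impute(X_miss, k=5), y),
        "multiple": multiple_impute_fit(X_miss, y, m=20, seed=seed),
    }

if __name__ == "__main__":
    for name, (beta, se) in run_demo().items():
        print(f"{name:9s} coef={np.round(beta, 3)} se={np.round(se, 3)}")
```

Each method's coefficients and standard errors can then be read against the `complete` row, mirroring the paper's use of the complete-data model as a reference. In R, the `mice` package offers multiple imputation and the `VIM` package offers a `kNN` function; the abstract does not say which packages the authors used.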