Data Mining Preprocessing Based on a Global Typical Oil and Gas Field Database (cited 9 times)
Abstract The petroleum industry entered the big data era long ago. Data mining is an effective way to make full use of the value of data assets, and data preprocessing is one of the active research topics in data mining. This paper analyzes the significance and current status of data mining and data preprocessing, and proposes a basic approach to data mining in the petroleum industry. Taking as an example the global typical oil and gas field database developed by an international petroleum exploration and development technical service and consulting company, with "recovery factor" as the mining target, it explains in detail the commonly used preprocessing methods for data mining and how each is carried out: data acquisition, attribute selection, data cleaning, data integration, data transformation, data reduction, and data confidentiality treatment. It also proposes a "5C" standard for source data: Correctness, Currency, Completeness, Consistency, and Confidentiality. The results can serve as a reference for data preprocessing and related work in the petroleum industry.
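Source Petroleum Geology & Oilfield Development in Daqing (《大庆石油地质与开发》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2016, No. 1, pp. 66-70 (5 pages)
Funding National Major Science and Technology Project for Oil and Gas, "Research on Global Remaining Oil and Gas Resources and Rapid Evaluation Technology for Oil and Gas Assets" (2011ZX050)
Keywords data mining; preprocessing; oil and gas field; database; 5C criteria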
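
Several of the preprocessing steps named in the abstract (attribute selection, data cleaning, data transformation, and confidentiality treatment) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the records, the field names (`porosity`, `permeability_md`, `operator`), and the choices of median imputation, min-max scaling, and hash-based identifiers are all assumptions made for the example. Only the mining target, recovery factor, comes from the abstract.

```python
import hashlib
from statistics import median

# Hypothetical records; every name and value here is illustrative only.
RAW = [
    {"field": "Field A", "porosity": 0.21, "permeability_md": 150.0,
     "recovery_factor": 0.35, "operator": "X"},
    {"field": "Field B", "porosity": None, "permeability_md": 40.0,
     "recovery_factor": 0.22, "operator": "Y"},
    {"field": "Field C", "porosity": 0.15, "permeability_md": 900.0,
     "recovery_factor": None, "operator": "Z"},
    {"field": "Field D", "porosity": 0.30, "permeability_md": 500.0,
     "recovery_factor": 0.48, "operator": "X"},
]

FEATURES = ["porosity", "permeability_md"]  # assumed predictor attributes
TARGET = "recovery_factor"                  # mining target named in the abstract


def select_attributes(rows):
    """Attribute selection: keep the identifier, the features, and the target."""
    keep = ["field", *FEATURES, TARGET]
    return [{k: r[k] for k in keep} for r in rows]


def clean(rows):
    """Data cleaning: drop rows without a target, fill missing features with the median."""
    rows = [dict(r) for r in rows if r[TARGET] is not None]
    for f in FEATURES:
        fill = median(r[f] for r in rows if r[f] is not None)
        for r in rows:
            if r[f] is None:
                r[f] = fill
    return rows


def transform(rows):
    """Data transformation: min-max scale each feature to the range [0, 1]."""
    out = [dict(r) for r in rows]
    for f in FEATURES:
        lo = min(r[f] for r in rows)
        span = max(r[f] for r in rows) - lo
        for r in out:
            r[f] = (r[f] - lo) / span if span else 0.0
    return out


def deidentify(rows):
    """Confidentiality treatment: replace the field name with an opaque identifier.

    An unsalted hash is used only to keep the sketch short; real data would
    need a salted hash or a private lookup table.
    """
    out = []
    for r in rows:
        r = dict(r)
        name = r.pop("field")
        r["field_id"] = hashlib.sha256(name.encode("utf-8")).hexdigest()[:8]
        out.append(r)
    return out


def preprocess(rows):
    """Run the four steps in order and return the mining-ready records."""
    return deidentify(transform(clean(select_attributes(rows))))


if __name__ == "__main__":
    for record in preprocess(RAW):
        print(record)
```

Keeping each step as a separate function that returns new records makes it straightforward to check a step against the 5C criteria on its own, for example completeness after `clean` and confidentiality after `deidentify`.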