Abstract
With the advancement of seismic exploration technology, especially the development of high-density seismic acquisition, the volume of seismic data keeps growing, and processing such massive data efficiently has become an unavoidable problem. An effective and feasible solution is to parallelize seismic processing algorithms by means of distributed parallel processing techniques. The distributed parallel technologies in common use include MPI and MapReduce. Surface-consistent processing algorithms, such as surface-consistent deconvolution, surface-consistent amplitude compensation, and surface-consistent residual static correction, play important roles in seismic processing. However, because these algorithms are global, multi-stage, and iterative, they are difficult to parallelize with the existing technologies, and their efficiency on big data is hard to improve effectively. To address this, this paper introduces Spark, a parallel framework that supports iterative in-memory computing, and discusses a Spark-based parallelization method for surface-consistent processing, taking surface-consistent residual static correction as an example. The main idea is to convert the execution flow of the algorithm into a data-flow process and then parallelize it with the powerful parallel operators provided by Spark. The data smoothing and data matching operations in the data flow are optimized to improve big-data capacity and efficiency. In tests with massive data, the Spark implementation of surface-consistent residual static correction processed more than 33000 traces per second, or about 30 GB per minute. These tests prove that the method can support the processing of terabyte-scale data with high efficiency and can be applied in the practical production of high-density seismic data processing. Spark thus provides a new solution to the challenge of processing huge volumes of seismic data and has great application potential in future seismic exploration.
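The abstract does not give implementation details, but the core idea it describes, turning the iterative surface-consistent decomposition into keyed Spark transformations, can be sketched. Below is a minimal, hypothetical Scala/Spark illustration, not the paper's actual code: it assumes a cross-correlation time lag has already been picked for each trace upstream, and the names TraceLag, meanByKey, and the mock data are inventions for the sketch. It alternately averages the residual lags by shot and by receiver with reduceByKey and broadcast joins, the kind of keyed aggregation a data-flow formulation of residual statics relies on.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.rdd.RDD

// Illustrative record: one picked cross-correlation time lag per trace,
// keyed by its shot and receiver surface positions (hypothetical schema).
case class TraceLag(shotId: Int, recvId: Int, lag: Double)

object SurfaceConsistentStaticsSketch {
  // Average the lags grouped by the given surface key (shot or receiver).
  def meanByKey(rdd: RDD[(Int, Double)]): scala.collection.Map[Int, Double] =
    rdd.mapValues(v => (v, 1L))
       .reduceByKey { case ((a, n), (b, m)) => (a + b, n + m) }
       .mapValues { case (sum, n) => sum / n }
       .collectAsMap()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("sc-residual-statics-sketch").getOrCreate()
    val sc = spark.sparkContext

    // In a real flow these lags come from correlating each input trace with a
    // pilot (model) trace; a few mock records stand in for that stage here.
    var lags: RDD[TraceLag] = sc.parallelize(Seq(
      TraceLag(1, 10, 4.0), TraceLag(1, 11, 6.0),
      TraceLag(2, 10, -2.0), TraceLag(2, 11, 0.0)
    )).cache()

    // Gauss-Seidel-style surface-consistent decomposition: alternately average
    // the residual lag over each shot, then over each receiver, and iterate.
    for (_ <- 1 to 5) {
      val shotMean = sc.broadcast(meanByKey(lags.map(t => (t.shotId, t.lag))))
      val afterShot = lags.map(t => t.copy(lag = t.lag - shotMean.value(t.shotId)))

      val recvMean = sc.broadcast(meanByKey(afterShot.map(t => (t.recvId, t.lag))))
      lags = afterShot.map(t => t.copy(lag = t.lag - recvMean.value(t.recvId))).cache()
    }

    // What remains in `lags` is the unexplained residual; the per-key means
    // subtracted across iterations, summed up, would be the shot and receiver
    // static estimates.
    lags.collect().foreach(println)
    spark.stop()
  }
}
```

Because the per-iteration state lives in cached RDDs rather than being re-read from disk each pass, this pattern matches the in-memory iterative computing that the paper cites as Spark's advantage over MapReduce for this class of algorithms.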
Authors
LIAN Xi-meng, SUI Zhi-qiang, CONG Long-shui, ZHANG Rui-xuan (Geophysical Research Institute of Shengli Oilfield Branch Company, Sinopec, Dongying 257022, China)
Source
Progress in Geophysics (《地球物理学进展》)
Indexed in CSCD and the Peking University Core Journal (北大核心) lists
2020, Issue 6, pp. 2367-2372 (6 pages)
Funding
Jointly supported by the National Science and Technology Major Project of China (2016ZX05006-002) and a Sinopec project (PE19006-3).