Abstract
In line with the overall "2522" framework of the Digital China plan, the data resource system must be continuously consolidated so that data can play its key role as a new factor of production. Supported and guided by the new generation of digital technology, this means treating data as the key element, value release as the core, and data empowerment as the main thread, promoting digital transformation and achieving high-quality development through the "Cloud Adoption, Data Use, and Intelligence Empowerment" (上云用数赋智) initiative. As enterprise business grows, enterprise data volumes keep expanding, and scenarios in which the same data is reused for both stream and batch processing are increasingly common. Traditional architectures use a data warehouse for batch computing and a separate streaming platform for stream computing. By contrast, a data lake keeps a single copy of the data that serves both batch and stream workloads, provides near-real-time processing, and supports efficient analysis directly on the stored data. This paper proposes a data processing solution based on a data lake that addresses the pain points of stream processing while implementing a unified batch-stream processing model. Compared with the traditional data processing model, streaming data processing offers higher processing efficiency and better cost control.
Authors
叶浩
张承琪
张亚威
申小军
杨孟新
YE Hao; ZHANG Chenqi; ZHANG Yawei; SHEN Xiaojun; YANG Mengxin (China United Network Communications Corporation Limited, Jinan 250100, China)
Source
Mobile Information (《移动信息》)
2024, Issue 4, pp. 298-300, 315 (4 pages in total)
Keywords
Integrated lake and warehouse (lakehouse, 湖仓一体)
Flink
Computing engine