Abstract
Data mining and intelligent analysis of large volumes of static data on big data computing platforms have helped bring big data and artificial intelligence applications into practice. To meet the growing demand for processing the real-time dynamic data generated by the Internet and the Internet of Things, dataflow computing has gradually been introduced into some big data processing platforms. Focusing on the dataflow programming model, this paper compares the dataflow-oriented analysis and design methods of traditional software engineering with the structure definitions and reference models provided by current dataflow programming models for big data processing platforms, analyzes their differences and shortcomings, and summarizes the main features and key elements of the dataflow programming model. It also analyzes the main current approaches to dataflow programming and how they are combined with mainstream programming tools, and, based on the dataflow computing requirements of big data processing, presents a basic framework and programming mode for visual dataflow programming tools.
Authors
邹骁锋
阳王东
容学成
李肯立
李克勤
ZOU Xiaofeng; YANG Wangdong; RONG Xuecheng; LI Kenli; LI Keqin (College of Computer Science and Electronic Engineering, Hunan University, Changsha 410008, China)
Source
《大数据》
2020, No. 3, pp. 57-72 (16 pages)
Big Data Research
Funding
Supported by the National Key Research and Development Program of China (No. 2018YFB1003401).
Keywords
data flow
programming model
big data processing
programming tool