Abstract
Stream computing is an important computing mode for big data, and big data stream computing has become a research hotspot. Task management is one of its core functions and is responsible for resource scheduling and full-lifecycle management of stream computing tasks. Existing surveys of big data stream computing focus mainly on application requirements, architecture, and overall technology, and lack a fine-grained survey and analysis of task management techniques. This paper first presents an abstract function model of task management for stream computing, then classifies and reviews the key task management techniques based on this model, and finally surveys and analyzes how existing mainstream big data stream computing systems apply, integrate, and optimize these techniques.
Source
《计算机工程与科学》
CSCD
Peking University Core Journal (北大核心)
2017, No. 2, pp. 215-226 (12 pages)
Computer Engineering & Science
Funding
National Natural Science Foundation of China (61202075, 91546111)
Beijing Natural Science Foundation (4133081)
Keywords
big data stream computing
task management
abstract function model
resource allocation
data distribution
fault tolerance