Abstract
As a new-generation big data computing engine, Flink has been widely adopted. When Flink is deployed in containers in a cloud environment, its default task scheduling algorithm cannot perceive the resource information of nodes, so its ability to adjust load promptly and to balance load autonomously is poor. Mainstream container orchestration tools make it possible to manage containers, but they do not take Flink's characteristics into account and therefore fail to balance resource utilization while reducing communication overhead within a container group. To address these problems, this paper proposes FLBS, a Flink load balancing strategy for cloud environments, which jointly considers the distribution characteristics of operators in a Flink cluster and the communication mechanism between containers, and uses inter-node communication overhead and load balance as evaluation criteria. Experimental results show that, compared with Flink's default scheduling strategy, FLBS effectively improves computing efficiency and system performance.
Authors
徐浩桐
黄山
孙国璋
贺菲莉
段晓东
XU Hao-tong; HUANG Shan; SUN Guo-zhang; HE Fei-li; DUAN Xiao-dong (College of Computer Science and Engineering, Dalian Minzu University, Dalian 116600; State Ethnic Affairs Commission Key Laboratory of Big Data Applied Technology (Dalian Minzu University), Dalian 116600; Dalian Key Laboratory of Digital Technology for National Culture (Dalian Minzu University), Dalian 116600, China)
Source
《计算机工程与科学》
CSCD
Peking University Core Journals (北大核心)
2022, Issue 5, pp. 779-787 (9 pages)
Computer Engineering & Science
Funding
National Key Research and Development Program of China (2018YFB1004402).
Keywords
Flink
container
communication overhead
load balancing
migration