Research on a Spark Light-Weight Client Implementation Method
Abstract [Purpose] To meet front-end users' need for frequent interaction and to overcome the drawbacks of traditional heavyweight clients, which must keep a long-lived connection session with the Spark application service. [Methods] HAProxy, a high-performance load-balancing and dynamic proxy component, is deployed on an edge node server to provide a way of submitting Spark jobs through a lightweight client, with dynamic scheduling and full life-cycle management of those jobs. [Results] In Spark on YARN mode, multiple REST services that have identical functionality and can run independently of one another are deployed to the YARN cluster. HAProxy's automatic reload mechanism dynamically updates and loads the back-end service configuration, so that front-end users, unaware of back-end changes, submit Spark jobs through HAProxy's unified external interface to interchangeable REST services running across the YARN cluster. [Conclusions] The method does not require a long-lived connection session between the edge node server and the cluster node servers. HAProxy keeps external users from directly accessing internal cluster nodes, which isolates the inside of the cluster from the outside for security. In Spark on YARN mode, the method also supports interactive submission and asynchronous scheduling of Spark jobs, giving autonomous control over their full life cycle.
Authors ZHANG Feng; LU Juhui; ZHU Haiyong; WU Wen (Qiankun Big Data Operating System Research Institute, Xiamen Meiya Pico Information Co., Ltd., Xiamen 361001, China)
Source Henan Science and Technology (《河南科技》), 2023, No. 15, pp. 19-24 (6 pages)
Keywords HAProxy; Spark; YARN; dynamic configuration
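
The abstract describes HAProxy fronting several interchangeable REST services whose addresses change as they are scheduled on the YARN cluster, with the back-end configuration regenerated and reloaded on change. The following Python sketch illustrates only that regeneration step, and it is not the authors' implementation. The function names, the section names `spark_jobs_in` and `spark_rest`, the ports, and the addresses are all hypothetical. How the endpoints are discovered and how the reload is triggered are not specified in the abstract and are left out.

```python
from typing import Iterable, Tuple

# (host, port) of one REST service instance running on the cluster.
Endpoint = Tuple[str, int]


def render_haproxy_config(
    endpoints: Iterable[Endpoint],
    listen_port: int = 8090,           # hypothetical external port
    backend_name: str = "spark_rest",  # hypothetical backend name
) -> str:
    """Render a frontend/backend fragment listing every REST instance.

    Only the proxy sections are produced; a complete HAProxy file would
    also need `global` and `defaults` sections (timeouts, logging).
    """
    lines = [
        "frontend spark_jobs_in",
        "    mode http",
        f"    bind *:{listen_port}",
        f"    default_backend {backend_name}",
        "",
        f"backend {backend_name}",
        "    mode http",
        "    balance roundrobin",
    ]
    # Deduplicate and sort so the same set of instances always yields
    # the same text, which makes change detection a string comparison.
    for index, (host, port) in enumerate(sorted(set(endpoints)), start=1):
        lines.append(f"    server rest{index} {host}:{port} check")
    return "\n".join(lines) + "\n"


def needs_reload(current_config: str, endpoints: Iterable[Endpoint]) -> bool:
    """Return True when the live endpoints no longer match the config."""
    return render_haproxy_config(endpoints) != current_config


if __name__ == "__main__":
    # Assumed example addresses; real ones would come from the cluster.
    print(render_haproxy_config([("10.0.0.1", 8998), ("10.0.0.2", 8998)]))
```

Because every `server` line sits behind a single `bind` address, clients only ever see the edge node, which matches the isolation goal stated in the conclusions. A watcher could call `needs_reload` periodically and rewrite the file and reload HAProxy only when it returns `True`.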