Prediction-based Dynamic Resource Allocation Strategy for Spark Platform
Abstract Spark, a distributed in-memory computing platform, is a recent technological advance in massive data processing. Under dynamic resource allocation, Spark can add and remove executors according to the application's workload. However, removing an executor discards its cached data and causes unnecessary recomputation, a situation that is particularly common in Spark interactive data query applications. To minimize executor removal and thereby improve query efficiency, this paper designs and implements a prediction-based dynamic resource allocation strategy for Spark. The strategy builds a prediction model, based on Markov theory, for the duration of the inactive periods of Spark interactive data query applications, and uses the predicted duration to decide when to remove executors. Experimental results show that, compared with Spark's existing dynamic resource allocation strategy, the prediction-based strategy improves the efficiency of Spark interactive data queries by 59.34% on average.
Authors LIANG Yi (梁毅); CHENG Shi-fan (程石帆); CHANG Shi-lu (常世禄); LIU Fei (刘飞) (Computer Academy, Beijing University of Technology, Beijing 100124, China)
Source Software Guide (《软件导刊》), 2018, No. 12, pp. 43-47 (5 pages)
Keywords distributed computing platform; Spark; big data processing technology; dynamic resource allocation; data query
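
The abstract describes predicting the duration of an application's inactive periods with a Markov-based model and using the prediction to decide when executors are removed. The abstract does not give the model's details, so the following is only a minimal illustrative sketch under stated assumptions, not the authors' method: idle durations are discretized into a few hypothetical bins (`BIN_EDGES`), a first-order Markov chain counts transitions between consecutive bins, and the most likely next bin is mapped to an idle timeout through an assumed table (`BIN_UPPER`). The names `IdleDurationPredictor`, `to_bin`, and `idle_timeout`, and the 60-second fallback, are all made up for illustration.

```python
from collections import defaultdict

# Assumed bin edges in seconds: bins are <10 s, 10-60 s, 60-300 s, >=300 s.
BIN_EDGES = [10, 60, 300]
# Assumed timeout (seconds) to apply when a given bin is predicted.
BIN_UPPER = [10, 60, 300, 900]


def to_bin(duration_s):
    """Map an idle duration in seconds to the index of its bin."""
    for i, edge in enumerate(BIN_EDGES):
        if duration_s < edge:
            return i
    return len(BIN_EDGES)


class IdleDurationPredictor:
    """First-order Markov chain over discretized idle-period durations."""

    def __init__(self):
        # counts[a][b] = number of times bin a was followed by bin b.
        self.counts = defaultdict(lambda: defaultdict(int))
        self.last_bin = None

    def observe(self, duration_s):
        """Record one completed idle period and update transition counts."""
        current = to_bin(duration_s)
        if self.last_bin is not None:
            self.counts[self.last_bin][current] += 1
        self.last_bin = current

    def predict_next_bin(self):
        """Return the most likely next bin, or None without history."""
        row = self.counts.get(self.last_bin)
        if not row:
            return None
        # Ties go to the longer bin, which keeps executors (and caches) longer.
        return max(row, key=lambda b: (row[b], b))


def idle_timeout(predictor, default_s=60):
    """Translate the prediction into an executor idle timeout in seconds."""
    predicted = predictor.predict_next_bin()
    return default_s if predicted is None else BIN_UPPER[predicted]


if __name__ == "__main__":
    predictor = IdleDurationPredictor()
    for duration in [5, 120, 5, 120, 5]:  # made-up idle durations in seconds
        predictor.observe(duration)
    print(idle_timeout(predictor))
```

In this sketch a short idle period has always been followed by a longer one, so the predictor suggests waiting longer before releasing executors. How such a timeout would be wired into Spark's executor-removal logic is outside what the abstract states and is left out here.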