
A Spark Streaming Parameter Optimization Method Based on Deep Reinforcement Learning (cited by: 1)
Abstract: Spark Streaming is a mainstream open-source distributed stream-analysis framework, and optimizing its performance is an active research topic. Tuning the configuration parameters for a given business scenario is an important factor in that performance. Spark Streaming exposes more than 200 configurable parameters, so tuning demands considerable experience, and an unoptimized configuration degrades the execution performance of streaming jobs. To address this, the paper proposes DQN-SSPO, a Spark Streaming parameter optimization method based on deep reinforcement learning. The method recasts parameter configuration as the problem of obtaining the maximum return during training of a deep reinforcement learning model, and it introduces a weighted state-space transfer method that raises the probability of the model receiving high reward feedback during training. Experiments on three typical stream-analysis tasks show that, after parameter optimization, the total scheduling time of streaming jobs on Spark Streaming falls by 27.93% on average and the total processing time falls by 42% on average.
Authors: LIU Lu; SHEN Guo-wei; GUO Chun; CUI Yun-he; JIANG Chao-hui; WU Da-yong (College of Computer Science and Technology, Guizhou University, Guiyang 550025, China; Guizhou Provincial Key Laboratory of Software Engineering and Information Security, Guiyang 550025, China; Iflytek Co., Ltd., Hefei 230011, China)
Source: Computer and Modernization (《计算机与现代化》), 2021, No. 10, pp. 49-56, 62 (9 pages)
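Funding: National Natural Science Foundation of China (62062022); Guizhou Provincial Science and Technology Foundation (黔科合基础[2017]1051); National Key Research and Development Program of China (2018YFC0807701).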
Keywords: Spark Streaming; performance optimization; deep reinforcement learning; parameter tuning
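
The abstract frames parameter tuning as a reward-maximization problem. The following is a minimal sketch of that framing only, not the paper's DQN-SSPO: it swaps in plain tabular Q-learning for the deep Q-network, omits the weighted state-space transfer method, and replaces real job measurements with a synthetic cost function. The two tuned settings (the batch interval and `spark.streaming.blockInterval`) are real Spark Streaming knobs, but the value grids, the cost function, and all function names here are assumptions made for illustration.

```python
import random

# Hypothetical candidate values for two real Spark Streaming settings.
BATCH_INTERVALS_S = [1, 2, 5, 10]         # batch interval of the streaming context
BLOCK_INTERVALS_MS = [50, 100, 200, 400]  # spark.streaming.blockInterval

# Each action moves one parameter one step along its grid.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def synthetic_processing_time(state):
    """Assumed stand-in for a measured processing time (not from the paper)."""
    b, k = state
    return abs(BATCH_INTERVALS_S[b] - 5) + abs(BLOCK_INTERVALS_MS[k] - 100) / 100.0


def step(state, action):
    """Apply an action, clamp to the grid, and reward any drop in cost."""
    b = min(max(state[0] + action[0], 0), len(BATCH_INTERVALS_S) - 1)
    k = min(max(state[1] + action[1], 0), len(BLOCK_INTERVALS_MS) - 1)
    nxt = (b, k)
    reward = synthetic_processing_time(state) - synthetic_processing_time(nxt)
    return nxt, reward


def train(episodes=500, steps=30, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = (rng.randrange(len(BATCH_INTERVALS_S)),
                 rng.randrange(len(BLOCK_INTERVALS_MS)))
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
            nxt, r = step(state, ACTIONS[a])
            best_next = max(q.get((nxt, i), 0.0) for i in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (r + gamma * best_next - old)
            state = nxt
    return q


def greedy_config(q, start=(0, 0), steps=10):
    """Follow the learned policy and return the cheapest configuration seen."""
    state = best = start
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
        state, _ = step(state, ACTIONS[a])
        if synthetic_processing_time(state) < synthetic_processing_time(best):
            best = state
    return BATCH_INTERVALS_S[best[0]], BLOCK_INTERVALS_MS[best[1]]
```

Calling `greedy_config(train())` returns a `(batch interval in seconds, block interval in milliseconds)` pair. In a real setting the synthetic cost would presumably be replaced by scheduling and processing times measured from an actual streaming job, which is far slower to evaluate and is one reason a learned function approximator is attractive over a lookup table.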