Parallel Cleaning Strategy of Checkpoint Cache Based on Spark Utility Aware (Spark效用感知的检查点缓存并行清理策略)
Abstract: In Spark, the data cached for a checkpoint is cleaned only after the job has finished, and only when the programmer does so explicitly, so stale data can accumulate and occupy memory. This paper analyzes the checkpoint execution mechanism and shows, by modeling, that this cleaning method does not scale as the number of checkpoints grows. It proposes a utility entropy model for checkpoint caches that measures how well checkpoint caches match memory slots and, applying the principle of best utility matching, derives the optimal time to clean a checkpoint cache. The resulting utility-entropy-based parallel checkpoint cache cleaning (PCC) strategy optimizes memory use by making the cache cleaning time approximately equal to the time at which the checkpoint is written to HDFS. Experimental results show that, in a multi-job environment under fair scheduling, the execution efficiency of the unoptimized program degrades as the number of checkpoints increases. With the PCC strategy, program execution time, power consumption, and GC time are reduced by up to 10.1%, 9.5%, and 19.5%, respectively, which effectively improves execution efficiency when many checkpoints are used.
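The memory effect described in the abstract can be illustrated with a toy model. The sketch below is not the paper's PCC strategy or its utility entropy model; it only compares the peak memory held by checkpoint caches under two release policies: release at job end (the baseline the abstract criticizes) and release as soon as the checkpoint has been written. The class `Checkpoint`, the function `peak_cached_mb`, the sizes, and the time steps are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    """Hypothetical record of one checkpointed dataset (illustrative only)."""
    size_mb: int      # memory held by the cache backing this checkpoint
    cached_at: int    # time step at which the data is cached
    written_at: int   # time step at which the write to stable storage finishes


def peak_cached_mb(checkpoints, job_end, clean_at_write):
    """Peak memory (MB) held by checkpoint caches during steps 0..job_end-1.

    clean_at_write=False: every cache is held until the job ends (baseline).
    clean_at_write=True:  each cache is released once its checkpoint is written.
    """
    peak = 0
    for t in range(job_end):
        held = sum(
            cp.size_mb
            for cp in checkpoints
            if cp.cached_at <= t < (cp.written_at if clean_at_write else job_end)
        )
        peak = max(peak, held)
    return peak


# Three assumed 100 MB checkpoints taken one after another in a 10-step job.
checkpoints = [
    Checkpoint(size_mb=100, cached_at=0, written_at=2),
    Checkpoint(size_mb=100, cached_at=3, written_at=5),
    Checkpoint(size_mb=100, cached_at=6, written_at=8),
]

baseline_peak = peak_cached_mb(checkpoints, job_end=10, clean_at_write=False)
early_peak = peak_cached_mb(checkpoints, job_end=10, clean_at_write=True)
print(baseline_peak)  # 300: all three caches are still held near the end of the job
print(early_peak)     # 100: at most one cache is held at any time
```

In this toy setting the baseline peak grows linearly with the number of checkpoints, while early release keeps it bounded by the largest single cache, which is consistent with the scalability concern raised in the abstract.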
Authors: SONG Yi-Xin (宋一鑫); YU Jun-Yang (于俊洋); HE Xin (何欣); WANG Jin-Jiang (王锦江). Institute of Software, Henan University, Kaifeng 475004, China
Source: Computer Systems & Applications (《计算机系统应用》), 2022, Issue 4, pp. 253-259 (7 pages)
Funding: Henan Province Science and Technology Research and Development Project (212102210078).
Keywords: cache cleaning; Spark; utility entropy; failure checkpoint; parallel cleaning; big data
