Time Series Based Killer Task Online Recognition Approach (cited by: 2)
Abstract: By analyzing the failure counts and failure patterns of tasks in the Google cluster dataset, this paper identifies killer tasks, which fail frequently and consecutively. Killer tasks not only harm the reliability and availability of applications running on cloud computing systems, but also waste large amounts of resources and significantly increase scheduling load. Based on the resource usage patterns of killer tasks, the paper proposes a time series based online recognition approach that uses resource usage time series to recognize killer tasks accurately at an early stage of failure and to notify the cloud system so that it can take proactive failure recovery measures, thereby avoiding unnecessary rescheduling and resource waste. Experimental results show that the approach recognizes killer tasks with 98.5% precision within 3% of the failure duration on average, while saving 96.75% of system resources.
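The abstract characterizes killer tasks by two properties: a high failure count and consecutive failures. As a minimal sketch of that characterization only (not the paper's online time series recognizer, whose details the abstract does not give), the following flags tasks in a chronological event log. The function name, the `(task_id, outcome)` event format, the `"FAIL"` label, and both thresholds are hypothetical choices made for illustration.

```python
from collections import defaultdict

# Hypothetical thresholds; the abstract does not state the values used.
MIN_FAILURES = 5       # minimum total failures to count as "high frequency"
MIN_CONSECUTIVE = 3    # minimum run of back-to-back failures


def find_killer_tasks(events, min_failures=MIN_FAILURES,
                      min_consecutive=MIN_CONSECUTIVE):
    """Return sorted ids of tasks that fail both often and consecutively.

    `events` is an iterable of (task_id, outcome) pairs in chronological
    order; an outcome of "FAIL" marks a failure (assumed format).
    """
    total = defaultdict(int)    # total failures per task
    run = defaultdict(int)      # current streak of consecutive failures
    longest = defaultdict(int)  # longest streak seen so far
    for task_id, outcome in events:
        if outcome == "FAIL":
            total[task_id] += 1
            run[task_id] += 1
            longest[task_id] = max(longest[task_id], run[task_id])
        else:
            run[task_id] = 0    # any non-failure outcome breaks the streak
    return sorted(
        t for t in total
        if total[t] >= min_failures and longest[t] >= min_consecutive
    )
```

A labelling pass of this kind works only after the failures have already happened. The approach described in the abstract instead aims to recognize such tasks early, from their resource usage time series, so that repeated rescheduling can be avoided.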
Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2017, No. 4, pp. 43-46 (4 pages)
Funding: Supported by the Key Project of the Shenzhen Science and Technology Program (JSGG20140516162852628)
Keywords: Cloud system; Killer tasks; Online recognition; Time series; Resource usage pattern; Failure frequency