Deep Reinforcement Learning Based Coflow Scheduling in Data Center Networks (cited by: 8)

Abstract: Minimizing the average completion time of coflows (groups of semantically related flows) is one of the challenges of traffic management in data center networks. Inspired by recent research progress in deep reinforcement learning, a branch of artificial intelligence, this paper proposes a novel coflow scheduling mechanism. It casts the bandwidth-constrained coflow scheduling problem as a continuous learning process: by learning from previous scheduling decisions, the agent converges toward the best schedule. Backfilling and limited multiplexing mechanisms are introduced to keep the system work-conserving and starvation-free. Simulation results show that, under different network loads, the proposed mechanism achieves a lower average coflow completion time than other scheduling mechanisms; in particular, when the network load is heavy, it improves performance by about 50% over the state-of-the-art scheduler.
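The abstract gives only the outline of the mechanism. As an illustration of the idea it describes (casting coflow scheduling as a learning problem while keeping the port work-conserving via backfilling under limited multiplexing), here is a minimal, self-contained Python sketch. Everything in it is an assumption made for illustration: the single-bottleneck-port model, the tabular Q-learning agent standing in for the paper's deep network, the state/action/reward encoding, the toy coflow sizes, and names such as QScheduler, step, and MAX_MUX are hypothetical, not the authors' implementation.

import random
from collections import defaultdict

EPOCH = 1.0     # length of one scheduling interval (arbitrary units, assumed)
LINK_BW = 1.0   # normalized bottleneck-port bandwidth (assumed)
MAX_MUX = 2     # limited multiplexing: at most 2 coflows share the port

def size_bucket(remaining):
    """Discretize remaining coflow size into a small state space."""
    return min(int(remaining), 9)

class QScheduler:
    """Tabular Q-learning stand-in for the paper's deep RL agent."""
    def __init__(self, eps=0.1, alpha=0.5, gamma=0.9):
        self.q = defaultdict(float)   # Q[(state, action)] -> value
        self.eps, self.alpha, self.gamma = eps, alpha, gamma

    def act(self, state, actions):
        """Epsilon-greedy choice of which coflow to place at the head."""
        if random.random() < self.eps:
            return random.choice(actions)
        return max(actions, key=lambda a: self.q[(state, a)])

    def learn(self, s, a, r, s2, next_actions):
        """One-step Q-learning update."""
        best = max((self.q[(s2, a2)] for a2 in next_actions), default=0.0)
        self.q[(s, a)] += self.alpha * (r + self.gamma * best - self.q[(s, a)])

def step(coflows, agent):
    """One scheduling epoch on a single bottleneck port."""
    active = [i for i, c in enumerate(coflows) if c > 0]
    state = tuple(sorted(size_bucket(coflows[i]) for i in active))
    head = agent.act(state, active)
    # Backfilling with limited multiplexing: the chosen coflow leads, and up
    # to MAX_MUX-1 others backfill, so the port never idles (work-conserving)
    # while every waiting coflow eventually reaches the head (no starvation).
    served = [head] + [i for i in active if i != head][:MAX_MUX - 1]
    share = LINK_BW / len(served)
    for i in served:
        coflows[i] = max(0.0, coflows[i] - share * EPOCH)
    # Each still-active coflow costs one unit of waiting per epoch, so
    # maximizing reward minimizes the sum (hence average) of completion times.
    reward = -sum(1 for c in coflows if c > 0)
    return state, head, reward, active

def train(episodes=2000, sizes=(3.0, 1.0, 2.0)):
    agent = QScheduler()
    for _ in range(episodes):
        coflows = list(sizes)   # toy coflows: remaining volume on one port
        prev = None
        while any(c > 0 for c in coflows):
            s, a, r, acts = step(coflows, agent)
            if prev is not None:
                agent.learn(*prev, s, acts)
            prev = (s, a, r)
        agent.learn(*prev, (), [])   # terminal update
    return agent

if __name__ == "__main__":
    train()

The reward of minus one per active coflow per epoch makes the accumulated cost equal to the sum of coflow completion times, which is the quantity the paper minimizes; a deep network would replace the Q-table when the real state space (coflow sizes, widths, per-port loads) is too large to enumerate.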
Authors: MA Teng, HU Yu-xiang, ZHANG Xiao-hui (National Digital Switching System Engineering & Technology Research Center, Zhengzhou, Henan 450002, China)
Source: Acta Electronica Sinica (《电子学报》; indexed in EI, CAS, CSCD, Peking University Core), 2018, No. 7, pp. 1617-1624 (8 pages)
Funding: National 973 Basic Research Program of China (No. 2013CB329104); National 863 High-Tech Research and Development Program of China (No. 2013AA013505)
Keywords: data center network; coflow; flow scheduling