Online Pricing Mechanism Design for Inter-Datacenter Network Bandwidth: A Reinforcement Learning-Based Approach (cited by: 3)

Reinforcement Learning-Based Online Pricing for Inter-Datacenter Bandwidth
Abstract: With the rapid development of cloud services, bandwidth between datacenters has become a valuable resource. Currently, bandwidth is sold at a fixed price and allocated by traffic engineering, which gives no guarantee on when a transfer will finish. However, most transfer tasks have strict deadlines, and failing to finish them on time can cause substantial losses to customers. In addition, fixed pricing leads to low network utilization and does not maximize the service provider's long-term cumulative revenue. This paper therefore designs online pricing mechanisms for inter-datacenter transfers that maximize the provider's revenue while guaranteeing the truthfulness of customers' transfer requests. We first propose the Price List-Based Online Pricing Mechanism (PLPM): the provider shows customers a price list in real time, and customers select specific transfer time slots and amounts to meet their requirements. PLPM updates the price list using reinforcement learning, so that bandwidth is priced differently across time slots and capacity states. We then propose the Request-Based Online Pricing Mechanism (RPM), in which customers can specify their own transfer type, transfer amount, and deadline. RPM learns customers' valuations online to achieve revenue maximization and truthfulness against rational customers, and is paired with a priority-based bandwidth allocation rule to improve network utilization. Finally, experiments show that the proposed mechanisms achieve substantially higher cumulative revenue and network bandwidth utilization than fixed pricing.
Authors: NIU Chao-Yue (牛超越); CHEN Pei-Yu (陈培煜); ZHANG Jia-Yi (张嘉懿); WU Fan (吴帆); CHEN Gui-Hai (陈贵海) (Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240)
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CAS, CSCD, and the PKU Core Journals list; 2022, No. 5, pp. 1068-1086 (19 pages)
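Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2018AAA0100900); National Natural Science Foundation of China (62025204, 61972254); Alibaba Innovative Research Program.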
Keywords: online pricing; inter-datacenter bandwidth; reinforcement learning; revenue maximization; truthfulness
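
The abstract's idea of learning a price list online can be illustrated with a toy example. The sketch below is not the paper's PLPM or RPM; it is a simple epsilon-greedy bandit that learns one posted price per time slot, under assumptions invented here for illustration: two time slots, three candidate price levels, and customers with fixed hidden valuations who buy whenever the posted price does not exceed their valuation. The names `PriceListLearner` and `simulate` are hypothetical.

```python
import random


class PriceListLearner:
    """Toy epsilon-greedy learner: one revenue estimate per (slot, price level).

    Illustrative only; this is NOT the mechanism from the paper.
    """

    def __init__(self, num_slots, price_levels, epsilon=0.1, seed=0):
        self.price_levels = list(price_levels)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        k = len(self.price_levels)
        # q[slot][i]: running average revenue from posting price_levels[i]
        self.q = [[0.0] * k for _ in range(num_slots)]
        # n[slot][i]: how many times that price was posted in that slot
        self.n = [[0] * k for _ in range(num_slots)]

    def _best(self, slot):
        row = self.q[slot]
        return max(range(len(row)), key=row.__getitem__)

    def choose(self, slot):
        # Explore a random price with probability epsilon, else exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.price_levels))
        return self._best(slot)

    def update(self, slot, idx, revenue):
        # Incremental mean of the observed revenue.
        self.n[slot][idx] += 1
        self.q[slot][idx] += (revenue - self.q[slot][idx]) / self.n[slot][idx]

    def price_list(self):
        # Current greedy price for every slot.
        return [self.price_levels[self._best(s)] for s in range(len(self.q))]


def simulate(rounds=5000, seed=1):
    rng = random.Random(seed)
    learner = PriceListLearner(num_slots=2, price_levels=[1.0, 2.0, 3.0], seed=seed)
    # Assumed hidden valuations: slot 0 is urgent (3.5), slot 1 is not (1.5).
    valuations = [3.5, 1.5]
    total = 0.0
    for _ in range(rounds):
        slot = rng.randrange(2)
        idx = learner.choose(slot)
        price = learner.price_levels[idx]
        # The customer accepts the posted price only if it is affordable.
        revenue = price if valuations[slot] >= price else 0.0
        learner.update(slot, idx, revenue)
        total += revenue
    return learner.price_list(), total


if __name__ == "__main__":
    prices, total = simulate()
    print("learned price list:", prices, "cumulative revenue:", total)
```

In this toy setting the learner ends up charging more in the slot where customers value bandwidth more, which is the basic intuition behind pricing different time slots differently. Capacity constraints, deadlines, and strategic behavior, which the abstract names as central concerns, are deliberately left out.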