Mean-Field Multi-agent Reinforcement Learning Order Dispatch Algorithm with Awareness of Global Supply-Demand Dynamics

Abstract: This paper proposes an order dispatch algorithm for online ride-hailing platforms, based on mean-field multi-agent reinforcement learning, that is aware of global supply-demand dynamics. The algorithm improves cooperation among agents within a local area by combining multi-agent reinforcement learning with mean-field theory, and it improves the agents' ability to perceive and optimize the global supply-demand distribution by injecting information about the dynamic distribution of supply and demand across the whole area. A simulator driven by real historical data was built for training and evaluating the algorithms. Experiments in two scenarios, a whole day and rush hour, show that the proposed algorithm significantly outperforms existing order dispatch algorithms on two key metrics: drivers' accumulated income and order response rate. These results support the effectiveness of the proposed algorithm.
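The abstract does not give the algorithm's equations, so the following is only a rough sketch of the two ideas it names. It assumes the generic mean-field Q-learning setup, in which each agent summarizes its neighbors by the average of their one-hot actions, and it assumes the global context is a per-cell supply-demand gap appended to the local observation. The function names, the grid-cell representation, and the Boltzmann policy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def mean_action(neighbor_actions, n_actions):
    """Mean-field summary: average of the neighbors' one-hot actions.

    Assumption: actions are integer indices in [0, n_actions).
    Returns a zero vector when the agent has no neighbors.
    """
    idx = np.asarray(neighbor_actions, dtype=int)
    if idx.size == 0:
        return np.zeros(n_actions)
    one_hot = np.eye(n_actions)[idx]
    return one_hot.mean(axis=0)


def build_state(local_obs, supply, demand):
    """Append a global supply-demand gap (one entry per grid cell)
    to the agent's local observation. The gap encoding is hypothetical."""
    gap = np.asarray(demand, dtype=float) - np.asarray(supply, dtype=float)
    return np.concatenate([np.asarray(local_obs, dtype=float), gap])


def boltzmann_policy(q_values, temperature=1.0):
    """Softmax over Q-values, a common policy choice in mean-field Q-learning."""
    z = np.asarray(q_values, dtype=float) / temperature
    z = z - z.max()  # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()


# Example: four neighbors chose actions 0, 2, 2, 1 out of three options.
a_bar = mean_action([0, 2, 2, 1], n_actions=3)
state = build_state(local_obs=[1.0], supply=[3, 1], demand=[2, 4])
probs = boltzmann_policy([0.0, 0.0, 0.0])
```

In such a setup, a Q-network would take `state` and `a_bar` as inputs, so each agent conditions on both its neighbors' average behavior and the citywide supply-demand picture.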

Authors: SONG Wang (宋旺); HU Xiang (胡祥); ZHANG Yuhui (张玉辉); WEI Wenjiang (卫文江); ZHOU Yashi (周雅诗); KANG Ao (康傲) (School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China)

Affiliation: School of Control and Computer Engineering, North China Electric Power University

Source: Journal of Data Acquisition and Processing (《数据采集与处理》), 2023, No. 3, pp. 652-664 (13 pages). Indexed in CSCD and the Peking University Core Journals list.

Funding: National Natural Science Foundation of China (52078212).

Keywords: multi-agent reinforcement learning; mean-field; awareness of global supply-demand dynamics; online ride-hailing platform; order dispatch

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
