Reinforcement Learning-Based Dynamic Order Recommendation for On-Demand Food Delivery


Abstract: On-demand food delivery (OFD) is becoming increasingly popular in modern society. As a core order assignment mode in OFD scenarios, order recommendation directly influences the delivery efficiency of the platform and the delivery experience of riders. This paper addresses the dynamic nature of the order recommendation problem and proposes a reinforcement learning solution method. An actor-critic network based on long short-term memory (LSTM) units is designed to deal with order-grabbing conflicts between different riders. In addition, three rider sequencing rules are proposed to match different time steps of the LSTM unit with different riders. To test the performance of the proposed method, extensive experiments are conducted on real data from the Meituan delivery platform. The results demonstrate that the proposed reinforcement learning-based order recommendation method can significantly increase the number of grabbed orders and reduce the number of order-grabbing conflicts, resulting in better delivery efficiency and experience for the platform and riders.

Authors: Xing Wang, Ling Wang, Chenxin Dong, Hao Ren, Ke Xing

Affiliations: Department of Automation; School of Mechanical and Automotive Engineering; Meituan

Source: Tsinghua Science and Technology (indexed in SCIE, EI, CAS, CSCD), 2024, Issue 2, pp. 356-367 (12 pages in total). Chinese journal title: 清华大学学报（自然科学版）（英文版）

Funding: Supported in part by the National Natural Science Foundation of China (No. 62273193), the Tsinghua University-Meituan Joint Institute for Digital Life, and the Research and Development Project of CRSC Research & Design Institute Group Co., Ltd.

Keywords: on-demand food delivery; order recommendation; reinforcement learning; actor-critic network; long short-term memory

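Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; TP391.41 [Automation and Computer Technology: Computer Application Technology]; J218.2 [Arts: Fine Arts]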
