Abstract
To address the uncertainty in order acceptance decisions at make-to-order (MTO) companies, a semi-Markov decision process based order acceptance model (SMDP-OA model) is proposed from the perspective of revenue management and on the basis of reinforcement learning. The model takes into account not only the production cost, delay penalty cost, and rejection cost of incoming orders, but also the customer level. On this basis, a SMART-based method for computing the optimal order acceptance policy is presented, aiming to maximize the long-term profit of MTO companies. Simulation experiments indicate that the order acceptance policy obtained with the SMART algorithm outperforms the policy obtained with the first-come-first-serve (FCFS) method. Moreover, the simulation experiments and data analysis that consider customer level confirm the necessity and importance of incorporating the customer level factor when determining the optimal order acceptance policy.
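The abstract does not spell out the update rule, so the following is only a minimal sketch of the generic SMART (Semi-Markov Average Reward Technique) update as it is usually stated in the reinforcement learning literature, not the paper's own formulation. The state labels, the two-action set `ACTIONS`, and the function names are hypothetical; in the paper's setting the reward would presumably combine order revenue with the production, delay, and rejection costs.

```python
from collections import defaultdict

# Hypothetical action set for an incoming order (an assumption, not from the paper).
ACTIONS = ("reject", "accept")


def smart_update(q, state, action, reward, sojourn, next_state, rho, alpha):
    """Apply one SMART-style update to Q(state, action) and return the new value.

    reward  : immediate profit earned during the transition
    sojourn : time spent in the transition (the semi-Markov part)
    rho     : current estimate of the average profit per unit time
    alpha   : learning rate in (0, 1]
    """
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    # Average-reward target: profit minus the "expected" profit rho * sojourn.
    target = reward - rho * sojourn + best_next
    q[(state, action)] = (1 - alpha) * q[(state, action)] + alpha * target
    return q[(state, action)]


def update_rho(total_reward, total_time, reward, sojourn):
    """Update the running totals and return (total_reward, total_time, rho).

    In SMART this is typically done only after greedy (non-exploratory) actions.
    """
    total_reward += reward
    total_time += sojourn
    return total_reward, total_time, total_reward / total_time


q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
new_q = smart_update(q_table, "s0", "accept", reward=10.0, sojourn=2.0,
                     next_state="s1", rho=1.0, alpha=0.5)
totals = update_rho(0.0, 0.0, reward=10.0, sojourn=2.0)
```

An FCFS baseline of the kind the abstract compares against would accept orders in arrival order whenever capacity allows, whereas a learned policy of this form can reject an order in a given state if accepting it lowers the estimated average profit.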
Source
Systems Engineering-Theory & Practice (《系统工程理论与实践》)
EI
CSSCI
CSCD
PKU Core
2014, No. 12, pp. 3121-3129 (9 pages)
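Funding
National Natural Science Foundation of China (71201020)
Fundamental Research Funds for the Central Universities (N120406002)
China Postdoctoral Science Foundation (2013M540233)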
Keywords
revenue management
order acceptance
SMART algorithm
average profit
reinforcement learning