Jointed decisions for perishable product with Markov decision process (cited by: 3)
Abstract: To solve the joint decision problem of ordering, disposing of old products, and pricing that a retailer faces when selling perishable products with a multi-period shelf life over an infinite horizon, a model is built as a Markov decision process and the optimal policy is computed with the Q-learning algorithm. The optimal policy specifies the action to take in each state so as to maximize the expected total discounted profit from the current period onward over the infinite horizon. The algorithm's iterative update formula keeps refining the policy by interacting with the environment and using the feedback it receives. With finite, discrete state and action sets, the algorithm obtains a stationary optimal policy after sufficiently long learning, even when the state transition probabilities and the expected one-period profits are unknown. The study finds that changes in the individual parameters have different and significant effects on the characteristics of each component of the joint policy; this result offers theoretical support and a line of approach for research on heuristic policies.
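The abstract gives no model details, so the following is only a minimal sketch of tabular Q-learning on a toy perishable-inventory problem, meant to illustrate how a policy can be learned from sampled transitions without knowing the transition probabilities or expected profits. Everything in it is an assumption for illustration and not taken from the paper: the two-period shelf life, the half-price sale of old units, the uniform demand, all cost and price values, and the names `step`, `q_learning`, and `greedy_policy`.

```python
import random
from itertools import product

# All numbers below are illustrative assumptions, not values from the paper.
MAX_ORDER = 3                       # largest order quantity per period
PRICES = (4.0, 6.0)                 # candidate prices for fresh units
MAX_DEMAND = {4.0: 4, 6.0: 2}       # demand ~ Uniform{0..max}, lower at the higher price
OLD_DISCOUNT = 0.5                  # old units sell at this fraction of the price
UNIT_COST = 2.0                     # purchase cost per ordered unit
DISPOSAL_COST = 0.5                 # cost per old unit discarded before selling
GAMMA, ALPHA, EPSILON = 0.9, 0.1, 0.1   # discount factor, learning rate, exploration rate

# State: number of old units on hand (they expire at the end of the period).
STATES = list(range(MAX_ORDER + 1))
# Action: (order quantity, price, dispose old units first?)
ACTIONS = list(product(range(MAX_ORDER + 1), PRICES, (False, True)))


def step(old, action, rng):
    """Simulate one period and return (next_state, profit).

    The learner only ever sees sampled outcomes of this function.
    """
    order, price, dispose_old = action
    profit = -UNIT_COST * order
    if dispose_old:
        profit -= DISPOSAL_COST * old
        old = 0
    demand = rng.randint(0, MAX_DEMAND[price])
    sold_old = min(old, demand)                  # cheaper old units are bought first
    sold_fresh = min(order, demand - sold_old)
    profit += OLD_DISCOUNT * price * sold_old + price * sold_fresh
    next_old = order - sold_fresh                # unsold fresh units age; unsold old units expire
    return next_old, profit


def q_learning(steps=50_000, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = 0
    for _ in range(steps):
        if rng.random() < EPSILON:
            action = rng.choice(ACTIONS)         # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])   # exploit
        next_state, profit = step(state, action, rng)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        q[(state, action)] += ALPHA * (profit + GAMMA * best_next - q[(state, action)])
        state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


if __name__ == "__main__":
    for state, action in greedy_policy(q_learning()).items():
        print(f"old units = {state}: (order, price, dispose_old) = {action}")
```

The learned table maps each state to a joint (order, price, disposal) action, which mirrors the structure of the joint policy described in the abstract; the constant learning rate and exploration rate are simplifications, and a decaying schedule would be the usual choice when convergence to a stationary policy matters.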
Source: Computer Integrated Manufacturing Systems (《计算机集成制造系统》), indexed in EI, CSCD, and the Peking University Core Journals list; 2017, No. 1, pp. 144-153 (10 pages)
Funding: Natural Science Foundation of Guangdong Province (2016Z00052)
Keywords: perishable product; Markov decision process; Q-learning algorithm; ordering decisions; pricing decisions

