Optimization Algorithms for Job Shop Scheduling Problems Based on Correction Mechanisms and Reinforcement Learning

Cited by: 2
Abstract: In recent years, research on applying deep reinforcement learning to the job shop scheduling problem has concentrated on constructive methods, which treat scheduling as a sequential decision problem and build a complete solution by selecting scheduling nodes step by step. Although this approach has produced notable results, it still suffers from difficult reward design and limited solution quality, which constrain its further development. To address these issues, we design a constructive reinforcement learning framework based on graph neural networks and the proximal policy optimization algorithm. To handle the suboptimal solutions caused by the mismatch between training and testing data distributions, we also propose a search correction mechanism with a modified swap operator, which searches the neighborhood of a known solution using a Monte Carlo tree. The algorithm is evaluated on public and generated datasets. Experimental results show that it outperforms the current best reinforcement learning framework on small and medium-sized instances. It retains the fast solving speed of the constructive reinforcement learning framework while using the correction mechanism to mitigate suboptimal choices and reduce the maximum completion time of the instances.
Authors: MIAO Kuan (苗宽); LI Chongshou (李崇寿) (School of Artificial Intelligence and Computing, Southwest Jiaotong University, Chengdu 610097, China; SWJTU-Leeds Joint School, Southwest Jiaotong University, Chengdu 610097, China)
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2023, No. 6, pp. 274-282 (9 pages)
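Funding: National Natural Science Foundation of China (62202395); Natural Science Foundation of Sichuan Province (2022NSFSC0930); Fundamental Research Funds for the Central Universities (2682022CX067); Sichuan Province Key Research and Development Program (2022YFG0028).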
Keywords: Scheduling; Job shop scheduling problems; Reinforcement learning; Modified search algorithms
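The abstract describes correcting a constructed schedule with a swap operator so that the maximum completion time (makespan) decreases. The record gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method: it swaps in a plain greedy adjacent-swap local search for the Monte Carlo tree search, and it assumes a job-repetition encoding of a schedule. The names `makespan` and `swap_correction` and the toy instance are hypothetical.

```python
from typing import Dict, List, Tuple

# Each job is an ordered list of operations: (machine id, processing time).
Job = List[Tuple[int, int]]


def makespan(jobs: List[Job], sequence: List[int]) -> int:
    """Decode a job-repetition sequence into a schedule and return its makespan.

    The k-th occurrence of job j in `sequence` stands for the k-th operation
    of job j. Each operation starts as early as its job and machine allow.
    """
    next_op = [0] * len(jobs)            # index of the next unscheduled op per job
    job_ready = [0] * len(jobs)          # time at which each job is free again
    machine_ready: Dict[int, int] = {}   # time at which each machine is free again
    for j in sequence:
        machine, duration = jobs[j][next_op[j]]
        start = max(job_ready[j], machine_ready.get(machine, 0))
        job_ready[j] = machine_ready[machine] = start + duration
        next_op[j] += 1
    return max(job_ready)


def swap_correction(jobs: List[Job], sequence: List[int]) -> Tuple[List[int], int]:
    """Assumed stand-in for a correction step: greedy adjacent-swap local search.

    Repeatedly swaps neighbouring entries that belong to different jobs and
    keeps a swap only if it strictly shortens the makespan.
    """
    best = list(sequence)
    best_cost = makespan(jobs, best)
    improved = True
    while improved:
        improved = False
        for i in range(len(best) - 1):
            if best[i] == best[i + 1]:
                continue  # swapping two entries of the same job changes nothing
            candidate = best[:]
            candidate[i], candidate[i + 1] = candidate[i + 1], candidate[i]
            cost = makespan(jobs, candidate)
            if cost < best_cost:
                best, best_cost = candidate, cost
                improved = True
    return best, best_cost


# Toy instance (hypothetical): 2 jobs, 2 machines.
jobs = [
    [(0, 3), (1, 2)],  # job 0: machine 0 for 3, then machine 1 for 2
    [(1, 2), (0, 4)],  # job 1: machine 1 for 2, then machine 0 for 4
]
initial = [0, 0, 1, 1]  # a deliberately poor constructed solution
print(makespan(jobs, initial))         # 11
print(swap_correction(jobs, initial))  # ([0, 1, 0, 1], 7)
```

In this toy case a single swap interleaves the two jobs and brings the makespan from 11 down to 7, which equals the total load on machine 0 and therefore cannot be improved further. A learned constructive policy would supply the initial sequence in place of the hand-written one used here.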