Reinforcement learning-based algorithm for the multi-skill project scheduling problem


Abstract: Multi-skill project scheduling is subject to combinatorial explosion, which makes the multi-skill project scheduling problem (MSPSP) far more complex than the traditional single-skill project scheduling problem, and heuristics and meta-heuristics each have shortcomings when solving it. Based on the characteristics of project scheduling and the algorithmic logic of reinforcement learning (RL), this paper therefore designs an RL-based algorithm for multi-skill project scheduling. First, the scheduling process is modeled as a sequential decision process that satisfies the Markov property, that is, a Markov decision process (MDP), and a double-agent mechanism is designed according to that decision process. Then, state integration and action decomposition are used to reduce the difficulty of learning the value function. Finally, to further improve performance, a skill conflation method tailored to the multi-skill nature of the resources is designed, which significantly reduces the time complexity of the resource allocation algorithm. Comparative experiments against heuristics show that the proposed RL algorithm achieves better solution quality, and comparative experiments against meta-heuristics show that it is more stable and runs faster.
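To make the sequential-decision framing in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a hypothetical three-activity instance (`DURATION`, `PREDS`, `REQUIRES`, `SKILLS` are invented for illustration), a parallel scheduling loop in which a pluggable `policy` chooses among eligible activities at each decision time (the point where a learned agent could be substituted), and a standard augmenting-path bipartite matching for assigning multi-skilled workers. The paper's double-agent mechanism, state integration, action decomposition, and skill conflation method are not reproduced here.

```python
from typing import Callable, Dict, List, Optional, Set

# Hypothetical toy instance (illustrative only, not taken from the paper).
DURATION: Dict[str, int] = {"A": 2, "B": 3, "C": 2}
PREDS: Dict[str, List[str]] = {"A": [], "B": [], "C": ["A", "B"]}
# Each listed skill unit must be covered by one distinct worker.
REQUIRES: Dict[str, List[str]] = {"A": ["design"], "B": ["code"], "C": ["code", "test"]}
SKILLS: Dict[str, Set[str]] = {"w1": {"design", "code"}, "w2": {"code", "test"}}


def assign_workers(needs: List[str], free: Set[str]) -> Optional[Dict[int, str]]:
    """Match every required skill unit to a distinct free worker.

    Uses augmenting paths (standard bipartite matching); returns a map
    from skill-unit index to worker, or None if no full assignment exists.
    """
    match: Dict[str, int] = {}  # worker -> index of the skill unit it covers

    def try_slot(i: int, seen: Set[str]) -> bool:
        for w in sorted(free):
            if needs[i] in SKILLS[w] and w not in seen:
                seen.add(w)
                # Take a free worker, or re-route the unit this worker covered.
                if w not in match or try_slot(match[w], seen):
                    match[w] = i
                    return True
        return False

    for i in range(len(needs)):
        if not try_slot(i, set()):
            return None
    return {i: w for w, i in match.items()}


def schedule(policy: Callable[[int, List[str]], str]) -> Dict[str, int]:
    """Parallel scheduling loop: one decision per step, returns start times.

    Assumes every activity can be staffed when all workers are free.
    """
    start: Dict[str, int] = {}
    finish: Dict[str, int] = {}
    busy_until: Dict[str, int] = {}
    t = 0
    while len(start) < len(DURATION):
        free = {w for w in SKILLS if busy_until.get(w, 0) <= t}
        eligible = [
            a for a in DURATION
            if a not in start
            and all(p in finish and finish[p] <= t for p in PREDS[a])
        ]
        candidates = [a for a in eligible
                      if assign_workers(REQUIRES[a], free) is not None]
        if candidates:
            a = policy(t, candidates)  # the decision an agent would make
            team = assign_workers(REQUIRES[a], free)
            start[a], finish[a] = t, t + DURATION[a]
            for w in team.values():
                busy_until[w] = finish[a]
        else:
            # Nothing can start now: jump to the next completion time.
            t = min(f for f in finish.values() if f > t)
    return start


def shortest_first(t: int, candidates: List[str]) -> str:
    """Fixed priority rule standing in for a learned policy."""
    return min(candidates, key=lambda a: (DURATION[a], a))


starts = schedule(shortest_first)
makespan = max(starts[a] + DURATION[a] for a in starts)
```

On this toy instance the sketch starts A and B at time 0 and C at time 3, for a makespan of 5. Replacing `shortest_first` with a function that scores candidates using learned values is one way such a loop could host a reinforcement learning agent; how the paper actually defines states, actions, and rewards is not specified in the abstract.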

Authors: HU Zhen-tao; CUI Nan-fang; HU Xue-jun; LEI Xiao-qi (School of Management, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China; Business School, Hunan University, Changsha, Hunan 410082, China)

Affiliations: School of Management, Huazhong University of Science and Technology; Business School, Hunan University

Source: Control Theory & Applications (控制理论与应用), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2024, No. 3, pp. 502-511 (10 pages)

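Funding: Supported by the National Natural Science Foundation of China (71971094, 71701067, 72071075) and the Natural Science Foundation of Hunan Province (2019JJ50039).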

Keywords: multi-skill resource; project scheduling; intelligence algorithm; reinforcement learning; parallel scheduling (PSGS)

Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

