Funding: Supported by the National Key R&D Program of China (Grant No. 2020YFB1713300), the Joint Open Fund of Wuhan Textile University (Grant No. KT202201005), and the Foundation of Key Laboratory of Advanced Manufacturing Technology, Ministry of Education, Guizhou University (Grant No. GZUAMT2021KF11).
Abstract: The growing complexity of modern manufacturing and the trend toward smart manufacturing impose new requirements on scheduling methods, such as self-regulation and self-learning capabilities. Traditional scheduling methods cannot meet these needs because of their rigidity. Self-learning is an inherent ability of reinforcement learning (RL) algorithms, stemming from their continuous-learning and trial-and-error characteristics. Self-regulation of scheduling can be enabled by the emerging digital twin (DT) technology through its virtual-real mapping and mutual control characteristics. This paper proposes a DT-enabled adaptive scheduling method based on an improved proximal policy optimization RL algorithm, called the explicit exploration and asynchronous update proximal policy optimization algorithm (E2APPO). First, a DT-enabled scheduling system framework is designed to enhance the interaction between the virtual and the physical job shops, strengthening the self-regulation of the scheduling model. Second, an innovative action selection strategy and an asynchronous update mechanism are proposed to improve the optimization algorithm and strengthen the self-learning ability of the scheduling model. Finally, the proposed scheduling model is extensively tested against heuristic and meta-heuristic algorithms, such as well-known scheduling rules and genetic algorithms, as well as other existing RL-based scheduling methods. The comparisons demonstrate the effectiveness of the proposed DT-enabled adaptive scheduling strategy and its advantage over these baselines.
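For readers unfamiliar with the baseline that E2APPO improves on, the following is a minimal sketch of the standard clipped surrogate objective of proximal policy optimization. It is not the paper's E2APPO: the abstract does not specify the explicit exploration strategy or the asynchronous update mechanism, so neither is shown. The function name `ppo_clip_objective`, its parameters, and the default `epsilon=0.2` are illustrative assumptions.

```python
def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Mean clipped surrogate objective of standard PPO (hypothetical helper).

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates, one per sample
    epsilon:    clipping range (0.2 is a common choice, assumed here)
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the ratio to [1 - epsilon, 1 + epsilon].
        clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

The clipping removes the incentive to move the new policy far from the old one in a single update, which is the property that makes PPO a common starting point for scheduling agents trained by trial and error.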