Abstract
Yard cranes are the core handling equipment in the yard of an automated container terminal, and scheduling them well is key to improving container-handling efficiency. The yard crane scheduling problem exhibits complex spatio-temporal coupling and is highly dynamic. To address it, a mathematical programming model is built with the objective of minimizing the waiting time of automated guided vehicles (AGVs) and external container trucks, and a novel deep reinforcement learning method is proposed to solve it. The algorithm defines an agent that closely reflects the real yard operating environment, and in the interaction between the agent and the environment it combines a pointer network, an attention mechanism, and an actor-critic (A-C) architecture to improve its ability to extract hidden patterns from the state. Experiments are carried out on instances of different scales generated from actual data of the Yangshan Phase IV Automated Terminal. The proposed algorithm produces near-optimal yard crane schedules in a relatively short time and performs about 17% better than the heuristic rule algorithms used for comparison. The results indicate that the proposed scheduling method is effective, outperforms these baselines, and can provide dynamic decision support for yard operations in practice.
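The abstract names a pointer network and an attention mechanism but gives no architectural details. The following is only a minimal sketch of the general idea, not the authors' network: a pointer-style attention step scores each pending task against the crane's state and turns the scores into a probability distribution, with already served tasks masked out. The function name, the feature vectors, and the scaled dot-product scoring are all assumptions made for illustration.

```python
import math


def pointer_attention(query, task_features, done_mask):
    """Return a probability distribution over tasks (pointer-style attention).

    query         -- crane state vector (hypothetical encoding)
    task_features -- one feature vector per task (hypothetical encoding)
    done_mask     -- True for tasks already served; they get probability 0
    """
    if all(done_mask):
        raise ValueError("no pending task left to point at")

    scale = math.sqrt(len(query))
    scores = []
    for feat, done in zip(task_features, done_mask):
        if done:
            # Masked tasks can never be selected.
            scores.append(-math.inf)
        else:
            # Scaled dot product between crane state and task features.
            scores.append(sum(q * f for q, f in zip(query, feat)) / scale)

    # Numerically stable softmax over the (masked) scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative data only: three tasks, the third one already served.
query = [1.0, 0.0]
task_features = [[2.0, 0.0], [0.0, 5.0], [3.0, 1.0]]
done_mask = [False, False, True]

probs = pointer_attention(query, task_features, done_mask)
# Greedy choice; an actor in training would typically sample instead.
next_task = max(range(len(probs)), key=probs.__getitem__)
```

In an actor-critic setup, a distribution like `probs` would play the role of the actor's policy over the next task, while a separate critic estimates the expected waiting time used as a baseline. That part is omitted here because the abstract does not describe it.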
Authors
王无印
黄子钊
庄子龙
方怀瑾
秦威
WANG Wuyin; HUANG Zizhao; ZHUANG Zilong; FANG Huaijin; QIN Wei (Institute of Industrial Engineering and Management, Shanghai Jiao Tong University, Shanghai 200240; Shanghai International Port (Group) Co., Ltd., Shanghai 200080)
Source
《机械工程学报》
EI
CAS
CSCD
Peking University Core Journals
2024, Issue 6, pp. 44-57 (14 pages)
Journal of Mechanical Engineering
Funding
Supported by the National Key Research and Development Program of China (2019YFB1704401).
Keywords
automated container terminal
yard
yard crane scheduling
deep reinforcement learning