Scheduling approach for aircraft assembly pulsation production lines with deep reinforcement learning and knowledge transfer
Abstract Aircraft assembly is a critical stage of aircraft manufacturing, and scheduling aircraft assembly pulsation production lines rationally to reduce cost and improve efficiency is an important scientific problem in intelligent manufacturing. However, the assembly scenario is complex: assembling a single aircraft involves more than ten thousand operations, which poses new challenges for formally modeling and efficiently solving the scheduling problem. Consequently, current industrial practice relies mainly on manual scheduling by human experts. This paper targets the optimization objective of reducing human resource load and proposes two domain-specific techniques for the aircraft assembly scheduling problem. First, the scheduling problem of aircraft assembly pulsation production lines is modeled as two Markov decision processes, and a bi-level reinforcement learning agent makes the decisions that produce approximate scheduling solutions for aircraft assembly. Second, to address the insufficient robustness of reinforcement learning decisions, a domain-knowledge transfer method is proposed in which the problem-solving knowledge obtained via reinforcement learning is transferred to constraint pruning in an integer programming model; an integer programming solver then optimizes the pruned model to obtain scheduling solutions with strong overall performance. Experiments are conducted on real data from aircraft assembly production lines. The results show that the proposed method, based on deep reinforcement learning and knowledge transfer, scales to scheduling aircraft assembly pulsation production lines with an annual output of nearly one hundred aircraft, reduces the solving time of a problem that combinatorial optimization methods struggle to solve to the order of minutes, and achieves significant performance advantages over baseline methods.
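The two-stage idea in the abstract (an approximate schedule first, then a pruned exact optimization) can be illustrated with a very small sketch. Everything below is an assumption made for illustration: the toy instance `OPS`, the takt length `HORIZON`, and the window-based pruning rule are hypothetical, an earliest-start heuristic stands in for the reinforcement learning agent, and an exhaustive search stands in for the integer programming solver. The abstract does not specify how the paper performs pruning, so this should not be read as the authors' method.

```python
from itertools import product

# Hypothetical toy instance: operation -> (duration, workers needed, predecessors).
# Operations are listed in topological order (dicts keep insertion order).
OPS = {
    "A": (2, 2, []),
    "B": (2, 2, []),
    "C": (1, 1, ["A"]),
    "D": (1, 1, ["B"]),
}
HORIZON = 6  # assumed takt length, in time units


def feasible(starts):
    """Check the horizon and precedence constraints for a full assignment."""
    for op, s in starts.items():
        dur, _, preds = OPS[op]
        if s < 0 or s + dur > HORIZON:
            return False
        if any(starts[p] + OPS[p][0] > s for p in preds):
            return False
    return True


def peak_load(starts):
    """Peak number of workers busy at once (the load to be minimized)."""
    load = [0] * HORIZON
    for op, s in starts.items():
        dur, workers, _ = OPS[op]
        for t in range(s, s + dur):
            load[t] += workers
    return max(load)


def approximate_schedule():
    """Stand-in for the learned policy: start each operation as early as possible."""
    starts = {}
    for op, (_, _, preds) in OPS.items():
        starts[op] = max((starts[p] + OPS[p][0] for p in preds), default=0)
    return starts


def pruned_domains(approx, window):
    """Keep only start times within `window` of the approximate schedule."""
    return {
        op: [t for t in range(s - window, s + window + 1)
             if 0 <= t <= HORIZON - OPS[op][0]]
        for op, s in approx.items()
    }


def refine(domains):
    """Stand-in for the solver: exhaustive search over the pruned domains."""
    names = list(domains)
    best, best_peak = None, None
    for combo in product(*(domains[n] for n in names)):
        starts = dict(zip(names, combo))
        if feasible(starts):
            peak = peak_load(starts)
            if best_peak is None or peak < best_peak:
                best, best_peak = starts, peak
    return best, best_peak


approx = approximate_schedule()
best, peak = refine(pruned_domains(approx, window=2))
print("approximate peak load:", peak_load(approx))
print("refined peak load:", peak, "with starts", best)
```

In this sketch the window size trades search effort against solution quality: `window=0` reproduces the approximate schedule unchanged, while a larger window lets the exact search move operations further from it at the cost of a larger search space.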
Authors Jincheng ZHONG; Haoyu MA; Mingsheng LONG; Jianmin WANG (School of Software, Tsinghua University, Beijing 100084, China; Beijing National Research Center for Information Science and Technology, Beijing 100084, China)
Source Scientia Sinica (Informationis), 2024, No. 6, pp. 1441-1457 (17 pages). Indexed in CSCD and the Peking University Core Journals list.
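Funding Supported by the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (Grant No. 2020AAA0109201), the National Natural Science Foundation of China (Grant Nos. 62021002, 62022050), and the Beijing Nova Program (Grant No. Z201100006820041).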
Keywords aircraft assembly; intelligent scheduling; combinatorial optimization; reinforcement learning; knowledge transfer