AC-HAPE3D: an algorithm for irregular packing based on reinforcement learning


Abstract: In fields such as 3D printing and express logistics, parts or goods of different shapes must be placed within a limited space, a task known as irregular packing. The irregular packing problem asks for a placement scheme that either fits as many polyhedra as possible into a given container or arranges a batch of objects compactly so that the occupied volume is minimized. The problem is NP-hard and difficult to solve efficiently. This paper studies the placement of a given set of polyhedra inside a 3D container with a variable dimension, with the goal of minimizing that variable dimension after packing. We propose AC-HAPE3D, a reinforcement learning based algorithm that uses the heuristic algorithm HAPE3D to model the problem as a Markov process and then learns with the policy-based reinforcement learning method Actor-Critic. Containers and polyhedra are represented by voxels, which simplifies the expression of state information, and neural networks represent the value function and the policy function. To handle the variable length of the state information and the variable size of the action space, a masking approach screens out part of the inputs and outputs, and an LSTM is introduced to process the variable-length state information. Experiments on five different datasets show that the algorithm achieves good results.
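The masking idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe how the authors implement it, so everything below is an assumption made for illustration: the function name `masked_softmax`, the fixed number of action slots, and the example values are all hypothetical. The sketch shows one common way to keep a fixed-size policy output while the set of valid actions changes: invalid slots receive zero probability and the remaining mass is renormalized.

```python
import math

def masked_softmax(logits, mask):
    """Return a distribution that puts zero mass on masked-out actions.

    logits: raw policy scores, one per action slot (fixed maximum size).
    mask:   booleans, True where the action is currently valid.
    Both names are hypothetical; the paper's actual interface is not given.
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    valid = [x for x, m in zip(logits, mask) if m]
    if not valid:
        raise ValueError("at least one action must be valid")
    peak = max(valid)  # subtract the largest valid score for numerical stability
    exps = [math.exp(x - peak) if m else 0.0 for x, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical example: four action slots, but only slots 0 and 2
# correspond to polyhedra that still have to be packed.
logits = [1.0, 5.0, 1.0, -2.0]
mask = [True, False, True, False]
probs = masked_softmax(logits, mask)
```

In this example the high score in slot 1 has no effect because that slot is masked, so the two valid slots, which have equal scores, share the probability mass equally.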

Authors: ZHU Peng-hui; YUAN Hong-tao; NIE Yong-wei; LI Gui-qing (School of Computer Science and Engineering, South China University of Technology, Guangzhou, Guangdong 510006, China)

Affiliation: School of Computer Science and Engineering, South China University of Technology

Source: Journal of Graphics (图学学报), CSCD, Peking University Core Journal, 2022, No. 6, pp. 1096-1103 (8 pages)

Keywords: irregular packing; heuristic algorithm; voxel; reinforcement learning; 3-dimensional printing

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]


References (1)

1. Xiao LIU, Jia-min LIU, An-xi CAO, Zhuang-le YAO. HAPE3D—a new constructive algorithm for the 3D irregular packing problem[J]. Frontiers of Information Technology & Electronic Engineering, 2015, 16(5): 380-390.

