Solving the quadratic assignment problem based on the actor-critic framework (Actor-critic框架下的二次指派问题求解方法)

Abstract: The quadratic assignment problem (QAP) is an NP-hard combinatorial optimization problem with wide real-world application. Mature heuristic algorithms are usually tailored to a specific problem and therefore transfer and generalize poorly. To provide a unified solving strategy for QAP, this paper abstracts the flow matrix and distance matrix of a QAP instance into two undirected complete graphs and constructs the corresponding association graph, thereby transforming the assignment of facilities to locations into a node selection task on the association graph. Based on the actor-critic framework, a new algorithm, ACQAP (actor-critic for QAP), is proposed. First, a policy network is built with a multi-head attention mechanism to process the node representation vectors produced by a graph convolutional neural network. Then, the actor-critic algorithm predicts the probability of each node being output as the optimal node. Finally, based on these probabilities, the model outputs, within feasible time, an action decision sequence that satisfies the objective reward function. The algorithm requires no manual design and applies to inputs of different sizes, which makes it more flexible and reliable. Experimental results on QAPLIB instances show that the algorithm achieves accuracy comparable to traditional heuristic algorithms while transferring and generalizing better; compared with learning-based algorithms such as NGM, its assignment cost deviates least from the optimal solution, and the deviation is below 20% in most instances.
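The reformulation described in the abstract can be illustrated with a minimal sketch. It assumes the standard Koopmans-Beckmann form of QAP and the usual association-graph construction from graph matching, in which node (i, a) means "facility i is placed at location a" and the edge between (i, a) and (j, b) carries weight flow[i, j] * dist[a, b]. The abstract does not spell out the paper's exact construction, so the function names, the toy matrices, and the Kronecker-product affinity matrix below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def qap_cost(flow, dist, perm):
    """Koopmans-Beckmann cost: sum over i, j of flow[i, j] * dist[perm[i], perm[j]]."""
    perm = np.asarray(perm)
    return float(np.sum(flow * dist[np.ix_(perm, perm)]))


def association_affinity(flow, dist):
    """Assumed association-graph affinity matrix.

    Row/column index i * n + a stands for node (facility i, location a);
    entry [(i, a), (j, b)] equals flow[i, j] * dist[a, b].
    """
    return np.kron(flow, dist)


def selection_vector(perm):
    """0/1 indicator over association-graph nodes selected by an assignment."""
    n = len(perm)
    x = np.zeros(n * n)
    for facility, location in enumerate(perm):
        x[facility * n + location] = 1.0
    return x


def relative_gap(cost, best_known):
    """Relative deviation of a solution cost from a reference (e.g. best-known) cost."""
    return (cost - best_known) / best_known


# Toy symmetric instance (made-up numbers, not taken from QAPLIB).
flow = np.array([[0, 3, 1], [3, 0, 2], [1, 2, 0]], dtype=float)
dist = np.array([[0, 5, 4], [5, 0, 6], [4, 6, 0]], dtype=float)
perm = [2, 0, 1]  # facility i is placed at location perm[i]

K = association_affinity(flow, dist)
x = selection_vector(perm)
cost_direct = qap_cost(flow, dist, perm)
cost_graph = float(x @ K @ x)  # same cost, expressed as node selection
```

Under these assumptions, choosing n mutually compatible nodes of the association graph is equivalent to choosing a permutation, and the quadratic form `x @ K @ x` reproduces the assignment cost. `relative_gap` corresponds to the kind of deviation from the optimal solution that the abstract reports as below 20% in most instances.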

Authors: LI Xueyuan (李雪源); HAN Congying (韩丛英) (School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China)

Affiliation: School of Mathematical Sciences, University of Chinese Academy of Sciences

Source: Journal of University of Chinese Academy of Sciences (《中国科学院大学学报（中英文）》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2024, No. 2, pp. 275-284 (10 pages)

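Funding: Supported by the National Key R&D Program of China (2021YFA1000403), the National Natural Science Foundation of China (11991022), and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA27000000).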

Keywords: quadratic assignment problem; graph convolutional neural network; deep reinforcement learning; multi-head attention mechanism; actor-critic algorithm

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

