Offline Reinforcement Learning with Constrained Hybrid Action Implicit Representation Towards Wargaming Decision-Making

Abstract: Reinforcement Learning (RL) has emerged as a promising data-driven solution for wargaming decision-making. However, two domain challenges remain: (1) handling discrete-continuous hybrid wargaming control and (2) accelerating RL deployment with rich offline data. Existing RL methods fail to address these two issues simultaneously; we therefore propose a novel offline RL method targeting hybrid action spaces. A new constrained action representation technique is developed to build a bidirectional mapping between the original hybrid action space and a latent space in a semantically consistent way. This allows a continuous latent policy to be learned with offline RL, with better exploration feasibility and scalability, and then reconstructed back into the required hybrid policy. Critically, a novel offline RL optimization objective with adaptively adjusted constraints is designed to balance the alleviation and generalization of out-of-distribution actions. Our method demonstrates superior performance and generality across different tasks, particularly in typical realistic wargaming scenarios.

Authors: Liwei Dong, Ni Li, Guanghong Gong, Xin Lin

Affiliations: School of Automation Science and Electrical Engineering; School of Automation Science and Electrical Engineering; School of Automation Science and Electrical Engineering

Source: Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2024, Issue 5, pp. 1422-1440 (19 pages). Chinese title: 清华大学学报自然科学版（英文版）

Keywords: offline Reinforcement Learning (RL); wargaming; decision-making; hybrid action space

Classification code: O17 [Science: Fundamental Mathematics]
