Funding: National Natural Science Foundation of China, Grant/Award Number: 61872171; The Belt and Road Special Foundation of the State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Grant/Award Number: 2021490811.
Abstract: Multi-agent reinforcement learning relies on reward signals to guide the policy networks of individual agents. However, in high-dimensional continuous spaces, the non-stationary environment can yield outdated experiences that hinder convergence, resulting in poor training performance for multi-agent systems. To tackle this issue, a novel reinforcement learning scheme, Mutual Information Oriented Deep Skill Chaining (MioDSC), is proposed. It generates an optimised cooperative policy by incorporating intrinsic rewards based on mutual information to improve exploration efficiency. These rewards encourage agents to diversify their learning by taking actions that increase the mutual information between their actions and the environment state. In addition, MioDSC generates cooperative policies using the options framework, which allows agents to learn and reuse complex action sequences and accelerates the convergence of multi-agent learning. MioDSC was evaluated in the multi-agent particle environment and the StarCraft multi-agent challenge at varying difficulty levels. The experimental results demonstrate that MioDSC outperforms state-of-the-art methods and is robust and stable across various multi-agent system tasks.
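The abstract does not say how the mutual-information reward is estimated, so the following is only a minimal sketch of the general idea, assuming discrete states and actions and a simple plug-in (count-based) estimate of I(S; A). The names `empirical_mutual_information`, `shaped_reward`, and the weight `beta` are hypothetical and are not taken from MioDSC.

```python
import math
from collections import Counter


def empirical_mutual_information(pairs):
    """Plug-in estimate of I(S; A) in nats from (state, action) samples.

    Assumption: states and actions are discrete and hashable. This is a
    generic estimator, not the one used by MioDSC.
    """
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                     # counts of (s, a)
    states = Counter(s for s, _ in pairs)      # counts of s
    actions = Counter(a for _, a in pairs)     # counts of a
    mi = 0.0
    for (s, a), c in joint.items():
        # p(s, a) * log( p(s, a) / (p(s) * p(a)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (states[s] * actions[a]))
    return mi


def shaped_reward(extrinsic, pairs, beta=0.1):
    """Add a mutual-information bonus to the environment reward.

    `beta` is a hypothetical trade-off weight chosen for illustration.
    """
    return extrinsic + beta * empirical_mutual_information(pairs)


if __name__ == "__main__":
    dependent = [(0, 0), (1, 1)] * 2              # action determined by state
    independent = [(0, 0), (0, 1), (1, 0), (1, 1)]  # action unrelated to state
    print(empirical_mutual_information(dependent))
    print(empirical_mutual_information(independent))
```

In this sketch, samples in which the action is fully determined by the state give the largest bonus, while samples in which actions are unrelated to the state give none. A method for continuous spaces would need a different estimator, for example a learned variational bound.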
Funding: Supported by the Science and Technology Project of State Grid Corporation of China (5100202199557A-0-5-ZN).
Abstract: Dear Editor, As a promising operation for multi-agent systems (MASs), autonomous interception, in which defenders prevent intruders from reaching their destinations, has attracted increasing attention in recent years. So far, most of the relevant methods have been applied in ideal environments without agent damage. As a remedy, this letter proposes a more realistic interception method for MASs subject to damage.
Funding: This paper is supported by the National Science Foundation of China under Grant No. 60542004.
Abstract: A Multi-Agent System (MAS) is a promising approach to building complex systems. This paper introduces research on the Inner-Enterprise Credit Rating MAS (IECRMAS). To raise rating accuracy, we consider not only the rating target's information but also the evaluators' feature information, and we propose a rational rating-group formation algorithm based on an anti-bias measurement of the group. We also propose the rational rating individual, which consists of an evaluator and an assistant rating agent. A rational group formation protocol is designed to coordinate autonomous agents in performing the rating job.