Funding: National Natural Science Foundation of China, Grant/Award Number: 61872171; The Belt and Road Special Foundation of the State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Grant/Award Number: 2021490811.
Abstract: Multi-agent reinforcement learning relies on reward signals to guide the policy networks of individual agents. However, in high-dimensional continuous spaces, the non-stationary environment can provide outdated experiences that hinder convergence, resulting in ineffective training performance for multi-agent systems. To tackle this issue, a novel reinforcement learning scheme, Mutual Information Oriented Deep Skill Chaining (MioDSC), is proposed that generates an optimised cooperative policy by incorporating intrinsic rewards based on mutual information to improve exploration efficiency. These rewards encourage agents to diversify their learning process by engaging in actions that increase the mutual information between their actions and the environment state. In addition, MioDSC can generate cooperative policies using the options framework, allowing agents to learn and reuse complex action sequences and accelerating the convergence speed of multi-agent learning. MioDSC was evaluated in the multi-agent particle environment and the StarCraft multi-agent challenge at varying difficulty levels. The experimental results demonstrate that MioDSC outperforms state-of-the-art methods and is robust across various multi-agent system tasks with high stability.
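The idea of rewarding actions that increase the mutual information between actions and environment state can be illustrated with a minimal sketch. The abstract does not say how MioDSC estimates mutual information, so everything here is an assumption made for illustration: the plug-in (counting) estimator over discretised samples, the function names `empirical_mutual_information` and `shaped_reward`, and the weight `beta`.

```python
import math
from collections import Counter


def empirical_mutual_information(pairs):
    """Plug-in estimate of I(A; S) in nats from (action, state) samples.

    Assumption: actions and states are discrete (or already discretised)
    and hashable. This is a generic estimator, not the one used by MioDSC.
    """
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                     # counts of (a, s)
    actions = Counter(a for a, _ in pairs)     # counts of a
    states = Counter(s for _, s in pairs)      # counts of s
    mi = 0.0
    for (a, s), c in joint.items():
        # p(a, s) / (p(a) * p(s)) simplifies to c * n / (count_a * count_s)
        mi += (c / n) * math.log(c * n / (actions[a] * states[s]))
    return mi


def shaped_reward(extrinsic, pairs, beta=0.1):
    """Add a mutual-information bonus to the environment reward.

    `beta` is a hypothetical weighting coefficient, not taken from the paper.
    """
    return extrinsic + beta * empirical_mutual_information(pairs)


if __name__ == "__main__":
    # Actions fully determine the state: MI equals log(2) for two equiprobable outcomes.
    dependent = [(0, 0), (1, 1)] * 50
    # Actions and states are independent: MI is zero.
    independent = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25
    print(empirical_mutual_information(dependent))
    print(empirical_mutual_information(independent))
    print(shaped_reward(1.0, dependent, beta=0.5))
```

In this toy setting, an agent whose actions reliably change the observed state earns a larger bonus than one whose actions have no measurable effect, which is the exploration incentive the abstract describes.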
Funding: Supported by the National Natural Science Foundation of China (61202369) and the NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Informatization (U1509219).
Abstract: Modern seismic sensors are capable of recording high-precision vibration data continuously for several months. Raw seismic data contain information about an earthquake's origin time, location, wave velocity, etc. Currently, these high-volume data are gathered manually from each station for analysis. This process prevents high-resolution images from being obtained in real time. A new in-network distributed method is required that can obtain high-resolution seismic tomography in real time. In this paper, we present a distributed multigrid solution to reconstruct seismic images over large dense networks. The algorithm performs in-network computation on large seismic samples and avoids expensive data collection and centralized computation. Our evaluation using synthetic data shows that the proposed method accelerates convergence and reduces the number of messages exchanged. The distributed scheme balances the computation load and is also tolerant to severe packet loss.