The increasing use of renewable energy in the power system introduces strong stochastic disturbances and degrades the control performance of distributed power grids. In this paper, a novel multi-agent collaborative reinforcement learning algorithm with automatic optimization, namely Dyna-DQL, is proposed to quickly achieve an optimal coordination solution for multi-area distributed power grids. The Dyna framework is combined with double Q-learning to collect and store environmental samples, so that the agents are updated iteratively from both replayed buffer data and real-time data. The environmental data can thus be fully exploited to accelerate the agents' learning, which mitigates the negative impact of the heavy stochastic disturbances caused by renewable energy integration on control performance. Simulations are conducted on two different models to validate the effectiveness of the proposed algorithm. The results demonstrate that the proposed Dyna-DQL algorithm exhibits superior stability and robustness compared with other reinforcement learning algorithms.
Funding: Supported by the National Natural Science Foundation of China (No. 52277108) and the Guangdong Provincial Department of Science and Technology (No. 2022A0505020015).
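As a rough illustration of the mechanism described in the abstract, the following is a minimal sketch of tabular double Q-learning wrapped in a Dyna-style loop: every real transition both updates the two Q-estimates and is stored in a buffer, from which additional planning updates are replayed. The toy chain environment, hyperparameters, and function names are illustrative assumptions and do not reflect the paper's actual multi-area power-grid model or agent design.

```python
# Minimal sketch of Dyna-style double Q-learning on a toy chain MDP.
# Environment, hyperparameters, and names are illustrative assumptions,
# not the paper's multi-area power-grid setup.
import random
from collections import defaultdict

N_STATES, N_ACTIONS = 10, 2          # toy chain: move left (0) or right (1)
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
PLANNING_STEPS = 20                  # Dyna replay updates per real interaction

Q1 = defaultdict(float)              # first Q-table
Q2 = defaultdict(float)              # second Q-table (double Q-learning)
buffer = []                          # stored (s, a, r, s_next, done) samples


def greedy(s):
    """Greedy action w.r.t. the sum of both Q-tables, ties broken randomly."""
    vals = [Q1[(s, a)] + Q2[(s, a)] for a in range(N_ACTIONS)]
    best = max(vals)
    return random.choice([a for a, v in enumerate(vals) if v == best])


def act(s):
    """Epsilon-greedy behavior policy."""
    return random.randrange(N_ACTIONS) if random.random() < EPSILON else greedy(s)


def double_q_update(s, a, r, s_next, done):
    """Double Q-learning: one table selects the next action, the other evaluates it."""
    Qa, Qb = (Q1, Q2) if random.random() < 0.5 else (Q2, Q1)
    target = r
    if not done:
        a_star = max(range(N_ACTIONS), key=lambda x: Qa[(s_next, x)])
        target += GAMMA * Qb[(s_next, a_star)]
    Qa[(s, a)] += ALPHA * (target - Qa[(s, a)])


def step(s, a):
    """Toy dynamics: reward 1 only when the rightmost (terminal) state is reached."""
    s_next = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    done = s_next == N_STATES - 1
    return s_next, (1.0 if done else 0.0), done


for episode in range(200):
    s, done = 0, False
    while not done:
        a = act(s)
        s_next, r, done = step(s, a)
        double_q_update(s, a, r, s_next, done)     # learn from the real sample
        buffer.append((s, a, r, s_next, done))     # store it for Dyna-style replay
        for _ in range(PLANNING_STEPS):            # planning: replay stored samples
            double_q_update(*random.choice(buffer))
        s = s_next

print("Greedy action in the start state after training:", greedy(0))
```

The replayed buffer acts as a simple data-driven model of the environment, which is what lets a Dyna-style method reuse each measured transition many times and thereby learn faster under stochastic disturbances, as the abstract argues.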