
Strategy for DC Microgrid Energy Management Based on RG-DDPG (cited by: 1)
Abstract: The randomness and intermittency of distributed energy resources pose a major challenge to energy management in direct current (DC) microgrids. To address this, the paper proposes a DC microgrid energy management strategy based on reward guidance deep deterministic policy gradient (RG-DDPG). The strategy formulates the optimal operation of the DC microgrid as a Markov decision process; through continuous interaction with the DC microgrid environment, the agent adaptively learns energy management decisions and thereby optimizes energy management in the microgrid. During training, a priority experience replay mechanism based on the temporal difference error (TD-error) reduces the randomness and blindness of RG-DDPG's learning and exploration in the DC microgrid operating environment and speeds up convergence of the proposed strategy. In addition, across training rounds, the cumulative reward of each round is used to build an excellent round set for DC microgrid energy management, which strengthens the link between the RG-DDPG agent's training rounds and makes full use of the training value of the excellent rounds. Simulation results show that the proposed strategy allocates energy within the DC microgrid reasonably. Compared with energy management strategies based on the deep Q-network (DQN) and particle swarm optimization (PSO), it reduces the daily average operating cost of the DC microgrid by 11.16% and 7.10%, respectively.
Authors: 李建标 (LI Jianbiao); 陈建福 (CHEN Jianfu); 高滢 (GAO Ying); 裴星宇 (PEI Xingyu); 吴宏远 (WU Hongyuan); 陆子凯 (LU Zikai); 周少雄 (ZHOU Shaoxiong); 曾杰 (ZENG Jie). Affiliations: DC Power Distribution and Consumption Technology Research Centre of Guangdong Power Grid Co., Ltd., Zhuhai 519000, China; China Southern Power Grid Technology Co., Ltd., Guangzhou 510000, China; Qingke Youneng (Shenzhen) Technology Co., Ltd., Shenzhen 518000, China.
Source: Electric Power (《中国电力》), indexed in CSCD and the Peking University Core Journals list; 2023, No. 7, pp. 85-94 (10 pages).
Funding: Science and Technology Project of China Southern Power Grid Co., Ltd. (GDKJXM20212062).
Keywords: DC microgrid; energy management; RG-DDPG; priority experience replay; excellent round set
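
The abstract names two training mechanisms without giving formulas: replay sampling driven by the TD-error, and an excellent round set selected by cumulative reward. The sketch below is one possible reading, not the authors' implementation. It assumes the common proportional rule in which the sampling weight of transition i is (|TD-error_i| + eps) ** alpha, and it assumes the excellent round set simply keeps the top-`capacity` rounds by cumulative reward. The names `td_priorities`, `sample_indices`, `ExcellentRoundSet`, `alpha`, `eps`, and `capacity` are all hypothetical.

```python
import heapq
import random


def td_priorities(td_errors, alpha=0.6, eps=1e-6):
    """Map TD errors to sampling probabilities (assumed proportional rule)."""
    # eps keeps zero-error transitions sampleable; alpha tempers the skew.
    scaled = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_indices(td_errors, batch_size, rng=random):
    """Draw replay indices with probability increasing in |TD error|."""
    probs = td_priorities(td_errors)
    return rng.choices(range(len(td_errors)), weights=probs, k=batch_size)


class ExcellentRoundSet:
    """Keep the `capacity` rounds with the largest cumulative reward (assumed rule)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []    # min-heap of (cumulative_reward, counter, transitions)
        self._counter = 0  # tie-breaker so transitions are never compared

    def offer(self, cumulative_reward, transitions):
        """Try to add a finished round; return True if it was kept."""
        item = (cumulative_reward, self._counter, transitions)
        self._counter += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, item)
            return True
        if cumulative_reward > self._heap[0][0]:
            # Evict the currently worst stored round.
            heapq.heapreplace(self._heap, item)
            return True
        return False

    def rewards(self):
        """Cumulative rewards of the stored rounds, best first."""
        return sorted((r for r, _, _ in self._heap), reverse=True)
```

In a DDPG-style training loop, `sample_indices` would pick the minibatch from the replay buffer, and `offer` would be called once at the end of each round with that round's total reward (for example, the negative daily operating cost). How the stored rounds are then reused is not stated in the abstract and is left open here.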
