Funding: Supported in part by the National Natural Science Foundation of China (No. 52077076) and in part by the National Key R&D Plan (No. 2021YFB2601502).
Abstract: In recent years, reinforcement learning (RL) has emerged as a solution for model-free dynamic programming problems that traditional optimization methods cannot solve effectively. Owing to its strong self-learning and self-optimizing capabilities, it has gradually been applied in fields such as the economic dispatch of power systems. However, existing RL-based economic scheduling methods ignore the security risks that the agent may introduce during exploration, so the agent may issue instructions that threaten the safe operation of the power system. Therefore, we propose an improved proximal policy optimization algorithm for sequential security-constrained optimal power flow (SCOPF), based on expert knowledge and a safety layer, to determine the active power dispatch strategy, the voltage optimization scheme of the units, and the charging/discharging dispatch of energy storage systems. Expert experience is introduced to improve the ability to enforce constraints such as power balance during training while guiding the agent to improve the utilization rate of renewable energy. Additionally, to avoid line overload, we add a safety layer at the end of the policy network that introduces transmission constraints, which avoids dangerous actions and tackles the sequential SCOPF problem. Simulation results on an improved IEEE 118-bus system verify the effectiveness of the proposed algorithm.
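The abstract does not say how the safety layer is built, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a linear (DC-style) flow model in which line flows equal a sensitivity matrix times the bus injections, and it assumes a fallback action that is already known to respect the line limits. The names `safety_layer`, `ptdf`, `limits`, and `safe_action`, and all numbers, are hypothetical.

```python
import numpy as np


def safety_layer(action, safe_action, ptdf, limits, steps=30):
    """Pull a proposed action toward a known-safe one until no line is overloaded.

    Assumptions (illustrative only): line flows are ptdf @ action, and
    safe_action already satisfies |ptdf @ safe_action| <= limits.
    """
    action = np.asarray(action, dtype=float)
    safe_action = np.asarray(safe_action, dtype=float)

    def feasible(candidate):
        # Small tolerance so flows exactly at the limit count as feasible.
        return bool(np.all(np.abs(ptdf @ candidate) <= limits + 1e-9))

    if feasible(action):
        return action  # the policy output is already safe; pass it through

    # Flows are linear in the action, so the feasible part of the segment
    # between safe_action (alpha = 0) and action (alpha = 1) is an interval
    # starting at 0. Bisection finds its largest feasible alpha.
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if feasible(safe_action + mid * (action - safe_action)):
            lo = mid
        else:
            hi = mid
    return safe_action + lo * (action - safe_action)


# Hypothetical 3-bus, 2-line example.
ptdf = np.array([[0.5, -0.2, 0.0],
                 [0.1, 0.4, -0.3]])
limits = np.array([1.0, 1.0])
safe = np.zeros(3)
proposed = np.array([4.0, 0.0, 0.0])  # would load line 0 to twice its limit

corrected = safety_layer(proposed, safe, ptdf, limits)
print(corrected, ptdf @ corrected)
```

Blending toward a safe action is only one option. A projection onto the constraint set (for example via a small quadratic program) would stay closer to the proposed action, at the cost of a solver call per step.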
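Abstract: As power grid operation demands higher levels of voltage control and reactive power management, automatic voltage control (AVC) has increasingly become a research focus. Owing to constraints of the management model, applied research on AVC in North American grids remains a gap. This paper designs and implements an AVC system suited to the special management model and online operation requirements of an interconnected grid in the northeastern United States. The system extends the handling of the phase-shifter element model and accounts for post-contingency static security constraints when computing control strategies with optimal power flow (OPF). Long-term online trial operation data and the corresponding evaluation results show that the AVC system designed and implemented in this paper can significantly improve the reactive power and voltage profile of this grid and enhance its security and economy.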
Funding: Supported by the 2015 Science and Technology Project of China Southern Power Grid (WYKJ00000027); supported in part by the Faculty of Engineering & Information Technologies, the University of Sydney, under the Faculty Research Cluster Program and the Early Career Researcher Development Scheme, respectively.