Abstract
As wide-area measurement systems are applied to transient stability control, the random time delay of wide-area information makes the system state at the moment of control uncertain. In addition, the discrete decision variables for generator tripping and load shedding are of extremely high dimension, so online emergency control decision-making for the power grid faces challenges. To address this, the transient stability emergency control problem is modeled as a Markov decision problem, and an emergency control decision-making method is proposed that combines deep Q-learning network (DQN) reinforcement learning with the transient energy function; its multi-step sequential decision-making process can cope with the time-delay uncertainty of emergency control. The reward function consists of a short-term reward function that accounts for the control objectives and constraints and a long-term reward function that accounts for stability, and the potential energy index of the transient energy function is introduced into the reward function to improve learning efficiency. With the objective of maximizing the cumulative reward, the DQN algorithm learns the optimal emergency control strategy in a discretized action space to solve the transient stability emergency control problem. The effectiveness of the proposed method for emergency control decision-making is verified on the IEEE 39-bus system.
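The reward structure and learning target described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so every function name, weight, and sign convention below (`short_term_reward`, `long_term_reward`, `shaped_reward`, `energy_weight`, and so on) is a hypothetical choice made for illustration only. The one standard piece is `dqn_target`, which computes the usual one-step Q-learning target r + gamma * max Q(s', a') over a discrete action set.

```python
from typing import Sequence


def short_term_reward(shed_mw: float, violations: int,
                      cost_weight: float = 0.01, penalty: float = 1.0) -> float:
    """Hypothetical short-term term: penalize the control amount (MW tripped
    or shed) and the number of violated operating constraints."""
    return -cost_weight * shed_mw - penalty * violations


def long_term_reward(stable: bool, bonus: float = 10.0) -> float:
    """Hypothetical long-term term: granted at the end of an episode
    according to whether the system remained transiently stable."""
    return bonus if stable else -bonus


def shaped_reward(shed_mw: float, violations: int, potential_energy_index: float,
                  done: bool, stable: bool, energy_weight: float = 0.5) -> float:
    """Combine both terms and subtract a weighted potential energy index.
    How the index actually enters the reward is an assumption here."""
    reward = short_term_reward(shed_mw, violations)
    reward -= energy_weight * potential_energy_index
    if done:
        reward += long_term_reward(stable)
    return reward


def dqn_target(reward: float, next_q_values: Sequence[float],
               done: bool, gamma: float = 0.99) -> float:
    """Standard one-step Q-learning target over a discrete action set."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a full DQN, `next_q_values` would come from a target network evaluated at the next state, and each discrete action would correspond to one combination of generator-tripping and load-shedding amounts.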
Authors
李宏浩
张沛
刘曌
LI Honghao; ZHANG Pei; LIU Zhao (School of Electrical Engineering, Beijing Jiaotong University, Beijing 100044, China)
Source
《电力系统自动化》
EI
CSCD
Peking University Core Journals
2023, No. 5, pp. 144-152 (9 pages)
Automation of Electric Power Systems
Funding
Fundamental Research Funds for the Central Universities (2021JBM027)
National Natural Science Foundation of China, Young Scientists Fund (52107068).
Keywords
deep reinforcement learning
transient stability
emergency control decision-making
transient energy function
deep Q-learning network (DQN) algorithm
time delay