Abstract
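When classical reinforcement learning is applied to dispatching automatic generation control (AGC) orders under the control performance standard (CPS), referred to as CPS orders, from the dispatch center to the individual generating units of the grid, the curse of dimensionality is unavoidable. This paper proposes a hierarchical reinforcement learning approach: all units in the grid are first classified by their frequency-regulation time delay, and the CPS order is then dispatched layer by layer, forming a hierarchical task structure. A time-varying coordination factor is introduced between the layers of the hierarchical Q-learning algorithm, and the improved algorithm effectively speeds up the convergence of the original one. Different linear combinations of weights are designed in the reward function to show how the system's CPS control level and regulation cost vary under conservative and optimistic control. Statistical simulation analysis on the China Southern Power Grid shows that the improved hierarchical Q-learning algorithm shortens the average convergence time by 47% compared with hierarchical Q-learning, and that in an environment with complex random disturbances it effectively raises the system's CPS assessment compliance rate while reducing regulation cost by about 5%.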
This paper presents an improved hierarchical reinforcement learning (HRL) algorithm to address the curse of dimensionality in the multi-objective dynamic optimization of automatic generation control (AGC) order dispatch based on the control performance standard (CPS). The CPS order dispatch task is decomposed into several subtasks by classifying the AGC committed units according to the response time delay of their power regulation. A time-varying coordination factor is introduced between the layers of the HRL to speed up the algorithm. Several linear combinations of weights in the reward function are designed to optimize hydro capacity margin and AGC production cost. Application of the improved hierarchical Q-learning to the China Southern Power Grid model shows that, compared with hierarchical Q-learning and a genetic algorithm, the proposed method speeds up the algorithm by 47%, enhances the performance of the AGC system in CPS assessment, and saves over 5% of AGC production cost.
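The abstract mentions a reward built from a linear combination of weights and a Q-learning update, but gives no formulas. The following is a minimal sketch of those two ideas in generic tabular form, not the authors' implementation. The function names, the two reward terms (`cps_term`, `cost_term`), the sign convention, and the values of `alpha` and `gamma` are all assumptions made for illustration. The time-varying coordination factor is left out because the abstract does not specify its form.

```python
def weighted_reward(cps_term, cost_term, w_cps=0.5, w_cost=0.5):
    """Hypothetical reward: a linear combination of a CPS-performance term
    (higher is better) and a regulation-cost term (lower is better).
    The weights shift the trade-off between the two objectives."""
    return w_cps * cps_term - w_cost * cost_term


def q_update(q, state, action, reward, next_state, n_actions,
             alpha=0.1, gamma=0.9):
    """Standard one-step tabular Q-learning update.
    q is a dict mapping (state, action) to a value; missing entries are 0."""
    best_next = max(q.get((next_state, a), 0.0) for a in range(n_actions))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


if __name__ == "__main__":
    q_table = {}
    # Assumed example values: a weighting that favors CPS performance.
    r = weighted_reward(cps_term=1.0, cost_term=0.2, w_cps=0.7, w_cost=0.3)
    print(q_update(q_table, state=0, action=1, reward=r,
                   next_state=1, n_actions=3))
```

Under these assumptions, changing `w_cps` and `w_cost` is what would move the learned policy between favoring CPS compliance and favoring lower regulation cost.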
Source
《中国电机工程学报》
EI
CSCD
Peking University Core Journals (北大核心)
2011, No. 19, pp. 90-96 (7 pages)
Proceedings of the CSEE
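Funding
National Natural Science Foundation of China (50807016)
Natural Science Foundation of Guangdong Province (9151064101000049)
Fundamental Research Funds for the Central Universities (2009ZM0251)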
Keywords
hierarchical reinforcement learning (HRL)
coordination factor
stochastic optimization
control performance standard (CPS)
automatic generation control (AGC)