Abstract
With the objectives of minimizing the voltage deviation of the dominant (pilot) node in each partition and the variance of the generators' reactive power output ratios, this paper establishes a multi-objective coordinated secondary voltage control (MOCSVC) model that coordinates the switching of substation capacitors and reactors with the action of generator automatic voltage regulators (AVRs). In view of the control characteristics of MOCSVC and the requirements of online optimization, the paper proposes a solution method called state sensitivity based reduced reinforcement learning (SSRRL). To speed up the propagation of reward values, SSRRL defines a new state function and runs a global search before the main loop to locate the initial values and to compress the state space autonomously, which greatly improves search efficiency. During the main-loop search, SSRRL applies an adaptive criterion, based on state sensitivity, for dividing the learning phases, so that exploration and exploitation of the learned experience are balanced. It also widens the variable selection range of a single action to all control variables, so that the search covers as much of the state space as possible within a limited number of iterations. To reflect the system's current preference information, the paper introduces the concept of a real-time weighting coefficient and, once the Pareto frontier (PF) has been obtained, selects the optimal control from it according to the real-time weights. Case studies verify the advantages of SSRRL and of the real-time weighting coefficient in four respects: PF quality, optimization time, convergence rate, and the control effect of the real-time weights.
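The last step described above, picking one control from the Pareto frontier according to a real-time weighting coefficient, can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the abstract does not state how the weights are applied, so the sketch uses a weighted sum of min-max normalized objectives, and the names `pareto_front`, `select_by_weight`, and `w_voltage` are hypothetical. Each candidate is a pair (voltage deviation, variance of reactive power output ratios), both to be minimized.

```python
from typing import List, Sequence, Tuple

# (voltage deviation, variance of reactive power output ratios); both minimized.
Point = Tuple[float, float]


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Return the non-dominated points when both objectives are minimized."""
    front = []
    for p in points:
        # q dominates p if q is no worse in both objectives and differs from p.
        dominated = any(
            q != p and q[0] <= p[0] and q[1] <= p[1] for q in points
        )
        if not dominated:
            front.append(p)
    return front


def select_by_weight(front: Sequence[Point], w_voltage: float) -> Point:
    """Pick the front point with the smallest weighted, normalized cost.

    w_voltage is a hypothetical real-time weight in [0, 1] on the voltage
    objective; the remaining 1 - w_voltage goes to the reactive power
    objective. This scalarization is an assumption, not the paper's rule.
    """
    if not 0.0 <= w_voltage <= 1.0:
        raise ValueError("w_voltage must lie in [0, 1]")
    if not front:
        raise ValueError("front must not be empty")
    lo = [min(p[i] for p in front) for i in range(2)]
    hi = [max(p[i] for p in front) for i in range(2)]

    def norm(value: float, i: int) -> float:
        # Min-max normalization so the two objectives are comparable.
        span = hi[i] - lo[i]
        return 0.0 if span == 0 else (value - lo[i]) / span

    return min(
        front,
        key=lambda p: w_voltage * norm(p[0], 0)
        + (1.0 - w_voltage) * norm(p[1], 1),
    )


# Made-up candidate controls, purely for demonstration.
candidates = [(0.010, 0.09), (0.020, 0.04), (0.030, 0.05), (0.040, 0.01)]
front = pareto_front(candidates)
voltage_first = select_by_weight(front, w_voltage=0.9)
reactive_first = select_by_weight(front, w_voltage=0.1)
```

With a weight close to 1 the selection favors the point with the smallest voltage deviation, and with a weight close to 0 it favors the point with the most even reactive power sharing, which is the kind of preference shift a real-time weight is meant to express.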
Source
《中国电机工程学报》
EI
CSCD
PKU Core (Peking University core journal list)
2013, Issue 31, pp. 130-139, 16 (10 pages in total)
Proceedings of the CSEE
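Funding
National Natural Science Foundation of China (51277078)
Project supported by the Guangdong Key Laboratory of Green Energy Technology (2008A060301002)
Guangdong Province and Ministry of Education Industry-University-Research Cooperation Project (2010A090200065)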
Keywords
multi-objective coordinated secondary voltage control (MOCSVC)
reinforcement learning
real-time weighting coefficient
Pareto frontier
state sensitivity