
Self-learning Optimal Scheduling Method of Demand Response Based on Situation Orientation

Cited by: 1
Abstract: To address the combinatorial explosion of scenarios for consumer choice resources (CCR) under multiple stochastic scenarios, this paper uses a deep reinforcement learning algorithm to select the optimal CCR groups and to optimally schedule the nodes they contain. First, based on the constraints and objective function of CCR optimal scheduling, the mathematical model and the solution complexity over a daily scheduling cycle are analyzed. Then, using a Markov decision process, the CCR optimal scheduling process is mapped to a situation awareness tuple, and a situation orientation function is built on the dueling deep Q network architecture. Over repeated situation deductions, the gradient of the situation orientation function is computed with mini-batch gradient descent and the algorithm parameters are updated through continuous feedback, which optimizes the decisions. Finally, on the IEEE 33-bus test case, with random sample sets of different sizes, the candidate CCR groups are selected under random operating modes and the corresponding optimal scheduling strategies are formulated.
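The abstract names three standard building blocks: a dueling deep Q network, a one-step value target from the Markov decision process, and a mini-batch loss whose gradient drives the parameter updates. The sketch below shows the textbook form of each in plain Python. It is an illustration under assumptions, not the paper's implementation: the function names, the discount factor, and the idea that each action index stands for one candidate CCR group are all hypothetical, and the paper's actual state, reward, and network design are not given in this record.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) with the mean-subtracted dueling form.

    Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')
    Assumption: one advantage entry per candidate action (e.g. per CCR group).
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def td_target(reward: float, next_q: List[float],
              gamma: float = 0.9, done: bool = False) -> float:
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    gamma is a hypothetical discount factor; terminal steps use the reward alone.
    """
    return reward if done else reward + gamma * max(next_q)


def minibatch_loss(predicted: List[float], targets: List[float]) -> float:
    """Mean squared TD error over one mini-batch of sampled transitions."""
    return sum((p - t) ** 2 for p, t in zip(predicted, targets)) / len(predicted)


# Hypothetical numbers, chosen only to exercise the functions.
q = dueling_q_values(state_value=2.0, advantages=[1.0, 0.0, -1.0])
best_action = q.index(max(q))  # greedy choice among the candidate actions
target = td_target(reward=1.0, next_q=q, gamma=0.5)
loss = minibatch_loss(predicted=[1.0, 2.0], targets=[0.0, 2.0])
```

In a full training loop the loss would be differentiated with respect to the network weights by an automatic differentiation library, and transitions would be drawn at random from a replay buffer to form each mini-batch.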
Authors: MING Weiyu; LI Yan; CHENG Shijie; LONG Yu; XU Jing; WANG Shaorong (State Key Laboratory of Advanced Electromagnetic Engineering and Technology, Huazhong University of Science and Technology, Wuhan 430074, China)
Source: Automation of Electric Power Systems (《电力系统自动化》), indexed in EI, CSCD, and the Peking University Core Journals list; 2022, Issue 23, pp. 109-116 (8 pages)
Funding: National Key R&D Program of China, Key Special Project on Smart Grid Technology and Equipment (2017YFB0902800).
Keywords: consumer choice resource (CCR); deep reinforcement learning (DRL); dueling deep Q network (DDQN); Markov decision process (MDP); situation awareness; situation orientation