Abstract
To address the large delays and frequent disturbances of selective catalytic reduction (SCR) denitrification systems, a composite control strategy for SCR denitrification systems based on an improved twin delayed deep deterministic policy gradient (TD3) algorithm is proposed. First, an improved TD3 algorithm is proposed that combines multi-step temporal-difference (MSTD) learning with prioritized experience replay (PER). The algorithm uses MSTD to compute the return during policy updates and uses PER to select important experiences for learning, thereby improving the policy-learning ability of TD3 and accelerating the learning process. Second, a multi-dimensional state observation is designed that takes both the feedforward and feedback signals of the SCR denitrification system into account, realizing composite control and thereby keeping the outlet NOx concentration stable. Finally, simulation experiments show that the composite control strategy based on the MSTD-PER-TD3 algorithm more effectively suppresses the influence of inlet NOx concentration fluctuations on the outlet NOx concentration, and that it exhibits excellent disturbance rejection and robustness.
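The two ingredients named in the abstract, a multi-step return and prioritized sampling, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the hyperparameters `gamma`, `alpha`, `beta`, and `eps`, and the proportional form of the priority are assumptions taken from the standard formulations of n-step TD targets and prioritized experience replay. In TD3 the bootstrap value would be the smaller of the two target-critic estimates.

```python
import random


def td3_bootstrap(q1_target, q2_target):
    """Clipped double-Q value used by TD3: the smaller of the two target critics."""
    return min(q1_target, q2_target)


def n_step_target(rewards, bootstrap_value, gamma=0.99):
    """n-step TD target: sum_{k<n} gamma^k * r_k + gamma^n * bootstrap_value.

    `rewards` holds the n rewards observed after the sampled state
    (hypothetical argument layout, chosen for illustration).
    """
    target = bootstrap_value
    for r in reversed(rewards):
        target = r + gamma * target
    return target


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional PER: P(i) = p_i^alpha / sum_j p_j^alpha with p_i = |delta_i| + eps."""
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def per_sample(td_errors, batch_size, alpha=0.6, beta=0.4, rng=random):
    """Draw indices by priority and return importance-sampling weights.

    Weights are (N * P(i))^(-beta), normalized by the largest weight in the batch
    (an assumption; some implementations normalize over the whole buffer).
    """
    probs = per_probabilities(td_errors, alpha)
    n = len(probs)
    indices = rng.choices(range(n), weights=probs, k=batch_size)
    weights = [(n * probs[i]) ** (-beta) for i in indices]
    max_w = max(weights)
    return indices, [w / max_w for w in weights]
```

In a training loop along these lines, the critic loss for each sampled transition would be scaled by its importance-sampling weight, and the stored priority would be refreshed with the new absolute TD error after the update.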
Authors
赵征
全家乐
刘子涵
ZHAO Zheng; QUAN Jiale; LIU Zihan (School of Control and Computer Engineering, North China Electric Power University, Baoding 071003, China)
Source
Electric Power Science and Engineering (《电力科学与工程》)
2024, No. 11, pp. 70-78 (9 pages)
Funding
Supported by the Shenzhen Science and Technology Program (KCXFZ20201221173402007).
Keywords
TD3 algorithm
multi-step temporal difference
prioritized experience replay
SCR denitrification system
composite control strategy