Research on Composite Control of SCR Denitrification System Based on Improved Deep Reinforcement Learning


Abstract: Selective catalytic reduction (SCR) denitrification systems exhibit large time delays and are subject to many disturbances. To address these characteristics, a composite control strategy for the SCR denitrification system is proposed, based on an improved twin delayed deep deterministic policy gradient (TD3) algorithm. First, an improved TD3 algorithm is proposed that combines multi-step temporal-difference (MSTD) learning with prioritized experience replay (PER). When updating the policy, the algorithm uses MSTD to compute the return and uses PER to select important experiences for learning, which improves the policy-learning ability of TD3 and accelerates training. Second, a multi-dimensional state observation is designed that takes into account both the feedforward signal and the verification feedback signal of the SCR denitrification system, realizing composite control of the system and thereby keeping the outlet NOx concentration stable. Finally, simulation experiments show that the composite control strategy based on the MSTD-PER-TD3 algorithm more effectively overcomes the influence of inlet NOx concentration fluctuations on the outlet NOx concentration, and that it has excellent disturbance rejection and robustness.
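The two ingredients named in the abstract, a multi-step TD target and prioritized sampling, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the proportional priority form, and the hyperparameter values (`alpha`, `beta`, `eps`) are assumptions taken from common practice, and the importance weights are normalized by the batch maximum for simplicity.

```python
import random


def n_step_target(rewards, gamma, bootstrap_q, done):
    """Multi-step TD target: sum_{k<n} gamma^k * r_{t+k} + gamma^n * Q'(s_{t+n}, a').

    rewards     -- the n rewards collected from step t to t+n-1
    bootstrap_q -- target-critic value at s_{t+n}; in TD3 this would be
                   min(Q1', Q2') at the smoothed target action (assumed given)
    done        -- True if the episode ended within the n steps
    """
    target = 0.0
    for k, r in enumerate(rewards):
        target += (gamma ** k) * r
    if not done:
        target += (gamma ** len(rewards)) * bootstrap_q
    return target


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: p_i = (|delta_i| + eps)^alpha, P(i) = p_i / sum_j p_j."""
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def per_sample(td_errors, batch_size, alpha=0.6, beta=0.4, rng=random):
    """Sample buffer indices by priority and return importance-sampling weights.

    Weights are (N * P(i))^(-beta), scaled so the largest weight in the batch is 1
    (an assumed simplification; they could also be scaled over the whole buffer).
    """
    probs = per_probabilities(td_errors, alpha)
    n = len(probs)
    indices = rng.choices(range(n), weights=probs, k=batch_size)
    weights = [(n * probs[i]) ** (-beta) for i in indices]
    max_w = max(weights)
    return indices, [w / max_w for w in weights]
```

In a full training loop, the sampled indices would select transitions from the replay buffer, the multi-step target would replace the one-step target in the critic loss, and the importance weights would scale each sample's loss term before the priorities are refreshed with the new TD errors.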

Authors: ZHAO Zheng; QUAN Jiale; LIU Zihan (School of Control and Computer Engineering, North China Electric Power University, Baoding 071003, China)

Affiliation: School of Control and Computer Engineering, North China Electric Power University

Source: Electric Power Science and Engineering (《电力科学与工程》), 2024, No. 11, pp. 70-78 (9 pages)

Funding: Shenzhen Science and Technology Program (KCXFZ20201221173402007).

Keywords: TD3 algorithm; multi-step temporal difference; prioritized experience replay; SCR denitrification system; composite control strategy

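Classification: TP273 [Automation and Computer Technology: Detection Technology and Automatic Devices]; TK224 [Power Engineering and Engineering Thermophysics: Power Machinery and Engineering]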

