Abstract
To address the multiple disturbances, large time delay and large inertia of selective catalytic reduction (SCR) denitration systems, an ITD3-PI composite cascade control algorithm is proposed that combines an improved twin delayed deep deterministic policy gradient (ITD3) algorithm, a proportional-integral (PI) controller and a disturbance observer (DOB). First, drawing on the idea of PID control, the ITD3 algorithm generates a new environment state by differentiating and integrating the error between the setpoint and the measured value of the outlet NOx mass concentration, and stores the new state, the measured value and the error together in the experience pool. Second, the DOB estimates the disturbances in the denitration process and applies feedforward compensation. Finally, the ITD3-PI composite cascade control is compared experimentally with TD3-PI composite cascade control, composite cascade PID control and cascade PID control. The results show that the proposed method responds quickly, has small overshoot and rejects disturbances well, offering a new approach to applying reinforcement learning to SCR denitration systems.
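The PID-inspired state construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `PIDStateBuilder`, the function `store_sample`, the sampling period `dt`, the rectangular integration, the backward-difference derivative and the buffer size are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import deque


class PIDStateBuilder:
    """Hypothetical helper: builds a state (error, integral, derivative)
    from the setpoint and the measured outlet NOx concentration."""

    def __init__(self, dt):
        self.dt = dt              # assumed fixed sampling period
        self.integral = 0.0       # running integral of the error
        self.prev_error = None    # previous error, for the difference quotient

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        # Rectangular (Euler) integration of the error; an assumption.
        self.integral += error * self.dt
        # Backward difference; zero on the first sample (no history yet).
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (error, self.integral, derivative)


def store_sample(buffer, state, measurement, error):
    """Store the new state, the measured value and the error together,
    as the abstract describes for the experience pool."""
    buffer.append({"state": state, "measurement": measurement, "error": error})


# Usage with made-up numbers (setpoint 50.0, two consecutive measurements).
experience_pool = deque(maxlen=10000)  # buffer size is an assumption
builder = PIDStateBuilder(dt=1.0)
for measurement in (48.0, 49.0):
    state = builder.step(setpoint=50.0, measurement=measurement)
    store_sample(experience_pool, state, measurement, error=state[0])
```

In a full TD3-style agent each stored sample would also carry the action, reward and next state; those are omitted here because the abstract mentions only the state, the measured value and the error.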
Authors
CHEN Haowei (陈皓炜)
JIA Xinchun (贾新春)
SUN Xiaoming (孙小明)
HOU Pengfei (侯鹏飞)
School of Automation and Software Engineering, Shanxi University, Taiyuan 030013, China
Source
Journal of Chinese Society of Power Engineering (《动力工程学报》)
CAS
CSCD
PKU Core (北大核心)
2022, No. 5, pp. 421-428 (8 pages)
Funding
National Natural Science Foundation of China (U1610116)
Key Research and Development Program of Shanxi Province (201903D121145)
Keywords
reinforcement learning
SCR denitration system
composite cascade control
disturbance observer
NOx emission