Abstract
The dynamic behavior of electro-hydraulic servo systems is complex and variable, which makes it difficult to obtain an accurate dynamic model for their motion control. In this study, we take a high-precision electro-hydraulic servo simulation model as the research object, recast the position control problem as a sparse state-reward problem in reinforcement learning (RL), and tune the controller with an RL-based safe control method that uses a barrier function. Compared with traditional control methods, we directly optimize the state-space sparse reward together with a safety-barrier auxiliary reward to achieve data-based safe RL controller tuning; its preset safety lays a foundation for the practical application of RL control methods in industrial production. The results show that using the safety-barrier auxiliary reward term in sparse-reward optimization ensures convergence of the algorithm while effectively achieving the steady-state safe control objective. The effectiveness of the proposed safe RL control method is demonstrated on the position control problem of a high-precision nonlinear polynomial simulation model of an electro-hydraulic servo system.
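As a rough illustration of the reward structure the abstract describes, the sketch below combines a sparse state reward with a barrier-function auxiliary term. It is not the authors' formulation: the function names, the log-barrier form, the tolerance, the weight, the safe bounds, and the finite violation penalty are all assumptions chosen only to make the idea concrete.

```python
import math


def sparse_reward(position, target, tolerance=0.01):
    """Sparse state reward (assumed form): 1.0 only inside the target band."""
    return 1.0 if abs(position - target) <= tolerance else 0.0


def barrier_penalty(position, lower, upper, weight=0.1, violation_penalty=-100.0):
    """Log-barrier auxiliary reward (assumed form).

    Returns 0.0 at the midpoint of the safe interval, becomes increasingly
    negative as the position approaches either bound, and falls back to a
    fixed finite penalty outside the interval.
    """
    if not (lower < position < upper):
        return violation_penalty
    half = (upper - lower) / 2.0
    # Normalised distances to each bound; their product is 1 at the midpoint
    # and tends to 0 at either bound, so the logarithm is never positive.
    d_low = (position - lower) / half
    d_up = (upper - position) / half
    return weight * math.log(d_low * d_up)


def shaped_reward(position, target, lower, upper):
    """Sparse reward plus the barrier auxiliary term."""
    return sparse_reward(position, target) + barrier_penalty(position, lower, upper)
```

In this sketch the barrier term supplies a dense signal that steers the state away from the assumed bounds even when the sparse term is zero, which is one plausible reading of how an auxiliary safety reward can help a sparse-reward problem converge.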
Authors
唐逸凡
余臻
刘利军
TANG Yifan; YU Zhen; LIU Lijun (School of Aerospace Engineering, Xiamen University, Xiamen 361102, China; Shenzhen Research Institute of Xiamen University, Shenzhen 518057, China)
Source
《厦门大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journals
2022, No. 2, pp. 239-245 (7 pages)
Journal of Xiamen University:Natural Science
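Funding
National Natural Science Foundation of China (61304110)
Natural Science Foundation of Fujian Province (2020J01052)
Shenzhen Basic Research Program (JCYJ20190809163009630)
Innovation Fund of the AECC Research Institute (HKCK2020-02-029)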
Keywords
electro-hydraulic servo system
safety control
reinforcement learning
barrier function