Abstract
This paper presents an attitude controller for fixed-wing aircraft based on imitation reinforcement learning (IRL). First, a dataset of expert demonstrations is built, and the control network parameters are initialized by behavior cloning. The control network is then trained in a hybrid mode that combines reinforcement learning with supervised learning: reward shaping and supervised learning on the expert dataset guide the reinforcement learning algorithm toward fast convergence, so that the controller's attitude response is optimized while remaining consistent with expert experience. The network takes state variables such as attitude angle error and angular velocity as input, and outputs commands to the inner control augmentation system (CAS). Simulation results show that the IRL controller achieves a fast attitude angle response under different initial conditions and agrees with the expert data.
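The two training stages described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the linear policy standing in for the control network, the hypothetical expert gains (-2.0, -0.5), the function names, and the `bc_weight` mixing factor. The abstract does not name the reinforcement learning algorithm, so its gradient is passed in as an input rather than computed.

```python
# Sketch only: a linear policy stands in for the paper's control network.

def policy(weights, state):
    """Command = weighted sum of state variables (attitude error, angular rate)."""
    return sum(w * s for w, s in zip(weights, state))

def bc_gradient(weights, batch):
    """Gradient of the mean squared error between policy and expert commands."""
    grad = [0.0] * len(weights)
    for state, expert_cmd in batch:
        err = policy(weights, state) - expert_cmd
        for i, s in enumerate(state):
            grad[i] += 2.0 * err * s / len(batch)
    return grad

def behavior_cloning(expert_data, n_inputs, lr=0.1, epochs=300):
    """Stage 1: initialize the policy by pure supervised learning."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        grad = bc_gradient(weights, expert_data)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights

def hybrid_step(weights, rl_grad, expert_batch, lr=0.01, bc_weight=0.5):
    """Stage 2: one update on (RL loss + bc_weight * imitation loss).

    rl_grad is the gradient of whatever RL loss is in use (assumed given).
    """
    sup_grad = bc_gradient(weights, expert_batch)
    return [w - lr * (g_rl + bc_weight * g_sup)
            for w, g_rl, g_sup in zip(weights, rl_grad, sup_grad)]

# Hypothetical expert: command = -2.0 * angle_error - 0.5 * angular_rate.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
expert_data = [((e, q), -2.0 * e - 0.5 * q) for e in grid for q in grid]

weights = behavior_cloning(expert_data, n_inputs=2)
```

In this sketch the supervised term keeps pulling the policy toward the expert commands during the second stage, which is one plausible reading of how the expert dataset "guides" the reinforcement learning update; the relative weight of the two terms is a tuning choice.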
Authors
FU Yupeng (付宇鹏)
DENG Xiangyang (邓向阳)
ZHU Ziqiang (朱子强)
FANG Jun (方君)
YU Yingfu (余应福)
YAN Wenjun (闫文君)
ZHANG Limin (张立民)
Affiliation: Naval Aviation University, Yantai, Shandong 264001, China
Source
Journal of Naval Aviation University (《海军航空大学学报》)
2022, No. 5, pp. 393-399 (7 pages)
Keywords
behavior cloning
reinforcement learning
attitude control