Abstract
For flight control systems subject to faults, this paper proposes a reinforcement learning fault-tolerant control method based on an incremental strategy. Using the system state values measured by sensors and a preset reward function, the method makes the optimal decision for the current condition of the control system while continuously updating the value network. This recasts the fault-tolerant control process as the sequential decision-making process of a reinforcement learning agent. An improved incremental strategy is then used to gradually approach the correct compensation for the current fault. In addition, for continuous control systems, a state transition prediction network is proposed to obtain the next state value. Finally, the effectiveness of the method is verified on the aircraft fault diagnosis experimental platform of the Key Laboratory of Advanced Aircraft Navigation, Control and Health Management (Ministry of Industry and Information Technology), Nanjing University of Aeronautics and Astronautics.
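The incremental idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a scalar actuator bias fault, a residual quantised into steps of a hypothetical size `DELTA`, a Q-table in place of the value network, and a hand-written `step` function in place of both the plant and the state transition prediction network. All names and parameter values are illustrative assumptions.

```python
import random

DELTA = 0.1           # size of one compensation increment (assumed)
N = 10                # residual is quantised to indices -N..N (assumed)
ACTIONS = (-1, 0, 1)  # decrease, hold, or increase the compensation by DELTA
GAMMA = 0.9           # discount factor (assumed)


def step(idx, action):
    """Toy environment: the residual index moves by the chosen increment."""
    nxt = max(-N, min(N, idx + action))
    reward = -abs(nxt) * DELTA  # penalise the remaining tracking residual
    return nxt, reward


def train(episodes=2000, horizon=40, seed=0):
    """Off-policy tabular Q-learning with a uniformly random behaviour policy."""
    rng = random.Random(seed)
    q = {(i, a): 0.0 for i in range(-N, N + 1) for a in ACTIONS}
    for _ in range(episodes):
        idx = rng.randint(-N, N)  # random initial residual
        for _ in range(horizon):
            a = rng.choice(ACTIONS)
            nxt, r = step(idx, a)
            best_next = max(q[(nxt, b)] for b in ACTIONS)
            # A learning rate of 1 is valid because this toy model is deterministic.
            q[(idx, a)] = r + GAMMA * best_next
            idx = nxt
    return q


def compensate(q, fault, steps=20):
    """Greedily adjust the compensation; return its value after every step."""
    comp_idx = 0
    fault_idx = round(fault / DELTA)
    history = []
    for _ in range(steps):
        residual = max(-N, min(N, fault_idx + comp_idx))
        a = max(ACTIONS, key=lambda b: q[(residual, b)])
        comp_idx += a  # the compensation changes by at most one increment
        history.append(comp_idx * DELTA)
    return history


q_table = train()
trace = compensate(q_table, fault=0.6)
print(trace[:8])
```

In this sketch the agent never observes the fault value itself. It only sees the quantised residual and moves the compensation one increment at a time until the residual vanishes, which is the sense in which the compensation is approached gradually.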
Authors
REN Jian (任坚); LIU Jian-wei (刘剑慰); YANG Pu (杨蒲)
College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 211106, China
Source
Control Theory & Applications (《控制理论与应用》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
2020, No. 7, pp. 1429-1438 (10 pages)
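Funding
Fund of the Key Laboratory of Civil Aircraft Health Monitoring and Intelligent Maintenance (NJ2018012)
Project of the Key Laboratory of Advanced Aircraft Navigation, Control and Health Management, Ministry of Industry and Information Technology (Nanjing University of Aeronautics and Astronautics)
Fundamental Research Funds for the Central Universities (NS2017017)
National Natural Science Foundation of China (61533008, 61490703).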