Abstract: Reinforcement Learning (RL) algorithms enhance the intelligence of air combat Autonomous Maneuver Decision (AMD) policies, but they may underperform in target combat environments with disturbances. To enhance the robustness of the AMD strategy learned by RL, this study proposes a Tube-based Robust RL (TRRL) method. First, this study introduces a tube to describe reachable trajectories under disturbances, formulates a method for calculating tubes based on sum-of-squares programming, and proposes the TRRL algorithm, which enhances robustness by using tube size as a quantitative indicator. Second, this study introduces offline techniques for regressing the tube size function and establishing a tube library before policy learning, aiming to eliminate complex online tube solving and reduce the computational burden during training. Furthermore, an analysis of the tube library demonstrates that the mitigated AMD strategy achieves greater robustness, as smaller tube sizes correspond to more cautious actions. This finding highlights that TRRL enhances robustness by promoting a conservative policy. To effectively balance aggressiveness and robustness, the proposed TRRL algorithm introduces a "laziness factor" as a weight of robustness. Finally, combat simulations in an environment with disturbances confirm that the AMD policy learned by the TRRL algorithm exhibits superior air combat performance compared to selected robust RL baselines.
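The trade-off between aggressiveness and robustness described in this abstract can be illustrated with a small sketch. It is one plausible reading, not the paper's formulation: it assumes the tube size enters the reward as a penalty scaled by the laziness factor, and it stands in a tiny lookup table for the offline tube library. The action names, tube sizes, and the functions `shaped_reward` and `best_action` are all hypothetical.

```python
from typing import Dict

# Hypothetical offline "tube library": a precomputed tube size per action.
# The abstract says smaller tubes correspond to more cautious actions.
TUBE_LIBRARY: Dict[str, float] = {
    "hard_turn": 2.0,
    "gentle_turn": 0.5,
    "level_flight": 0.1,
}


def shaped_reward(task_reward: float, action: str, laziness: float) -> float:
    """Task reward minus a tube-size penalty weighted by the laziness factor.

    Assumption: the penalty form is illustrative, not taken from the paper.
    """
    return task_reward - laziness * TUBE_LIBRARY[action]


def best_action(task_rewards: Dict[str, float], laziness: float) -> str:
    """Pick the action with the highest shaped reward."""
    return max(
        task_rewards,
        key=lambda a: shaped_reward(task_rewards[a], a, laziness),
    )


# Made-up task rewards: the aggressive maneuver pays most when undisturbed.
TASK_REWARDS = {"hard_turn": 1.0, "gentle_turn": 0.6, "level_flight": 0.1}

if __name__ == "__main__":
    for laziness in (0.0, 0.5, 5.0):
        print(laziness, best_action(TASK_REWARDS, laziness))
```

Under these made-up numbers, raising the laziness factor moves the preferred action from the aggressive maneuver toward the most cautious one, which matches the abstract's observation that the robust policy is the more conservative one.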
Funding: Supported by the National Natural Science Foundation of China (U1909201) and the Hong Kong Polytechnic University Research Program (SB2D).
Abstract: Optimal voltage controls have been widely applied in wind farms to maintain the voltage stability of power grids. To achieve optimal voltage operation, authentic grid information is needed in the sensing and actuating processes. However, this dependence may leave the system vulnerable to malicious cyber-attacks. To this end, a tube model predictive control-based, cyber-attack-resilient optimal voltage control method is proposed to achieve voltage stability against malicious cyber-attacks. The proposed method consists of two cascaded model predictive controllers (MPCs), which outperform peer control methods in alleviating the adverse effects of cyber-attacks on the system's actuators and sensors. Finally, the efficiency of the proposed method is evaluated in sensor and actuator cyber-attack cases based on a modified IEEE 14-bus system and an IEEE 118-bus system. Index Terms: attack-resilient control, optimal voltage control, tube-based model predictive control, wind farm-connected power system.
Abstract: In this paper, optimal tracking control for robotic manipulators with state constraints and uncertain dynamics is investigated, and a sliding mode-based adaptive tube model predictive control method is proposed. First, using the high-order fully actuated system approach, the nominal model of the robotic manipulator is constructed as the predictive model. Based on the nominal model, a nominal model predictive controller with a sliding mode is designed, which relaxes the terminal constraints and achieves accurate and stable tracking of the desired trajectory by the nominal system. Then, an auxiliary controller based on node-adaptive neural networks is constructed to dynamically compensate for the nonlinear uncertain dynamics of the robotic manipulator. Furthermore, the deviation between the nominal and actual states is confined to the tube invariant sets. At the same time, the recursive feasibility of the nominal model predictive control is verified, and the uniform ultimate boundedness of all variables is proved according to the Lyapunov theorem. Finally, experiments show that the robotic manipulator can achieve fast and efficient trajectory tracking under the proposed method.
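The tube idea in this abstract, keeping the deviation between the nominal and actual states inside an invariant set, can be shown with a minimal scalar sketch. It is a generic tube-control illustration and not the paper's method: a plain linear feedback gain replaces the node-adaptive neural-network auxiliary controller, a simple proportional law replaces the sliding-mode nominal MPC, and the function name `simulate_tube` and all numerical values are assumptions.

```python
import random


def simulate_tube(a=1.1, b=1.0, k=-0.6, w_max=0.1, steps=200, seed=0):
    """Simulate x+ = a*x + b*u + w with |w| <= w_max, using u = v + k*(x - z).

    z is the disturbance-free nominal state and v the nominal input.
    The error e = x - z then obeys e+ = (a + b*k)*e + w, so with e0 = 0
    it stays within w_max / (1 - |a + b*k|), the tube radius.
    Returns (largest observed |x - z|, tube radius).
    """
    rng = random.Random(seed)
    a_cl = a + b * k  # closed-loop error dynamics coefficient
    if abs(a_cl) >= 1.0:
        raise ValueError("feedback gain k must make the error dynamics stable")
    tube_radius = w_max / (1.0 - abs(a_cl))

    x = z = 1.0  # actual and nominal states start together
    max_err = 0.0
    for _ in range(steps):
        v = -0.5 * z  # stand-in nominal controller (assumption, not an MPC)
        u = v + k * (x - z)  # auxiliary feedback pulls x back toward z
        w = rng.uniform(-w_max, w_max)  # bounded disturbance
        x = a * x + b * u + w  # actual (disturbed) system
        z = a * z + b * v  # nominal (disturbance-free) system
        max_err = max(max_err, abs(x - z))
    return max_err, tube_radius


if __name__ == "__main__":
    max_err, radius = simulate_tube()
    print(f"largest deviation {max_err:.4f} within tube radius {radius:.4f}")
```

Because the nominal trajectory is planned without disturbances and the auxiliary feedback bounds the deviation, constraints can be tightened by the tube radius once, offline, rather than re-solved for every disturbance realization.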