Abstract: The missile interception problem can be regarded as a two-person zero-sum differential game, whose solution depends on the Hamilton-Jacobi-Isaacs (HJI) equation. It has been proved impossible to obtain a closed-form solution due to the nonlinearity of the HJI equation, and many iterative algorithms have been proposed to solve it. The simultaneous policy updating algorithm (SPUA) is an effective algorithm for solving the HJI equation, but it is an on-policy integral reinforcement learning (IRL) method: its online implementation requires the disturbance signals to be adjustable, which is unrealistic. In this paper, an off-policy IRL algorithm based on SPUA is proposed that makes no use of any knowledge of the system dynamics. A neural-network-based online adaptive critic implementation scheme of the off-policy IRL algorithm is then presented. Based on the online off-policy IRL method, a computational intelligence interception guidance (CIIG) law is developed for intercepting highly maneuvering targets. As a model-free method, it achieves interception by measuring system data online. The effectiveness of the CIIG law is verified through two missile-target engagement scenarios.
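For the special case of linear dynamics, the HJI equation reduces to a game algebraic Riccati equation, and the simultaneous-updating idea behind SPUA can be illustrated directly: both players' policies are improved from the same value matrix at every step. The sketch below is a model-based illustration with hypothetical system matrices, not the paper's off-policy neural-network scheme, which would replace the Lyapunov solves with least squares on measured data.

```python
import numpy as np

# Hypothetical linear zero-sum game: dx/dt = A x + B u + D w, with cost
# J = integral of x'Qx + u'Ru - gamma^2 w'w. All matrices are illustrative.
A = np.array([[0.0, 1.0], [-1.0, -2.0]])   # open-loop stable
B = np.array([[0.0], [1.0]])               # control (minimizing player)
D = np.array([[0.0], [1.0]])               # disturbance (maximizing player)
Q = np.eye(2)
R = np.array([[1.0]])
gamma = 5.0

def lyap(Ac, Qc):
    """Solve Ac' P + P Ac + Qc = 0 via Kronecker vectorization."""
    n = Ac.shape[0]
    M = np.kron(np.eye(n), Ac.T) + np.kron(Ac.T, np.eye(n))
    P = np.linalg.solve(M, -Qc.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

# Simultaneous policy updating: both gains are refreshed from the same P.
K = np.zeros((1, 2))  # control gain, u = -K x
L = np.zeros((1, 2))  # disturbance gain, w = L x
for _ in range(100):
    Ac = A - B @ K + D @ L
    Qc = Q + K.T @ R @ K - gamma**2 * (L.T @ L)
    P = lyap(Ac, Qc)                       # policy evaluation
    K = np.linalg.solve(R, B.T @ P)        # minimizing player update
    L = (D.T @ P) / gamma**2               # maximizing player update

# Residual of the game algebraic Riccati equation (should be ~0).
gare = A.T @ P + P @ A + Q \
     - P @ (B @ np.linalg.solve(R, B.T) - D @ D.T / gamma**2) @ P
print(np.linalg.norm(gare))
```

With a stabilizing initial pair (here K = L = 0 suffices, since A is Hurwitz), the iteration converges to the stabilizing solution of the game Riccati equation; the off-policy variant in the paper obtains the same fixed point without knowing A, B, or D.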
Funding: Supported by the National Natural Science Foundation of China (No. 62111530051), the Fundamental Research Funds for the Central Universities (No. 3102017JC06002), and the Shaanxi Science and Technology Program, China (No. 2017KW-ZD-04).
Abstract: The libration control problem of a space tether system (STS) after payload capture is studied. The capture process causes the tether to swing and deviate from its nominal position, which can lead to failure of the capture mission. Because the inertial parameters are unknown after the payload is captured, an adaptive optimal control scheme based on policy iteration is developed to stabilize the uncertain dynamic system in the post-capture phase. By introducing an integral reinforcement learning (IRL) scheme, the algebraic Riccati equation (ARE) can be solved online without known dynamics. To avoid the computational burden of the iteration equations, an online implementation of the policy iteration algorithm is provided via a least-squares solution method. Finally, the effectiveness of the algorithm is validated by numerical simulations.
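The core of the IRL scheme is the policy-evaluation step: the value matrix P of a fixed stabilizing policy is identified by least squares from measured trajectory data, so the system matrix A never appears in the computation. The sketch below uses a hypothetical second-order plant (A is used only to simulate "measurements"; the least-squares regression itself never touches it) and recovers P from the interval Bellman equation.

```python
import numpy as np

# Hypothetical plant (used only to generate data; the least-squares
# evaluation below never uses A).
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K = np.array([[1.0, 1.0]])        # fixed stabilizing policy u = -K x

def rk4_step(z, dt):
    """Integrate [x; running cost] one step under u = -K x."""
    def f(z):
        x = z[:2]
        xdot = (A - B @ K) @ x
        cost = x @ (Q + K.T @ R @ K) @ x
        return np.append(xdot, cost)
    k1 = f(z); k2 = f(z + 0.5*dt*k1); k3 = f(z + 0.5*dt*k2); k4 = f(z + dt*k3)
    return z + dt/6.0 * (k1 + 2*k2 + 2*k3 + k4)

def phi(x):
    """Quadratic basis, chosen so that x' P x = theta . phi(x)."""
    return np.array([x[0]**2, x[0]*x[1], x[1]**2])

# IRL Bellman equation over an interval [t, t+T]:
#   x(t)' P x(t) - x(t+T)' P x(t+T) = integral of x'(Q + K'RK)x over [t, t+T]
rows, targets = [], []
T, dt = 0.5, 0.005
for x0 in [(1, 0), (0, 1), (1, 1), (1, -1), (2, 1)]:
    z = np.array([*x0, 0.0], dtype=float)
    x_start = z[:2].copy()
    for _ in range(int(T / dt)):
        z = rk4_step(z, dt)
    rows.append(phi(x_start) - phi(z[:2]))
    targets.append(z[2])          # accumulated cost over the interval

theta = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)[0]
P = np.array([[theta[0], theta[1]/2], [theta[1]/2, theta[2]]])
# A policy-improvement step would then use K_next = inv(R) @ B.T @ P,
# requiring only the input matrix B, not A.
print(P)
```

Alternating this evaluation with the improvement step reproduces policy iteration on the ARE entirely from online data, which is the mechanism the abstract's least-squares implementation exploits.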
Funding: Supported by the National Key R&D Program of China (No. 2018YFB1308400) and the Natural Science Foundation of Zhejiang Province (No. LY21F030018).
Abstract: To help the operator perform human-robot collaboration tasks and to optimize task performance, an adaptive control method based on optimal admittance parameters is proposed. An overall control structure with an inner loop and an outer loop is first established; the inner loop handles robot control and the outer loop handles task optimization. An inner-loop robot controller integrating a barrier Lyapunov function and radial basis function neural networks is then proposed, which makes the robot, despite its unknown dynamics, safely behave like a prescribed admittance model as sensed by the operator. Subsequently, the optimal parameters of the robot admittance model are obtained in the outer loop to minimize the task tracking error and the interaction force. The optimization of the admittance model is transformed into a linear quadratic regulator (LQR) problem by constructing a model of the human-robot collaboration system that captures the unknown dynamics of the operator and the details of the task. To relax the requirement for this system model, integral reinforcement learning is employed to solve the LQR problem. In addition, an auxiliary force is designed to help the operator complete the specific task more effectively. Compared with a traditional control scheme, the safety and interaction performance of the human-robot collaboration system are improved. The effectiveness of the proposed method is verified through two numerical simulations, and a practical human-robot collaboration experiment further demonstrates its performance.
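The prescribed admittance model that the inner loop enforces is essentially a force-to-motion filter: the measured interaction force drives a virtual mass-damper-spring whose motion the robot then tracks. The sketch below shows this filter in one dimension; the parameter values are illustrative placeholders, not values from the paper, where the outer-loop LQR would select them.

```python
# Prescribed admittance model  M*xddot + D*xdot + K*x = f_h :
# the inner-loop controller makes the robot track the reference motion
# that this model produces in response to the measured human force f_h.
# M, D, K below are illustrative; the paper's outer loop would choose
# them by solving the LQR problem.
M, D, K = 2.0, 8.0, 10.0

def admittance_step(x, xdot, f_h, dt):
    """One semi-implicit Euler update of the admittance reference."""
    xddot = (f_h - D * xdot - K * x) / M
    xdot = xdot + dt * xddot
    x = x + dt * xdot
    return x, xdot

# Example: a constant 5 N push; the reference settles near f_h / K = 0.5 m,
# so a steady push yields a proportional, compliant displacement.
x, xdot, dt = 0.0, 0.0, 0.001
for _ in range(10000):               # simulate 10 s
    x, xdot = admittance_step(x, xdot, 5.0, dt)
print(round(x, 3))
```

Softer parameters (smaller K, D) make the robot yield more for the same force, which is why tuning them optimally against tracking error and interaction force, as the abstract describes, directly shapes how the collaboration feels to the operator.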