Funding: Supported in part by the National Natural Science Foundation of China (61922063, 62273255, 62150026), in part by the Shanghai International Science and Technology Cooperation Project (21550760900, 22510712000), the Shanghai Municipal Science and Technology Major Project (2021SHZDZX0100), and the Fundamental Research Funds for the Central Universities.
Abstract: Dear Editor, In this letter, the multi-objective optimal control problem of nonlinear discrete-time systems is investigated. A data-driven policy gradient algorithm is proposed in which the action-state value function is used to evaluate the policy. In the policy improvement process, a policy-gradient-based method is employed.
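The evaluation and improvement steps named in the abstract can be written in a generic textbook form, sketched below. This is not the letter's own formulation, which the abstract does not give. The dynamics f, the stage costs r_i, the discount factor \gamma, the step size \alpha, the policy parameters \theta, and in particular the weights w_i used to combine the objectives are all assumptions introduced for illustration.

```latex
% Assumed setting: x_{k+1} = f(x_k, u_k), with m stage costs r_i to be minimized
% and a deterministic parameterized policy u_k = \pi_\theta(x_k).
\begin{align}
  % Policy evaluation: action-state value function of objective i under policy \pi
  Q_i^{\pi}(x_k, u_k) &= r_i(x_k, u_k)
      + \gamma\, Q_i^{\pi}\bigl(x_{k+1}, \pi(x_{k+1})\bigr),
      \qquad i = 1, \dots, m, \\
  % Policy improvement: gradient step on a weighted sum of the objectives
  % (the weighted-sum scalarization is an assumption, not taken from the letter)
  \theta &\leftarrow \theta - \alpha \sum_{i=1}^{m} w_i\,
      \nabla_\theta \pi_\theta(x_k)^{\top}\,
      \nabla_u Q_i^{\pi_\theta}(x_k, u)\Big|_{u = \pi_\theta(x_k)} .
\end{align}
```

In a data-driven setting, the first relation would typically be fitted from measured tuples (x_k, u_k, x_{k+1}) without a model of f, and the second would then use the fitted value function in place of the true one.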