An adaptive proportional–integral–derivative (PID) controller based on the Q-learning algorithm is proposed to balance the cart–pole system in a simulation environment. The controller was trained using Q-learning, and the learned Q-tables were used to change the gains of linear PID controllers according to the state of the system during the control process. The controller was trained from a set of fixed initial positions and was able to balance the system from a series of initial positions different from those used in training, achieving equivalent or even better performance than the conventional PID controller and a controller that uses only Q-learning. This indicates the advantage of the adaptive PID controller based on Q-learning both in generality, balancing the cart–pole system from a relatively wide range of initial positions, and in stabilisability, achieving a smaller steady-state error.
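The gain-adaptation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the state discretisation (`ANGLE_EDGES`), the candidate gain sets (`GAIN_SETS`), the learning parameters, and the names `discretise`, `q_update`, and `AdaptivePID` are all assumptions made for illustration, and the abstract does not say which state variables or gain values were actually used. The sketch assumes each Q-table action selects one (Kp, Ki, Kd) triple and that the state is the discretised pole angle alone.

```python
import numpy as np

# Hypothetical candidate gain sets (Kp, Ki, Kd); not taken from the paper.
GAIN_SETS = [(10.0, 0.0, 1.0), (30.0, 0.5, 3.0), (60.0, 1.0, 6.0)]
# Hypothetical bin edges (rad) for discretising the pole angle.
ANGLE_EDGES = np.array([-0.1, -0.02, 0.02, 0.1])


def discretise(angle):
    """Map a pole angle to a state index in 0..len(ANGLE_EDGES)."""
    return int(np.digitize(angle, ANGLE_EDGES))


def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.99):
    """Standard tabular Q-learning update, applied in place."""
    td_target = reward + gamma * np.max(q[s_next])
    q[s, a] += alpha * (td_target - q[s, a])


class AdaptivePID:
    """PID controller whose gains are chosen from a Q-table (assumed design)."""

    def __init__(self, q, dt=0.02):
        self.q, self.dt = q, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def control(self, angle):
        """Pick gains greedily from the Q-table, then apply the PID law."""
        kp, ki, kd = GAIN_SETS[int(np.argmax(self.q[discretise(angle)]))]
        error = -angle  # setpoint is the upright position (angle = 0)
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        # The sign convention of the returned force depends on the simulator.
        return kp * error + ki * self.integral + kd * derivative
```

During training, `q_update` would be called after each simulation step with a reward chosen by the designer (for example, a penalty on the pole angle); during control, `AdaptivePID.control` only reads the table. Choosing among a small set of fixed gain triples keeps the action space discrete, which is what tabular Q-learning requires.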