Abstract
Balancing a continuous inverted pendulum system with reinforcement learning has long been an open problem. This paper combines Q-learning with a BP (backpropagation) neural network and the sigmoid activation function and, exploiting the generalization ability of neural networks, designs a new learning control strategy. Through iteration and learning, the strategy handles not only the continuous state space at the input of the inverted pendulum system but also the continuous action space at the output. The method is applied to the balance control of a continuous inverted pendulum, and Matlab simulation experiments based on a model of the actual control system indicate that it is feasible. The method further improves the practical value of reinforcement learning theory in real control systems.
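The abstract does not give the network structure, reward, or action-selection rule, so the following is only a minimal sketch of the general idea: a one-hidden-layer BP network with sigmoid units approximates Q(state, action) over continuous inputs and is trained by backpropagation on the Q-learning target. Every name and number here (`QNetwork`, `greedy_action`, `q_update`, the force limit `f_max`, the layer sizes, the learning rate) is a hypothetical choice for illustration. In particular, picking the action by scanning a grid of candidate forces is a stand-in assumption, not the paper's way of producing continuous actions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class QNetwork:
    """One-hidden-layer BP network approximating Q(state, action) (illustrative)."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each weight row carries one extra entry for the bias term.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = hidden + [1.0]
        q = sum(w * v for w, v in zip(self.w2, hb))  # linear output unit
        return q, xb, hb

    def train_step(self, x, target, lr=0.1):
        """One backpropagation step on the squared error 0.5 * (target - q)^2."""
        q, xb, hb = self.forward(x)
        err = target - q
        # Update hidden weights first so they use the pre-update output weights.
        for j, row in enumerate(self.w1):
            delta = err * self.w2[j] * hb[j] * (1.0 - hb[j])
            for i in range(len(row)):
                row[i] += lr * delta * xb[i]
        for j in range(len(self.w2)):
            self.w2[j] += lr * err * hb[j]
        return err


def greedy_action(net, state, f_max=10.0, n_candidates=21):
    """Assumed action selection: scan candidate forces in [-f_max, f_max]."""
    candidates = [-f_max + 2.0 * f_max * k / (n_candidates - 1)
                  for k in range(n_candidates)]
    return max(candidates,
               key=lambda a: net.forward(list(state) + [a / f_max])[0])


def q_update(net, state, action, reward, next_state, done,
             gamma=0.95, f_max=10.0):
    """Q-learning target r + gamma * max_a' Q(s', a'), fitted by backpropagation."""
    if done:
        target = reward
    else:
        a_next = greedy_action(net, next_state, f_max)
        target = reward + gamma * net.forward(list(next_state) + [a_next / f_max])[0]
    return net.train_step(list(state) + [action / f_max], target)
```

In a full controller, `q_update` would be called once per simulation step with the pendulum state (for example cart position, cart velocity, pole angle, and angular velocity, which is again an assumption about the state layout) and a reward that penalizes the pole falling.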
Source
Computer Simulation (《计算机仿真》)
CSCD
2006, No. 4, pp. 298-300, 325 (4 pages)
Keywords
Reinforcement learning
Neural network
Activation function
Generalization performance
Continuous action space