Abstract
Deep reinforcement learning is a major research focus in artificial intelligence. The principle of optimality in dynamic programming is key to the success of reinforcement learning methods. This paper first presents the principle of adaptive dynamic programming (ADP) in place of direct dynamic programming (DP) and develops the inherent relationship between ADP and deep reinforcement learning. Next, analytics intelligence is discussed as a necessary requirement for real reinforcement learning. Finally, the principle of parallel dynamic programming, which integrates dynamic programming and analytics intelligence, is presented as the future of computational intelligence.
© 2014 Chinese Association of Automation.
Funding
Supported by the National Natural Science Foundation of China (61533019, 61374105, 71232006, 61233001, 71402178).