5 Jin Xiao. IEEE Conference on Evolutionary Computation, 1996, p. 366.
6 Maaref H, Barret C. Sensor-Based Navigation of a Mobile Robot in an Indoor Environment. Robotics and Autonomous Systems, 2002, 38: 1-18.
7 Hauskrecht M, Meuleau N, Boutilier C, et al. Hierarchical Solution of Markov Decision Processes Using Macro-Actions. In: Cooper G F, Moral S, eds. Proc of the 14th Conference on Uncertainty in Artificial Intelligence. San Francisco, USA: Morgan Kaufmann, 1998: 220-229.
8 Wiering M, Schmidhuber J. HQ-Learning. Adaptive Behavior, 1997, 6(2): 219-246.
9 Barto A G, Mahadevan S. Recent Advances in Hierarchical Reinforcement Learning. Discrete Event Dynamic Systems: Theory and Applications, 2003, 13: 41-77.
10 Sutton R, Precup D, Singh S. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning. Artificial Intelligence, 1999, 112: 181-211.