
柔性机器人多层启发式动态规划平衡认知研究

Research on Balance Cognition Based on Multi-level Heuristic Dynamic Programming of Flexible Robot
Abstract: Aiming at the stable self-balancing cognition problem of flexible self-balancing robots, a balance cognition method based on a multi-level heuristic dynamic programming (MlHDP) model is proposed and applied to the self-balance learning of a flexible self-balancing robot. In the proposed method, the original discrete reward mechanism is transformed into a continuous form by introducing an orientational reward module, and the converted continuous reward signal serves as the main basis for evaluation. This scheme enables the robot to record more information during the autonomous cognition process and thereby improves its cognitive ability. Self-balancing cognition experiments show that the robot retains good cognitive ability even with flexible joints, and that its learning performance and robustness surpass those of traditional methods.
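The abstract describes an HDP-style learning scheme in which the usual discrete failure signal is replaced by a continuous "orientational" reward. The following is a minimal Python sketch of that idea only: the quadratic reward shape, the tilt threshold, the state layout, and the single linear critic are all hypothetical stand-ins, since the record does not specify the paper's actual MlHDP network structure or reward form.

```python
import numpy as np

# Hypothetical continuous "orientational" reward. A classic discrete scheme
# returns r = -1 only when the robot falls; this sketch assumes a smooth
# quadratic penalty over tilt angle theta and tilt rate dtheta instead
# (the exact form used in the paper is not given in this record).
def orientational_reward(theta, dtheta, theta_max=np.radians(12.0)):
    if abs(theta) >= theta_max:          # fall: keep the discrete penalty
        return -1.0
    return -(0.8 * (theta / theta_max) ** 2 + 0.2 * dtheta ** 2)

# Generic one-step HDP critic update: a linear critic J_hat(x) = w @ x is
# trained toward the Bellman target r + gamma * J_hat(x_next). The paper's
# multi-level network is replaced here by this one-layer stand-in.
def critic_update(w, x, x_next, r, gamma=0.95, lr=0.01):
    td_error = r + gamma * (w @ x_next) - (w @ x)
    return w + lr * td_error * x, td_error

# Toy usage; assumed state layout: [theta, dtheta, position, velocity]
w = np.zeros(4)
x = np.array([0.05, -0.10, 0.0, 0.0])
x_next = np.array([0.04, -0.08, 0.0, 0.0])
r = orientational_reward(x[0], x[1])
w, delta = critic_update(w, x, x_next, r)
```

The design point the abstract makes is visible here: a graded signal of this kind changes on every step, so the critic receives useful evaluative information throughout a trial, whereas a fall/no-fall flag conveys nothing until failure.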
Author: 陈静 (Chen Jing)
Source: Journal of System Simulation (《系统仿真学报》; CAS, CSCD, Peking University Core Journal), 2018, No. 1, pp. 147-155 (9 pages)
Funding: National Natural Science Foundation of China Youth Fund (61403282); Tianjin Higher Education Science and Technology Development Fund (20130807); Tianjin University of Technology and Education university-level project (KJY1311)
Keywords: heuristic dynamic programming (HDP); flexible self-balancing robot; cognitive model; internal reward