Proximal policy optimization with an integral compensator for quadrotor control (Cited by: 3)

Abstract: We use the proximal policy optimization (PPO) reinforcement learning algorithm to optimize a stochastic control strategy for speed control of a "model-free" quadrotor. The quadrotor is controlled by four learned neural networks, which map the system states directly to control commands in an end-to-end style. Introducing an integral compensator into the actor-critic framework greatly improves speed tracking accuracy and robustness. In addition, a two-phase learning scheme that includes both offline and online learning is developed for practical use. A model with strong generalization ability is learned in the offline phase. The flight policy of the model is then continuously optimized in the online learning phase. Finally, the performance of the proposed algorithm is compared with that of the traditional PID algorithm.
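The abstract does not state how the integral compensator enters the actor-critic framework. One plausible reading, sketched below purely as an illustration, is that the accumulated speed-tracking error is appended to the state fed to the networks, while the policy is trained with the standard PPO clipped surrogate objective. The class `IntegralCompensator`, its `limit` anti-windup clamp, the state layout, and the function `ppo_clip_objective` are hypothetical names chosen for this sketch, not taken from the paper.

```python
import numpy as np


class IntegralCompensator:
    """Accumulates speed-tracking error (hypothetical helper, not from the paper)."""

    def __init__(self, dim, dt, limit=1.0):
        self.dt = dt          # control period in seconds (assumed)
        self.limit = limit    # anti-windup clamp, an assumption of this sketch
        self.integral = np.zeros(dim)

    def reset(self):
        # Clear the accumulated error at the start of each episode.
        self.integral[:] = 0.0

    def augment(self, state, target_speed, speed):
        # Tracking error between commanded and measured speed.
        error = np.asarray(target_speed, dtype=float) - np.asarray(speed, dtype=float)
        # Rectangular integration of the error, clamped to avoid windup.
        self.integral = np.clip(
            self.integral + error * self.dt, -self.limit, self.limit
        )
        # Augmented observation: raw state, current error, integrated error.
        return np.concatenate(
            [np.asarray(state, dtype=float), error, self.integral]
        )


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate objective (to be maximized)."""
    ratio = np.exp(np.asarray(new_logp) - np.asarray(old_logp))
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    adv = np.asarray(advantages, dtype=float)
    return float(np.mean(np.minimum(ratio * adv, clipped * adv)))
```

In such a setup, `augment` would be called once per control step before the observation is passed to the actor and critic, and `reset` at the start of every episode so that error from a previous flight does not leak into the next one.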
Affiliation: School of Automation
Source: Frontiers of Information Technology & Electronic Engineering (信息与电子工程前沿(英文版)), indexed in SCIE, EI, CSCD; 2020, Issue 5, pp. 777-795 (19 pages)
Funding: Project supported by the National Key R&D Program of China (No. 2018AAA0101400), the National Natural Science Foundation of China (Nos. 61973074, U1713209, 61520106009, and 61533008), the Science and Technology on Information System Engineering Laboratory (No. 05201902), and the Fundamental Research Funds for the Central Universities, China.