Proximal policy optimization with an integral compensator for quadrotor control (cited by: 3)


Abstract: We use the proximal policy optimization (PPO) reinforcement learning algorithm to optimize a stochastic control strategy for speed control of a "model-free" quadrotor. The quadrotor is controlled by four learned neural networks, which map the system states directly to control commands in an end-to-end style. Introducing an integral compensator into the actor-critic framework greatly improves speed tracking accuracy and robustness. In addition, a two-phase learning scheme that includes both offline and online learning is developed for practical use. A model with strong generalization ability is learned in the offline phase. The flight policy of the model is then continuously optimized in the online learning phase. Finally, the performance of the proposed algorithm is compared with that of the traditional PID algorithm.
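The two ingredients named in the abstract can be illustrated with a minimal sketch. The clipped surrogate below is the standard PPO objective for a single sample. The integral term is only one plausible way a compensator could enter an actor-critic setup, namely by appending the accumulated speed error to the network input. The class `IntegralAugmentedState`, its parameters `dt` and `limit`, and the layout of the returned state are hypothetical and are not taken from the paper.

```python
def clipped_surrogate(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate for one sample.

    ratio is pi_new(a|s) / pi_old(a|s); advantage is the advantage estimate.
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    # Take the pessimistic (smaller) of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


class IntegralAugmentedState:
    """Hypothetical helper: appends the integrated speed error to the state.

    Assumption for illustration only; the paper's actual compensator
    design is not described in this record.
    """

    def __init__(self, dt, limit):
        self.dt = dt          # control period in seconds (assumed)
        self.limit = limit    # anti-windup bound on the integral (assumed)
        self.integral = 0.0

    def step(self, speed, target):
        error = target - speed
        # Accumulate the error and clamp it to avoid integral windup.
        self.integral += error * self.dt
        self.integral = max(-self.limit, min(self.integral, self.limit))
        # State passed to the actor and critic: speed, error, integral.
        return (speed, error, self.integral)
```

With `eps=0.2`, a ratio of 1.5 and a positive advantage of 1.0 is capped at 1.2, so a single update cannot move the policy arbitrarily far from the one that collected the data. The integral entry gives the policy access to accumulated steady-state error, much as the I term does in a PID controller.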

Authors: Huan HU, Qing-ling WANG

Affiliation: School of Automation

Source: Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2020, Issue 5, pp. 777-795 (19 pages). Chinese journal title: 信息与电子工程前沿（英文版）

Funding: Project supported by the National Key R&D Program of China (No. 2018AAA0101400), the National Natural Science Foundation of China (Nos. 61973074, U1713209, 61520106009, and 61533008), the Science and Technology on Information System Engineering Laboratory (No. 05201902), and the Fundamental Research Funds for the Central Universities, China.

Keywords: Reinforcement learning; Proximal policy optimization; Quadrotor control; Neural network

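Classification codes: V275.1 [Aeronautical and Astronautical Science and Technology: Aircraft Design]; V249.1 [Aeronautical and Astronautical Science and Technology: Aircraft Design]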

