Multi-UAV Trajectory Design and Power Control Based on Deep Reinforcement Learning (cited by: 1)


Abstract: This paper studies a multi-unmanned aerial vehicle (multi-UAV), multi-user system in which the UAVs serve as aerial base stations (BSs) for ground users in the same frequency band, without knowing the users' locations or channel parameters. We aim to maximize the total throughput of all users while meeting a fairness requirement by optimizing the UAVs' trajectories and transmission power in a centralized way. The problem is non-convex and very difficult to solve, since the users' locations are unknown to the UAVs. We model the problem as a Markov decision process (MDP) and propose a deep reinforcement learning (DRL)-based solution, namely soft actor-critic (SAC), to address it. We carefully design a reward function that combines sparse and non-sparse rewards to balance exploitation and exploration. Simulation results show that the proposed SAC performs very well in both training and testing.
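The abstract does not give the reward function itself, so the following is only a minimal sketch of what "combining sparse and non-sparse rewards" could look like for co-channel UAV base stations. Everything here is an assumption made for illustration: the function names, the use of Jain's fairness index as the fairness measure, the threshold of 0.9, and the bonus of 10.0 are not taken from the paper.

```python
import math


def user_rates(power, gain, assoc, noise=1e-9, bandwidth=1.0):
    """Shannon rate of each user when all UAVs share one frequency band.

    power[m]   : transmit power of UAV m (hypothetical units, W)
    gain[m][k] : channel power gain from UAV m to user k
    assoc[k]   : index of the UAV that serves user k
    """
    rates = []
    for k, serving in enumerate(assoc):
        signal = power[serving] * gain[serving][k]
        # Every other UAV transmits in the same band and acts as interference.
        interference = sum(
            power[m] * gain[m][k] for m in range(len(power)) if m != serving
        )
        sinr = signal / (interference + noise)
        rates.append(bandwidth * math.log2(1.0 + sinr))
    return rates


def jain_index(rates):
    """Jain's fairness index: 1.0 when all rates are equal, 1/K at worst."""
    squares = sum(r * r for r in rates)
    if squares == 0:
        return 0.0
    total = sum(rates)
    return total * total / (len(rates) * squares)


def reward(rates, fairness_threshold=0.9, bonus=10.0):
    """Dense term (sum throughput) plus a sparse bonus (assumed values)."""
    dense = sum(rates)  # non-sparse: available at every step
    # Sparse: granted only when the assumed fairness target is met.
    sparse = bonus if jain_index(rates) >= fairness_threshold else 0.0
    return dense + sparse


if __name__ == "__main__":
    # Two UAVs, two users, no cross-link gain, unit noise for easy arithmetic.
    rates = user_rates([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], [0, 1], noise=1.0)
    print(rates, jain_index(rates), reward(rates))
```

In a sketch like this, the dense throughput term gives the agent a learning signal at every step, while the sparse bonus only rewards states that also satisfy the fairness target; how the paper actually balances the two is not stated in the abstract.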

Authors: Chiya Zhang, Shiyuan Liang, Chunlong He, Kezhi Wang

Affiliations: School of Electronic and Information Engineering; National Mobile Communications Research Laboratory; School of Computer Sciences and Electrical Engineering, Northumbria University

Source: Journal of Communications and Information Networks (通信与信息网络学报（英文）), EI, CSCD, 2022, No. 2, pp. 192-201 (10 pages)

Funding: National Natural Science Foundation of China (62101161); Shenzhen Basic Research Program (20200811192821001, JCYJ20190808122409660); Guangdong Basic Research Program (2019A1515110358, 2021A1515012097, 2020ZDZX1037, 2020ZDZX1021); Open Research Fund of National Mobile Communications Research Laboratory, Southeast University (2021D16, 2022D02).

Keywords: multi-UAV and multi-user wireless system; UAV; power control; trajectory design; throughput maximization; SAC

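Classification codes: V279 [Aeronautics and Astronautics Science and Technology: Aircraft Design]; TP181 [Automation and Computer Technology: Control Theory and Control Engineering]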

