Abstract
To address the task scheduling problem in satellite communication systems, a satellite communication task scheduling method based on a multi-branch deep Q network model is proposed within a deep reinforcement learning framework. By introducing a task list branch network and a resource pool branch network, the model extracts features of the satellite task states and the satellite resource pool states simultaneously, and computes the action-value function through a value branch network. At the model output, multiple actions covering both task selection and resource priority are introduced, which enlarges the selection space of scheduling actions. Experimental results show that, on both non-zero-waste and zero-waste datasets, the multi-branch deep Q network model improves average resource occupancy while significantly reducing runtime overhead compared with heuristic methods.
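The abstract describes the model only at a high level. Below is a minimal sketch of how such a multi-branch deep Q network could be organized in PyTorch; the class name MultiBranchDQN, all layer sizes, and the joint (task selection, resource priority) discrete action space are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn


class MultiBranchDQN(nn.Module):
    """Minimal sketch of a multi-branch deep Q network for satellite task
    scheduling. Dimensions and structure are hypothetical."""

    def __init__(self, task_feat_dim=8, max_tasks=32,
                 resource_feat_dim=6, num_resources=16,
                 num_priority_levels=3, hidden_dim=128):
        super().__init__()
        # Task list branch: encodes the flattened satellite task states.
        self.task_branch = nn.Sequential(
            nn.Linear(task_feat_dim * max_tasks, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Resource pool branch: encodes the flattened resource pool states.
        self.resource_branch = nn.Sequential(
            nn.Linear(resource_feat_dim * num_resources, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Value branch: maps the fused features to Q-values over the joint
        # action space (which task to schedule x which resource priority).
        self.value_branch = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, max_tasks * num_priority_levels),
        )
        self.max_tasks = max_tasks
        self.num_priority_levels = num_priority_levels

    def forward(self, task_states, resource_states):
        # task_states: (batch, max_tasks * task_feat_dim)
        # resource_states: (batch, num_resources * resource_feat_dim)
        task_feat = self.task_branch(task_states)
        resource_feat = self.resource_branch(resource_states)
        fused = torch.cat([task_feat, resource_feat], dim=-1)
        # Each output entry indexes a (task, priority) action pair.
        q_values = self.value_branch(fused)
        return q_values.view(-1, self.max_tasks, self.num_priority_levels)


if __name__ == "__main__":
    # Greedy action selection: pick the (task, priority) pair with the
    # highest Q-value for a random example state.
    model = MultiBranchDQN()
    tasks = torch.randn(1, 8 * 32)
    resources = torch.randn(1, 6 * 16)
    q = model(tasks, resources)              # shape (1, 32, 3)
    flat_idx = q.view(1, -1).argmax(dim=-1)
    task_idx, priority_idx = flat_idx // 3, flat_idx % 3
    print(task_idx.item(), priority_idx.item())
```

In this sketch the two branches give the scheduler separate views of the pending task list and the resource pool, and the value branch scores every (task, priority) combination, which is one simple way to realize the enlarged scheduling action space mentioned in the abstract.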
Authors
班亚明
马宁
王玉清
孙文宇
王宝宝
刘秀芳
贾慧燕
BAN Yaming; MA Ning; WANG Yuqing; SUN Wenyu; WANG Baobao; LIU Xiufang; JIA Huiyan (Academy for Network & Communication of CETC, Shijiazhuang 050081, China; Shijiazhuang Military Representative Office of the Military Representative Bureau of the Equipment Department of Aerospace System Center, Shijiazhuang 050081, China)
Source
《无线电工程》
Peking University Core Journal (北大核心)
2023, No. 12, pp. 2921-2926 (6 pages)
Radio Engineering
Keywords
task scheduling
deep Q network
deep reinforcement learning
satellite communication