Resource Allocation Based on Deep Reinforcement Learning in SWIPT-D2D Communication
Abstract To address inter-device signal interference and device energy consumption in SWIPT-D2D (Simultaneous Wireless Information and Power Transfer Device-to-Device) wireless communication networks where channel state information is unknown, the authors propose using the Proximal Policy Optimization (PPO) algorithm to jointly optimize three quantities for D2D users, namely the resource block, the transmit power, and the power splitting ratio, while satisfying the communication quality requirements of cellular users. Simulation results show that the proposed algorithm produces a better resource allocation scheme for D2D users than the compared algorithms: cellular users keep a high communication rate while D2D users achieve higher energy efficiency. Furthermore, as the number of users in the environment increases, the proposed algorithm improves D2D energy efficiency by an average of 15.95% and 23.59% compared with the Dueling Double DQN (Deep Q-Network) and DQN algorithms, respectively, and it is more robust as the communication network grows in scale.
Authors LIU Xingxin (刘兴鑫); LI Jun (李君); LI Zhengquan (李正权) (School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China; School of Electronic Information Engineering, Wuxi University, Wuxi 214105, China; Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, China; State Key Laboratory of Network and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China)
Source Telecommunication Engineering (《电讯技术》), Peking University Core Journal (北大核心), 2024, No. 5, pp. 693-701 (9 pages)
Funding Future Network Scientific Research Fund Project (FNSRFP-2021-YB-11).
Keywords SWIPT-D2D; resource allocation; deep reinforcement learning; joint optimization
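
The abstract names PPO as the learning algorithm and a joint decision over resource block, transmit power, and power splitting ratio, but it does not give the formulation. The sketch below is only an illustration under stated assumptions: it shows the standard PPO clipped surrogate objective and one hypothetical way to encode the three decisions as a single discrete action. The function names, the discretization into levels, and the clipping value 0.2 are assumptions made for this example and are not taken from the paper.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps=0.2 is a commonly used
    default, not a value reported in the abstract.
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def decode_action(index, n_rb, n_power, n_split):
    """Hypothetical mapping from one flat action index to a triple
    (resource block, transmit-power level, power-splitting level).

    Assumes all three quantities are discretized; the paper may use a
    different action representation.
    """
    rb, rest = divmod(index, n_power * n_split)
    power, split = divmod(rest, n_split)
    return rb, power, split
```

With 4 resource blocks, 3 power levels, and 5 splitting levels, for instance, `decode_action(17, 4, 3, 5)` selects resource block 1, power level 0, and splitting level 2. A single flat index keeps the policy output a plain categorical distribution, at the cost of an action space that grows multiplicatively with each added dimension.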