Resource Allocation Based on Deep Reinforcement Learning in SWIPT-D2D Communication

Abstract: To address inter-device signal interference and device energy consumption in SWIPT-D2D (Simultaneous Wireless Information and Power Transfer Device-to-Device) wireless communication networks where channel state information is unknown, the authors propose using the Proximal Policy Optimization (PPO) algorithm to jointly optimize three quantities for D2D users: the resource block, the transmit power, and the power splitting ratio, subject to the communication quality requirements of cellular users. Simulation results show that the proposed algorithm produces better resource allocation schemes for D2D users than the compared algorithms, keeping cellular users at a high communication rate while giving D2D users higher energy efficiency. Furthermore, as the number of users in the environment increases, the proposed algorithm improves D2D energy efficiency by 15.95% and 23.59% on average compared with the Dueling Double DQN (Deep Q-Network) and DQN algorithms, respectively, and it is more robust as the communication network grows.
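For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that the algorithm maximizes. It is not taken from the paper: the function name `ppo_clip_objective` and the clipping value `eps=0.2` are illustrative assumptions, and the abstract does not describe the paper's state, action encoding, or reward. In a setting like the one described, each action would presumably bundle a D2D user's resource block, transmit power, and power splitting ratio.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    new_logps / old_logps: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    eps is a hypothetical clipping range, not a value from the paper.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s).
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes any incentive to move the policy far from the one that collected the data in a single update, which is the usual argument for PPO's training stability relative to unconstrained policy-gradient steps.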

Authors: LIU Xingxin; LI Jun; LI Zhengquan (School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China; School of Electronic Information Engineering, Wuxi University, Wuxi 214105, China; Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, China; State Key Laboratory of Network and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China)

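Affiliations: School of Electronics and Information Engineering, Nanjing University of Information Science and Technology; School of Electronic Information Engineering, Wuxi University; Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University; State Key Laboratory of Network and Switching Technology, Beijing University of Posts and Telecommunications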

Source: Telecommunication Engineering (《电讯技术》), PKU Core Journal, 2024, Issue 5, pp. 693-701 (9 pages)

Funding: Future Network Scientific Research Fund Project (FNSRFP-2021-YB-11).

Keywords: SWIPT-D2D; resource allocation; deep reinforcement learning; joint optimization

Classification code: TN929.5 [Electronics and Telecommunications: Communication and Information Systems]
