Abstract
To improve the global energy efficiency (GEE) of cell-free millimeter-wave massive MIMO (CF mmWave mMIMO) systems, this paper investigates access point (AP) sleep-mode techniques in dynamic time-varying channels. The AP switch ON-OFF (ASO) strategy is formulated as a Markov decision process, so that a deep reinforcement learning (DRL) model can be used to solve the AP activation problem. An interference-aware method and a locality-sensitive hashing retrieval method are introduced to reduce sample bias and the amount of interaction between the agent and the complex environment. A novel cost function is constructed to achieve a better trade-off between GEE and achievable rate under strict quality of service (QoS) constraints. To accelerate the convergence of the dueling deep Q-network (Dueling DQN), the state space is mapped to a smaller hierarchical state space by discretizing the cost function into levels. Simulation results demonstrate the stability and convergence of the proposed scheme and its GEE advantage under strict QoS constraints.
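For illustration, the following is a minimal sketch (not the authors' implementation) of the dueling Q-network architecture referenced in the abstract, written in Python/PyTorch. The environment interface, the state encoding (discretized utility levels per AP), the action set (toggle one AP or keep the current on-off pattern), and all dimensions are assumptions made for this sketch.

```python
# Minimal Dueling DQN sketch for AP switch ON-OFF decisions.
# All names, dimensions, and the state/action encodings below are
# illustrative assumptions, not the paper's exact setup.

import torch
import torch.nn as nn

class DuelingDQN(nn.Module):
    """Dueling architecture: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    def __init__(self, state_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        self.feature = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.value = nn.Linear(hidden, 1)                # state-value stream V(s)
        self.advantage = nn.Linear(hidden, num_actions)  # advantage stream A(s, a)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.feature(state)
        v = self.value(h)      # shape (batch, 1)
        a = self.advantage(h)  # shape (batch, num_actions)
        # Subtracting the mean advantage keeps V and A identifiable.
        return v + a - a.mean(dim=1, keepdim=True)

if __name__ == "__main__":
    num_aps = 16                    # hypothetical number of APs
    num_actions = num_aps + 1       # toggle one AP, or keep the current pattern
    net = DuelingDQN(state_dim=num_aps, num_actions=num_actions)
    state = torch.rand(4, num_aps)  # batch of discretized per-AP utility levels
    q = net(state)
    print(q.shape)                  # torch.Size([4, 17])
```

The dueling decomposition Q(s, a) = V(s) + A(s, a) - mean_a A(s, a) is the standard form of this architecture; the paper's hierarchical state space would enter through how the continuous cost function is quantized into levels before being fed to the network as `state`.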
Authors
HE Yun; SHEN Min; WANG Rui; ZHANG Meng (School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; Innovation Team of Communication Core Chip, Protocols and System Application, Chongqing University of Posts and Telecommunications, Chongqing 400065, China)
Source
Acta Electronica Sinica (《电子学报》), 2023, No. 10, pp. 2831-2843 (13 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list (北大核心).
Funding
Supported by the National Science and Technology Major Project of China (No. 2018ZX03001026-002).
Keywords
cell-free
millimeter-wave
deep reinforcement learning
access point switch on-off
energy-efficiency