Funding: Fundamental Research Funds for the Central Universities (ZYGX2020ZB042).
Abstract: With affordable overhead for information exchange, energy-efficient beamforming has the potential to achieve both low power consumption and high spectral efficiency. This paper formulates the problem of joint beamforming and power allocation for a multiple-input single-output (MISO) multi-cell network with local observations, taking energy efficiency into account. To reduce the complexity of jointly processing received signals in the presence of a large number of base stations (BSs), a new distributed framework is proposed for beamforming with multi-cell cooperation or competition. The optimization problem is modeled as a partially observable Markov decision process (POMDP) and solved by a distributed multi-agent self-decision beamforming (DMAB) algorithm based on the distributed deep recurrent Q-network (D2RQN). Furthermore, a limited-information-exchange scheme is designed for inter-cell cooperation to boost global performance. The proposed learning architecture, which requires considerably less information exchange, is effective and scalable for the high-dimensional problem that arises as the number of BSs grows. Moreover, the proposed DMAB algorithms outperform distributed deep Q-network (DQN) based methods and non-learning-based methods with significant performance gains.
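The abstract does not specify the network design, observation layout, or action discretization, so the following is only a minimal sketch, assuming a PyTorch-based recurrent Q-network: it illustrates how a single per-BS D2RQN agent in a DMAB-style framework could map its local observation history to Q-values over a discrete set of beam/power actions, with the LSTM hidden state compensating for partial observability. All dimensions, names (RecurrentQNetwork, obs_dim, n_actions), and the beam-times-power codebook size are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch of one D2RQN agent (assumed design, not the paper's exact model).
import torch
import torch.nn as nn


class RecurrentQNetwork(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden_dim: int = 128):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden_dim)   # embed local observation (e.g. local CSI + exchanged info)
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)  # memory over the POMDP history
        self.q_head = nn.Linear(hidden_dim, n_actions)  # Q-value per discrete (beam, power) action

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, time, obs_dim) -- this agent's local observation history
        x = torch.relu(self.encoder(obs_seq))
        x, hidden = self.lstm(x, hidden)
        return self.q_head(x), hidden                   # Q-values: (batch, time, n_actions)


# Greedy action selection for one BS agent at one time step (hypothetical sizes:
# 16-dim local observation, 8 beams x 4 power levels = 32 actions).
net = RecurrentQNetwork(obs_dim=16, n_actions=32)
obs = torch.randn(1, 1, 16)                             # current local observation
q_values, h = net(obs)                                  # carry h across steps to retain memory
action = int(q_values[0, -1].argmax())                  # index into the assumed beam/power codebook
```

Each BS would run its own copy of such a network on purely local observations plus whatever limited inter-cell information is exchanged, which is what makes the scheme distributed and keeps the signaling overhead low as the number of BSs grows.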